Unpredictable predictable interface naming

Hi there,

I’m a bit puzzled by what I’m seeing. We’re doing lots of automated installs via Kickstart, and the predictable interface names during Kickstart don’t match the names that are presented by the installed system.

For instance, I’ve got the following four interface names present during Kickstart:

enp75s0f0
enp75s0f1
enp152s0f0
enp152s0f1

and yet when Rocky boots for normal usage, somehow the interface names now become:

enp75s0f0np0
enp75s0f1np1
enp152s0f0np0
enp152s0f1np1

I’m assuming the np# portion refers to a physical port on the network card, but why is that name not used during Kickstart?

Or maybe the question should be, why does it become the interface name under Rocky proper? e.g., why do I see this in dmesg:

[    3.241385] usbcore: registered new interface driver cdc_ether
[    6.892182] mlx5_core 0000:98:00.1 enp152s0f1: renamed from eth3
[    6.904849] mlx5_core 0000:b1:00.0 enp177s0f0: renamed from eth4
[    6.925118] mlx5_core 0000:4b:00.1 enp75s0f1: renamed from eth1
[    6.947146] mlx5_core 0000:b1:00.1 enp177s0f1: renamed from eth5
[    6.964074] mlx5_core 0000:4b:00.0 enp75s0f0: renamed from eth0
[    6.986098] mlx5_core 0000:98:00.0 enp152s0f0: renamed from eth2
[   33.513288] mlx5_core 0000:98:00.1 enp152s0f1np1: renamed from eth3
[   33.592246] mlx5_core 0000:4b:00.0 enp75s0f0np0: renamed from eth0
[   33.620145] mlx5_core 0000:b1:00.1 enp177s0f1np1: renamed from eth5
[   33.638210] mlx5_core 0000:98:00.0 enp152s0f0np0: renamed from eth2
[   33.652080] mlx5_core 0000:b1:00.0 enp177s0f0np0: renamed from eth4
[   33.694620] mlx5_core 0000:4b:00.1 enp75s0f1np1: renamed from eth1

That’s a lot of volatility for a network interface name, and I’m genuinely confused as to which naming convention to code for – especially given that systemd interface names are supposed to be stable.

If anyone can help shed some light here, it would be greatly appreciated, because we’re not seeing similar behavior from any of the other Rocky builds so far – although in fairness, the hosts in question here are not using bonding and everything else is.

thanks,
Klaus

Did you try using the following kernel command-line options?

biosdevname=0 net.ifnames=0

Historically that’s what we’ve always done, but it’s not an option here since each interface has its own specific configuration, and we can’t take chances with kernel enumeration suddenly deciding to swap the logical names of eth0 and eth2, so we have to stick with systemd predictable naming.

In other words, the Rocky kernel boots and you get one set of names,
but when the Rocky kernel boots you get a different set of names?

The first kernel (the one that runs the installer) does not have clearly different command-line parameters.
It does have a clearly different initramfs.

I presume this is from “regular boot”. It is scary; the same interface is renamed twice, and there was a long time between the two renames.

Run journalctl. Look for the “renamed” messages. Around the first (and only) set of renames my journal shows

systemd-udevd.*:  Using default interface naming scheme 'rhel-9.1'.

What do you have around the two?
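
A minimal sketch of how the journal could be narrowed down to the relevant lines. The helper name rename_events is made up for illustration, and the pattern is simply taken from the kernel, NetworkManager, and systemd-udevd messages quoted in this thread, so it may need adjusting on other systems.

```shell
# Hypothetical helper: keep only lines that record an interface rename
# or the naming scheme announced by systemd-udevd.
rename_events() {
    grep -E 'renamed (iface )?from|interface naming scheme'
}

# Usage on a live system (current boot only):
#   journalctl -b --no-pager | rename_events
```

Reading the surviving lines in timestamp order should show which process logged something just before each kernel rename.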

I ran into the same problem on my Dell PowerEdge R540 with 6 NICs.
Device renaming was very inconsistent.
Sometimes not all of them were renamed, and some still had an ethX device name.
I tried disabling kernel renaming (i.e., biosdevname=0 net.ifnames=0).
That still led to inconsistent naming (i.e., after a reboot eth2 was now eth5, etc.).

I finally went with systemd rules based upon MAC address.

mkdir -p /etc/systemd/network
vi /etc/systemd/network/00-match-mac.link

[Match]
MACAddress=aa:bb:cc:dd:ee:ff

[Link]
Name=netxxx

That would result in the device with MAC aa:bb:cc:dd:ee:ff always having the device name netxxx.

I did that for each interface.
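
Doing that by hand for six NICs gets tedious, so here is a rough sketch of a helper that writes one such file per call. The function name write_link_file, the 70- file prefix, and the example name net0 are all assumptions for illustration; the file contents follow the [Match]/[Link] layout shown above.

```shell
# Hypothetical helper: write one .link file that pins a name to a MAC address.
# Usage: write_link_file MAC NAME [DIR]
write_link_file() {
    mac=$1
    name=$2
    dir=${3:-/etc/systemd/network}

    mkdir -p "$dir" || return 1
    # The 70- prefix is an assumption; the intent is only that the file
    # sorts before the distribution's 99-default.link.
    cat > "$dir/70-$name.link" <<EOF
[Match]
MACAddress=$mac

[Link]
Name=$name
EOF
}

# Example with placeholder values:
#   write_link_file aa:bb:cc:dd:ee:ff net0
```

If the names are needed early in boot, the initramfs may also have to be rebuilt (for example with dracut -f) so that it contains the .link files; that is an assumption worth verifying on the system in question.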
I also had to fix the NM connections in /etc/NetworkManager/system-connections to reflect the changes.

The installer generates NM connections. Those bind to interfaces by MAC; the name is irrelevant. (NM hands the current name to FirewallD.)

When you define a connection with nmcli, you have to give an ifname. You can update the connection later to bind to the MAC rather than the ifname.
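
As a sketch of that later update: the helper name bind_conn_to_mac, the profile name lan0, and the MAC below are placeholders. The two properties it sets, 802-3-ethernet.mac-address and connection.interface-name, are standard NetworkManager settings, and setting the latter to an empty string clears the name binding.

```shell
# Hypothetical helper: make an existing connection profile match on MAC
# address and drop its interface-name binding.
# NMCLI can be overridden (e.g. NMCLI=echo) to preview the command.
bind_conn_to_mac() {
    conn=$1
    mac=$2
    ${NMCLI:-nmcli} connection modify "$conn" \
        802-3-ethernet.mac-address "$mac" \
        connection.interface-name ""
}

# Example with placeholder values:
#   bind_conn_to_mac lan0 aa:bb:cc:dd:ee:ff
```

The change only affects the stored profile; the connection would need to be reactivated for it to take effect.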

Unfortunately, I ended up just redoing the install to see if that eliminated the behavior, so it remains to be seen if it happens again.

It’s still a big question mark for me, though, because we have other hardware acquired around the same time that also presents interfaces with np# in the name, but it does this already during Kickstart so the problem of the drifting interface name doesn’t come up on those systems.

The other possibility (although it seems like a remote one to me) is that this is caused by some underlying hardware issue. The servers in question are part of a small batch of ten, seven of which are currently cycling through hardware enumeration indefinitely and are not able to POST. Two of these seven exhibited the drifting interface name behavior earlier, so I reinstalled them and they got through Kickstart no problem but went to enumeration purgatory immediately after the post-Kickstart reboot. :-/

Time to watch and wait, I guess.

cheers,
Klaus

I have various servers with various cards. I’ve seen many types of names, including
eno1, ibp59s0, ens5f0, and enp8s0f0 on el9.

I do have eno1np0 on el8 – a Dell R640, where I enabled “partitions” in BIOS.
Each of two physical ports (np0 and np1) shows as 8 interfaces (with unique enpN prefix).
I even tested SR-IOV on top of those. Interfaces galore …

Red Hat explains some of the names in https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/configuring_and_managing_networking/consistent-network-interface-device-naming_configuring-and-managing-networking

Hmmm, in that case … is it possible then that these servers shipped from the manufacturer with NIC partitioning enabled in the BIOS when the OS was initially provisioned and the Kickstart kernel/stack doesn’t make the distinction, but the actual OS kernel/stack does?

Are systemd or NetworkManager permitted to rename interfaces on their own after the OS finishes booting? (removable interfaces and deliberate renames via ip notwithstanding)

cheers,
Klaus

That is hard to believe.

Your log did show that the interface was renamed twice, but the info about who did it was not included.

Both udev and systemd can rename interfaces, and so can ip. Possibly some others too. Whodunnit? Why that one? What triggered the extra rename?

Yeah, I would have been incredibly surprised if both kernels handled this differently, so I’m sure you’re right about that.

As far as interface renaming goes, it would have had to be either udev or systemd, since the machines in question are not yet being used, and login access is highly restricted.

cheers,
Klaus

It doesn’t sound like it, but are you installing Mellanox OFED in the Kickstart? I’ve seen it change interface names on me.

No, we’re not, but this population has lots of Mellanox, so that might be worth considering for future builds. Thanks for the tip!

cheers,
Klaus

Okay, hopefully there are still a few people following this thread, but it happened again, and it appears that NetworkManager is to blame:

/var/log/messages-20230423:Apr 19 17:55:51 my_hostname kernel: igb 0000:86:00.1 enp134s0f1: renamed from eth1
/var/log/messages-20230423:Apr 19 17:55:51 my_hostname kernel: igb 0000:86:00.0 enp134s0f0: renamed from eth0
/var/log/messages-20230423:Apr 19 17:55:54 my_hostname kernel: mlx5_core 0000:18:00.1 eno2: renamed from eth1
/var/log/messages-20230423:Apr 19 17:55:54 my_hostname kernel: mlx5_core 0000:18:00.0 eno1: renamed from eth0
/var/log/messages-20230423:Apr 19 17:55:54 my_hostname kernel: mlx5_core 0000:af:00.0 enp175s0f0: renamed from eth2
/var/log/messages-20230423:Apr 19 17:55:54 my_hostname kernel: mlx5_core 0000:af:00.1 enp175s0f1: renamed from eth3
/var/log/messages-20230423:Apr 21 19:10:16 my_hostname kernel: igb 0000:86:00.0 enp134s0f0: renamed from eth0
/var/log/messages-20230423:Apr 21 19:10:16 my_hostname kernel: igb 0000:86:00.1 enp134s0f1: renamed from eth1
/var/log/messages-20230423:Apr 21 19:10:19 my_hostname kernel: mlx5_core 0000:18:00.0 eno1: renamed from eth0
/var/log/messages-20230423:Apr 21 19:10:19 my_hostname kernel: mlx5_core 0000:18:00.1 eno2: renamed from eth1
/var/log/messages-20230423:Apr 21 19:10:19 my_hostname kernel: mlx5_core 0000:af:00.0 enp175s0f0: renamed from eth2
/var/log/messages-20230423:Apr 21 19:10:19 my_hostname kernel: mlx5_core 0000:af:00.1 enp175s0f1: renamed from eth3
/var/log/messages:Apr 24 17:03:21 my_hostname kernel: mlx5_core 0000:af:00.0 enp175s0f0np0: renamed from eth2
/var/log/messages:Apr 24 17:03:21 my_hostname NetworkManager[2237]: <info>  [1682355801.0372] device (eth2): interface index 11 renamed iface from 'eth2' to 'enp175s0f0np0'
/var/log/messages:Apr 24 17:03:21 my_hostname kernel: mlx5_core 0000:18:00.1 eno2np1: renamed from eth1
/var/log/messages:Apr 24 17:03:21 my_hostname kernel: mlx5_core 0000:af:00.1 enp175s0f1np1: renamed from eth3
/var/log/messages:Apr 24 17:03:22 my_hostname kernel: mlx5_core 0000:18:00.0 eno1np0: renamed from eth0
/var/log/messages:Apr 24 17:03:22 my_hostname NetworkManager[2237]: <info>  [1682355802.0545] device (eth1): interface index 10 renamed iface from 'eth1' to 'eno2np1'
/var/log/messages:Apr 24 17:03:22 my_hostname NetworkManager[2237]: <info>  [1682355802.0548] device (eth3): interface index 12 renamed iface from 'eth3' to 'enp175s0f1np1'
/var/log/messages:Apr 24 17:03:22 my_hostname NetworkManager[2237]: <info>  [1682355802.0550] device (eth0): interface index 9 renamed iface from 'eth0' to 'eno1np0'

That tells the story of what happened, but I have no idea why it happened, or why NetworkManager decided to rename everything, apparently spontaneously. I’m hoping someone reading this has been in this situation before, because I’m pretty puzzled.

cheers,
Klaus

Bah, Mellanox strikes again. Another team installed MLNX drivers, which resulted in the interface names changing.
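
For anyone who lands here with the same symptom, a rough sketch of how to check for it. The np# suffix in systemd's naming scheme comes from the phys_port_name attribute that the driver exposes in sysfs, so a driver swap that starts (or stops) reporting that attribute would plausibly change the name; that this is what the MLNX drivers did here is an inference, not something confirmed in the logs above. The helper name phys_port_name and its optional second argument are made up for illustration.

```shell
# Hypothetical helper: print the phys_port_name the driver reports for an
# interface, or nothing if the attribute is missing or unreadable.
# Usage: phys_port_name IFACE [SYSFS_NET_DIR]
phys_port_name() {
    iface=$1
    root=${2:-/sys/class/net}
    cat "$root/$iface/phys_port_name" 2>/dev/null || true
}

# On a live system, these show the names udev would derive and which
# module file is in use for mlx5_core (a path under extra/ or updates/
# rather than kernel/ would suggest a vendor-installed driver):
#   udevadm test-builtin net_id /sys/class/net/enp175s0f0np0
#   modinfo -n mlx5_core
```

Comparing the output before and after a driver install should make it clear whether the attribute is what changed.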