Network installation 8.9 stuck

Hi,

I just tried to install Rocky 8.9 by duplicating the iPXE + kickstart setup from our 8.8 installations, but it gets stuck.
The iPXE file reads:

set version 8.9
set target ibrocky89-clear-sda.cfg ##  <- kickstart file
set route 10.20.0.0/16:10.20.0.1:ib0
set dns 10.10.24.192
set gateway 10.20.0.1
set netmask 255.255.0.0

set repo ${mirror}/rocky/${version}/BaseOS/x86_64/os/
set kickstart ${http}/rocky/kickstart/${target}
kernel ${repo}/images/pxeboot/vmlinuz initrd=initrd.img inst.repo=${repo} inst.text inst.sshd inst.ks=${kickstart} ip=${net0.dhcp/ip}::${gateway}:${netmask}:${net0.dhcp/hostname}.gsi.de:ib0:off ks.device=ib0 rd.driver.post=mlx4_ib,mlx5_ib,ib_ipoib,ib_umad,rdma_ucm rd.neednet=1 rd.net.timeout.carrier=600 nameserver=${dns} rd.route=${route}
initrd ${repo}/images/pxeboot/initrd.img
boot || goto shell
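
For readers less familiar with the static ip= syntax used in the kernel line, here is a minimal Python sketch that splits such an argument into named fields. The field order is the one documented in dracut.cmdline(7) for the IPv4 form; the field labels, the function name, and the sample address and hostname are made up for illustration and are not from the original setup.

```python
# Sketch: split a static dracut ip= argument (IPv4 form) into named fields.
# Field order per dracut.cmdline(7); the labels below are my own names.
FIELDS = ("client_ip", "peer", "gateway", "netmask",
          "hostname", "interface", "autoconf")


def parse_ip_arg(arg):
    """Parse 'ip=<client>:<peer>:<gw>:<mask>:<host>:<iface>:<autoconf>'."""
    if not arg.startswith("ip="):
        raise ValueError("expected an argument starting with ip=")
    parts = arg[len("ip="):].split(":")
    if len(parts) < len(FIELDS):
        raise ValueError("static form needs at least 7 colon-separated fields")
    # Optional trailing fields (MTU, MAC address) are ignored here.
    return dict(zip(FIELDS, parts))


# Hypothetical values standing in for ${net0.dhcp/ip} and ${net0.dhcp/hostname}.
example = "ip=10.20.1.5::10.20.0.1:255.255.0.0:node1.gsi.de:ib0:off"
cfg = parse_ip_arg(example)
for name in FIELDS:
    print(f"{name:10s} = {cfg[name]!r}")
```

The empty second field is the unused peer address, and the last field (off) disables autoconfiguration, so the address is applied statically on ib0.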

This works as long as the version is 8.8, but with 8.9, the machine starts up, pulls the kernel and initrd from the boot server,
runs through the boot process until it hits

[  OK  ] Reached target Network (Pre).
[  OK  ] Started cancel waiting for multipath siblings of sda.
IPv6: ADDRCONF(NETDEV_UP): ib0: link is not ready
IPv6: ADDRCONF(NETDEV_CHANGE): ib0: link becomes ready

which is followed by an endless repetition of

 IPv6: ADDRCONF(NETDEV_UP): enp17s0f0/1/2/3: link is not ready

The latter message is correct: these Ethernet interfaces are not connected.
But the InfiniBand interface ib0 is connected and working.

A Rocky 8.8 installation also pauses for about 5 minutes at this stage, but then continues.

Did anything change between 8.8 and 8.9 in how the network devices are brought up within the initrd?

Cheers
Thomas

I looked up what happens at this stage in a successful rocky 8.8 installation:

INFO NetworkManager:<info>  [1705062018.4710] device (ib0): carrier: link connected
INFO NetworkManager:<info>  [1705062018.4712] device (ib0): state change: unavailable -> disconnected (reason 'carrier-changed', sys-iface-state: 'managed')
INFO NetworkManager:<info>  [1705062018.4715] policy: auto-activating connection 'ib0' (b36dea13-3639-4e72-b587-97fcdfc23179)
INFO NetworkManager:<info>  [1705062018.4720] device (ib0): Activation: starting connection 'ib0' (b36dea13-3639-4e72-b587-97fcdfc23179)
INFO NetworkManager:<info>  [1705062018.4720] device (ib0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
12:20:18,472 INFO NetworkManager:<info>  [1705062018.4721] manager: NetworkManager state is now CONNECTING
INFO kernel:IPv6: ADDRCONF(NETDEV_UP): ib0: link is not ready
INFO NetworkManager:<info>  [1705062019.2019] device (ib0): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
INFO NetworkManager:<info>  [1705062019.2024] device (ib0): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')
INFO NetworkManager:<info>  [1705062019.2027] policy: set 'ib0' (ib0) as default for IPv4 routing and DNS
INFO NetworkManager:<info>  [1705062019.2098] device (ib0): carrier: link connected
INFO NetworkManager:<info>  [1705062019.2099] device (ib0): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
INFO NetworkManager:<info>  [1705062019.2100] device (ib0): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed')
INFO NetworkManager:<info>  [1705062019.2100] device (ib0): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
INFO NetworkManager:<info>  [1705062019.2101] manager: NetworkManager state is now CONNECTED_SITE
INFO NetworkManager:<info>  [1705062019.2102] device (ib0): Activation: successful, device activated.
12:20:19,210 INFO NetworkManager:<info>  [1705062019.2103] manager: NetworkManager state is now CONNECTED_GLOBAL
INFO NetworkManager:<info>  [1705062571.5144] manager: startup complete
INFO NetworkManager:<info>  [1705062571.5144] quitting now that startup is complete
INFO NetworkManager:<info>  [1705062571.5155] manager: NetworkManager state is now CONNECTED_SITE
INFO NetworkManager:<info>  [1705062571.5156] exiting (success)
INFO dracut-initqueue:anaconda: stage2 locations are: http://10.20.3.214/rocky/8.8/BaseOS/x86_64/os/
 INFO dracut-initqueue:anaconda: fetching stage2 from http://10.20.3.214/rocky/8.8/BaseOS/x86_64/os/

So, no surprises: The NetworkManager in the initrd finally kicks in, gets the network interface up and anaconda can take over.
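
To put a number on the pause, the epoch timestamps in the log can simply be subtracted. This is a small sketch using the two values copied from the lines above; the variable names are mine. It only measures the gap and does not show what NetworkManager was waiting for during that time.

```python
# Epoch timestamps copied from the NetworkManager log lines in this post.
connected_global = 1705062019.2103  # "NetworkManager state is now CONNECTED_GLOBAL"
startup_complete = 1705062571.5144  # "manager: startup complete"

gap_seconds = startup_complete - connected_global
gap_minutes = gap_seconds / 60

print(f"ib0 was already up for {gap_seconds:.1f} s "
      f"({gap_minutes:.1f} min) before startup completed")
```

In this particular log the gap comes out at roughly 552 seconds, so a bit over nine minutes passed between ib0 being activated and NetworkManager reporting startup complete.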

Solved it by trial and error.
The reworked ipxe file reads

set repo ${mirror}/rocky/${version}/BaseOS/x86_64/os/
set kickstart inst.ks.device=ib0 inst.ks=${http}/rocky/kickstart/${target}

set route 10.20.0.0/16:10.20.0.1:ib0
set dns 10.10.24.192
set gateway 10.20.0.1
set netmask 255.255.0.0
set ip_config ip=${net0.dhcp/ip}::${gateway}:${netmask}:${net0.dhcp/hostname}.gsi.de:ib0:off nameserver=${dns} rd.route=${route} ipv6.disable=1

set install_options inst.repo=${repo} inst.text inst.sshd

kernel ${repo}/images/pxeboot/vmlinuz initrd=initrd.img ${ip_config} ${install_options} ${kickstart}
initrd ${repo}/images/pxeboot/initrd.img

So I have removed the explicit call for IB kernel modules and the timeout, and the kickstart device is now an *inst.*ks.device.
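
To see exactly which options differ between the two kernel lines, they can be compared by option name. The following Python sketch does that. The two strings are the kernel arguments from the original and the reworked iPXE file with the iPXE variables left unexpanded, and the ip= value is replaced by a placeholder, so this compares names only, not values.

```python
def option_names(cmdline):
    """Return the set of option names (the text before '=') in a command line."""
    return {token.split("=", 1)[0] for token in cmdline.split()}


# Kernel arguments from the original (8.8-style) iPXE file.
old = (
    "initrd=initrd.img inst.repo=${repo} inst.text inst.sshd "
    "inst.ks=${kickstart} ip=<static-config> ks.device=ib0 "
    "rd.driver.post=mlx4_ib,mlx5_ib,ib_ipoib,ib_umad,rdma_ucm "
    "rd.neednet=1 rd.net.timeout.carrier=600 "
    "nameserver=${dns} rd.route=${route}"
)

# Kernel arguments from the reworked file, with ${ip_config},
# ${install_options} and ${kickstart} written out by hand.
new = (
    "initrd=initrd.img ip=<static-config> nameserver=${dns} "
    "rd.route=${route} ipv6.disable=1 "
    "inst.repo=${repo} inst.text inst.sshd "
    "inst.ks.device=ib0 inst.ks=${http}/rocky/kickstart/${target}"
)

removed = sorted(option_names(old) - option_names(new))
added = sorted(option_names(new) - option_names(old))

print("removed:", removed)
print("added:  ", added)
```

Besides the module list and the carrier timeout, the comparison also lists rd.neednet as removed and ipv6.disable as added. Since several options changed at once, the diff alone does not tell which of them made the difference.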

I just wonder whether the other options should get the inst. prefix, too, as in inst.ip or inst.nameserver.

No. See "Chapter 17. Boot options" in the Red Hat Enterprise Linux 8 documentation on the Red Hat Customer Portal.
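
As a rough rule of thumb behind that answer: the inst. prefix marks options read by the Anaconda installer, while ip=, nameserver= and the rd.* options are dracut options handled in the initrd, and they keep their names. The Python sketch below sorts the options from this thread that way. The function name and the three category labels are my own, and the mapping is a simplification based on the prefixes only, not an official classification.

```python
def consumer(option):
    """Guess which boot component reads an option, judging by its name only."""
    name = option.split("=", 1)[0]
    if name.startswith("inst."):
        return "anaconda"
    if name.startswith("rd.") or name in ("ip", "nameserver"):
        return "dracut"
    return "kernel/other"


options = [
    "inst.repo=${repo}",
    "inst.ks.device=ib0",
    "ip=<static-config>",
    "nameserver=10.10.24.192",
    "rd.route=10.20.0.0/16:10.20.0.1:ib0",
    "ipv6.disable=1",
]

for opt in options:
    print(f"{consumer(opt):12s} {opt}")
```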