Kernel panic on Ryzen 9 7950X on reboot or shutdown

Has anyone faced a recurring kernel panic on Rocky Linux 9.1 with AMD Ryzen 9 CPUs?

The machine crashes fairly often, most frequently on shutdown or reboot. On shutdown, the kernel panics and the machine eventually just reboots. On reboot the behaviour is the same, it just takes a long time.

uname -r
5.14.0-162.18.1.el9_1.x86_64

cat /etc/os-release
NAME="Rocky Linux"
VERSION="9.1 (Blue Onyx)"

The screen shows the trace below for about a minute, and then the machine reboots.

vmcore-dmesg.txt:

[87505.449080] Call Trace:
[87505.449543]  tb_nvm_free+0x17/0x50
[87505.450018]  tb_switch_remove+0x120/0x1b0
[87505.450480]  icm_stop+0x21/0x60
[87505.450939]  tb_domain_remove+0x32/0x60
[87505.451389]  nhi_remove+0x47/0x60
[87505.451830]  pci_device_shutdown+0x34/0x60
[87505.452255]  device_shutdown+0x159/0x1b0
[87505.452686]  __do_sys_reboot.cold+0x2f/0x5b
[87505.453116]  ? devkmsg_write.cold+0x24/0x48
[87505.453570]  ? do_iter_readv_writev+0x152/0x1b0
[87505.454003]  ? vfs_writev+0xcb/0x170
[87505.454435]  do_syscall_64+0x5c/0x90
[87505.454860]  ? do_syscall_64+0x69/0x90
[87505.455292]  ? do_writev+0x6f/0x120
[87505.455723]  ? syscall_exit_to_user_mode+0x12/0x30
[87505.456158]  ? do_syscall_64+0x69/0x90
[87505.456595]  ? exc_page_fault+0x62/0x150
[87505.457032]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[87505.457468] RIP: 0033:0x7fba0ad15187
[87505.457905] Code: 0b 00 f7 d8 64 89 02 b8 ff ff ff ff eb b8 0f 1f 44 00 00 f3 0f 1e fa 89 fa be 69 19 12 28 bf ad de e1 fe b8 a9 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 69 4c 0b 00 f7 d8 64 89 02 b8
[87505.458846] RSP: 002b:00007ffd14ab2898 EFLAGS: 00000202 ORIG_RAX: 00000000000000a9
[87505.459326] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fba0ad15187
[87505.459809] RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead
[87505.460291] RBP: 00007ffd14ab29f0 R08: 0000000000000000 R09: 00007ffd14ab1c90
[87505.460773] R10: 00007ffd14ab1e50 R11: 0000000000000202 R12: 0000000000000001
[87505.461254] R13: 00007ffd14ab28f0 R14: 000055c389a2fef0 R15: 000055c388d1f07c
[87505.461725] Modules linked in: snd_seq_dummy snd_hrtimer tls rpcsec_gss_krb5 nls_utf8 nfsv4 cifs cifs_arc4 nfs rdma_cm iw_cm ib_cm lockd grace ib_core fscache netfs cifs_md4 dns_resolver nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink qrtr bnep vfat fat nvidia_drm(POE) nvidia_modeset(POE) nvidia_uvm(POE) nvidia(POE) snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi mt7921e snd_hda_intel intel_rapl_msr intel_rapl_common snd_intel_dspcfg snd_intel_sdw_acpi mt7921_common snd_hda_codec mt76_connac_lib mt76 snd_hda_core btusb edac_mce_amd btrtl snd_hwdep mac80211 btbcm snd_seq kvm_amd btintel snd_seq_device libarc4 btmtk snd_pcm kvm bluetooth cfg80211 drm_kms_helper asus_nb_wmi eeepc_wmi irqbypass asus_wmi snd_timer syscopyarea joydev rapl sparse_keymap wmi_bmof intel_wmi_thunderbolt pcspkr i2c_piix4 sysfillrect
[87505.461755]  snd rfkill sysimgblt soundcore fb_sys_fops i2c_designware_platform acpi_cpufreq i2c_designware_core auth_rpcgss sunrpc drm xfs libcrc32c ahci libahci crct10dif_pclmul crc32_pclmul crc32c_intel libata atlantic nvme ghash_clmulni_intel ccp igc sp5100_tco nvme_core macsec t10_pi wmi video dm_mirror dm_region_hash dm_log dm_mod fuse
[87505.467527] CR2: 000000000000031c

You may have to experiment with the iommu kernel command-line parameters to work around this on the distro default kernel.
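
As a rough sketch of trying an iommu parameter, something like the following could be used. The value iommu=pt is only an assumption on my part (amd_iommu=off is another candidate), and I can't say whether either one fixes this particular panic. The run helper and the APPLY variable are made up for illustration, so the script only prints what it would do unless you set APPLY=1 as root.

```shell
#!/bin/sh
# Sketch: add an IOMMU parameter to the kernel command line with grubby.
# Assumption: iommu=pt is just one value to try; override with IOMMU_ARG=...
IOMMU_ARG="${IOMMU_ARG:-iommu=pt}"

# Hypothetical helper: print the command by default, run it only if APPLY=1.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Add the parameter to the boot entry of every installed kernel.
run grubby --update-kernel=ALL --args="$IOMMU_ARG"

# To undo the change later:
#   grubby --update-kernel=ALL --remove-args="iommu=pt"
# After rebooting, check that the parameter is active:
#   cat /proc/cmdline
```

Changing one parameter at a time and rebooting in between makes it easier to tell which setting, if any, made a difference.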

The other option is to install a more recent kernel from elrepo, though I'm not sure that is the correct source. I have an older Ryzen processor that threw a lot of errors (not kernel panics) under Rocky Linux 8. When I switched to Rocky Linux 9 they went away, which I assume is due to the more recent kernel version.
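
If you go the elrepo route, the steps would look roughly like this. I'm assuming the elrepo-release package is available from the enabled Rocky repositories and that kernel-ml (the mainline kernel package in the elrepo-kernel repo) is the one you want, so check the elrepo instructions first. As before, the run helper and APPLY variable are just illustration, and nothing is changed unless APPLY=1 is set as root.

```shell
#!/bin/sh
# Sketch: install a newer mainline kernel from ELRepo on Rocky Linux 9.
# Assumption: elrepo-release is installable from the enabled repositories.

# Hypothetical helper: print the command by default, run it only if APPLY=1.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Set up the ELRepo repository definitions and signing key.
run dnf install -y elrepo-release

# Install the mainline kernel from the elrepo-kernel repo, which is
# disabled by default and therefore enabled only for this command.
run dnf --enablerepo=elrepo-kernel install -y kernel-ml

# After rebooting into the new kernel, confirm the version:
#   uname -r
```

The distro kernel stays installed, so you can still pick it from the boot menu if the newer one causes trouble, for example with the proprietary nvidia modules shown in your module list.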