tbuck
November 17, 2022, 6:46pm
1
Hey,
I just updated my Supermicro server from Rocky 8.6 to 8.7.
The server OS runs on a 240G NVMe drive (no RAID). It has a Seagate -4U106 JBOD attached via mini SAS3, and the JBOD is multipathed. The JBOD is set up with a ZFS filesystem (KMOD selection). The initial configuration of the zpool and raidz was all good. After the update this morning, the system booted to the GRUB loader but never reached the OS. I am, however, able to boot the 4.18.0-372.32.1.el8_6.x86_64 kernel, which is the one that was installed before the update, both with the JBOD connected and powered up and with the JBOD powered down.
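For anyone landing here in the same state: one way to stay on the working kernel while this gets sorted out is to pin it as the default boot entry. A minimal sketch, assuming `grubby` is installed (it normally is on EL8) and the kernel image sits under `/boot`; the function names and the `APPLY` switch are made up for illustration. It only prints the command unless `APPLY=1` is set:

```shell
# Sketch: pin the last known-good kernel as the default GRUB entry.
# Assumptions: grubby is available and kernel images live in /boot.

# Build the vmlinuz path for a given kernel release string.
kernel_image_path() {
    printf '/boot/vmlinuz-%s\n' "$1"
}

# Print the grubby command by default; run it only when APPLY=1 is set.
pin_default_kernel() {
    image="$(kernel_image_path "$1")"
    if [ "${APPLY:-0}" = "1" ]; then
        grubby --set-default "$image"
    else
        echo "grubby --set-default $image"
    fi
}

# Dry run for the kernel that still boots on this server.
pin_default_kernel "4.18.0-372.32.1.el8_6.x86_64"
```

Running `grubby --default-kernel` afterwards should show which image will be booted next.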
Any ideas on what may be the issue would be greatly appreciated…
Thanks!
This was previously reported by someone else as well. Take a look here:
opened 07:07PM - 16 Nov 22 UTC
Type: Defect
### System information
Type | Version/Name
--- | ---
Distribution Name | Rocky Linux
Distribution Version | 8.7
Kernel Version | 4.18.0-425.3.1.el8
Architecture | x86_64
OpenZFS Version | 2.1.6
### Describe the problem you're observing
kmod-zfs-2.1.6-1.el8.x86_64 fails to load and generates a CPU soft lockup
### Describe how to reproduce the problem
Boot without ZFS installed
<pre>
[root@stashcache ~]# yum install zfs
ZFS on Linux for EL8 KMOD 2.9 MB/s | 3.0 kB 00:00
Dependencies resolved.
=============================================================================================================
Package Architecture Version Repository Size
=============================================================================================================
Installing:
zfs x86_64 2.1.6-1.el8 zfs-kmod 660 k
Installing dependencies:
kmod-zfs x86_64 2.1.6-1.el8 zfs-kmod 1.5 M
libnvpair3 x86_64 2.1.6-1.el8 zfs-kmod 37 k
libuutil3 x86_64 2.1.6-1.el8 zfs-kmod 32 k
libzfs5 x86_64 2.1.6-1.el8 zfs-kmod 230 k
libzpool5 x86_64 2.1.6-1.el8 zfs-kmod 1.3 M
Transaction Summary
=============================================================================================================
Install 6 Packages
Total size: 3.7 M
Installed size: 14 M
Is this ok [y/N]: y
Downloading Packages:
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Preparing : 1/1
Installing : libnvpair3-2.1.6-1.el8.x86_64 1/6
Installing : libuutil3-2.1.6-1.el8.x86_64 2/6
Installing : libzfs5-2.1.6-1.el8.x86_64 3/6
Installing : libzpool5-2.1.6-1.el8.x86_64 4/6
Installing : zfs-2.1.6-1.el8.x86_64 5/6
Running scriptlet: zfs-2.1.6-1.el8.x86_64 5/6
Installing : kmod-zfs-2.1.6-1.el8.x86_64 6/6
Running scriptlet: kmod-zfs-2.1.6-1.el8.x86_64 6/6
Running scriptlet: zfs-2.1.6-1.el8.x86_64 6/6
Running scriptlet: kmod-zfs-2.1.6-1.el8.x86_64 6/6
Verifying : kmod-zfs-2.1.6-1.el8.x86_64 1/6
Verifying : libnvpair3-2.1.6-1.el8.x86_64 2/6
Verifying : libuutil3-2.1.6-1.el8.x86_64 3/6
Verifying : libzfs5-2.1.6-1.el8.x86_64 4/6
Verifying : libzpool5-2.1.6-1.el8.x86_64 5/6
Verifying : zfs-2.1.6-1.el8.x86_64 6/6
Installed:
kmod-zfs-2.1.6-1.el8.x86_64 libnvpair3-2.1.6-1.el8.x86_64 libuutil3-2.1.6-1.el8.x86_64
libzfs5-2.1.6-1.el8.x86_64 libzpool5-2.1.6-1.el8.x86_64 zfs-2.1.6-1.el8.x86_64
Complete!
</pre>
<pre>
[root@stashcache ~]# modprobe zfs
Message from syslogd@stashcache.ldas.cit at Nov 16 10:59:12 ...
kernel:watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [modprobe:80625]
Message from syslogd@stashcache.ldas.cit at Nov 16 10:59:40 ...
kernel:watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [modprobe:80625]
...
</pre>
### Include any warning/errors/backtraces from the system logs
<pre>
[root@stashcache ~]# tail -f /var/log/messages
...
Nov 16 10:59:12 stashcache.ldas.cit kernel: watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [modprobe:80625]
Nov 16 10:59:12 stashcache.ldas.cit kernel: Modules linked in: spl(OE+) nfsv3 nfs_acl nfs lockd grace fscache 8021q garp mrp stp llc nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nf_tables_set nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink sunrpc intel_rapl_msr intel_rapl_common isst_if_common skx_edac iTCO_wdt iTCO_vendor_support x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl ipmi_ssif intel_cstate mlx5_ib ib_uverbs pcspkr ib_core intel_uncore joydev mei_me i2c_i801 lpc_ich mei ioatdma acpi_ipmi ipmi_si dax_pmem_compat ipmi_devintf device_dax ipmi_msghandler dax_pmem_core acpi_pad acpi_power_meter binfmt_misc xfs libcrc32c raid1 nd_pmem nd_btt sd_mod sg mlx5_core ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect mpt3sas sysimgblt fb_sys_fops drm_ttm_helper ttm ahci nvme raid_class libahci mlxfw pci_hyperv_intf
Nov 16 10:59:12 stashcache.ldas.cit kernel: drm ixgbe scsi_transport_sas nvme_core crc32c_intel libata tls t10_pi psample mdio dca nfit libnvdimm dm_mirror dm_region_hash dm_log dm_mod [last unloaded: zunicode]
Nov 16 10:59:12 stashcache.ldas.cit kernel: CPU: 0 PID: 80625 Comm: modprobe Kdump: loaded Tainted: P OE --------- - - 4.18.0-425.3.1.el8.x86_64 #1
Nov 16 10:59:12 stashcache.ldas.cit kernel: Hardware name: Supermicro SYS-2029U-TN24R4T/X11DPU, BIOS 3.8 08/19/2022
Nov 16 10:59:12 stashcache.ldas.cit kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x17b/0x1c0
Nov 16 10:59:12 stashcache.ldas.cit kernel: Code: 74 22 48 89 c1 0f 0d 08 eb 20 f3 90 8b 07 85 c0 75 f8 f0 0f b1 17 75 f2 65 ff 0d 6c 5e 0d 75 e9 1b b6 aa 00 31 c9 eb 02 f3 90 <8b> 07 66 85 c0 75 f7 41 89 c0 66 45 31 c0 44 39 c6 74 20 c6 07 01
Nov 16 10:59:12 stashcache.ldas.cit kernel: RSP: 0018:ffff9b606e40fc30 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
Nov 16 10:59:12 stashcache.ldas.cit kernel: RAX: 0000000000040001 RBX: 0000000000000002 RCX: 0000000000000000
Nov 16 10:59:12 stashcache.ldas.cit kernel: RDX: ffff8ccd8082bcc0 RSI: 0000000000040000 RDI: ffff8c9f079b3cd0
Nov 16 10:59:12 stashcache.ldas.cit kernel: RBP: ffff8c9f079b3cd0 R08: ffff9b606e40fbd8 R09: ffff8c9f079b2000
Nov 16 10:59:12 stashcache.ldas.cit kernel: R10: ffff8c9f410a6b40 R11: 0000000000000246 R12: 0000000000400001
Nov 16 10:59:12 stashcache.ldas.cit kernel: R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000008000
Nov 16 10:59:12 stashcache.ldas.cit kernel: FS: 00007f4272058740(0000) GS:ffff8ccd80800000(0000) knlGS:0000000000000000
Nov 16 10:59:12 stashcache.ldas.cit kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 16 10:59:12 stashcache.ldas.cit kernel: CR2: 00007f4271034f80 CR3: 000000010f4be003 CR4: 00000000007706f0
Nov 16 10:59:12 stashcache.ldas.cit kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Nov 16 10:59:12 stashcache.ldas.cit kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Nov 16 10:59:12 stashcache.ldas.cit kernel: PKRU: 55555554
Nov 16 10:59:12 stashcache.ldas.cit kernel: Call Trace:
Nov 16 10:59:12 stashcache.ldas.cit kernel: tsd_hash_search.isra.4+0x7e/0x90 [spl]
Nov 16 10:59:12 stashcache.ldas.cit kernel: tsd_create+0x8b/0x160 [spl]
Nov 16 10:59:12 stashcache.ldas.cit kernel: ? 0xffffffffc0790000
Nov 16 10:59:12 stashcache.ldas.cit kernel: spl_taskq_init+0x2d/0x180 [spl]
Nov 16 10:59:12 stashcache.ldas.cit kernel: spl_init+0x193/0x1000 [spl]
Nov 16 10:59:12 stashcache.ldas.cit kernel: do_one_initcall+0x46/0x1d0
Nov 16 10:59:12 stashcache.ldas.cit kernel: ? do_init_module+0x22/0x230
Nov 16 10:59:12 stashcache.ldas.cit kernel: ? kmem_cache_alloc_trace+0x142/0x280
Nov 16 10:59:12 stashcache.ldas.cit kernel: do_init_module+0x5a/0x230
Nov 16 10:59:12 stashcache.ldas.cit kernel: load_module+0x14bf/0x17f0
Nov 16 10:59:12 stashcache.ldas.cit kernel: ? __do_sys_finit_module+0xb1/0x110
Nov 16 10:59:12 stashcache.ldas.cit kernel: __do_sys_finit_module+0xb1/0x110
Nov 16 10:59:12 stashcache.ldas.cit kernel: do_syscall_64+0x5b/0x1b0
Nov 16 10:59:12 stashcache.ldas.cit kernel: entry_SYSCALL_64_after_hwframe+0x61/0xc6
Nov 16 10:59:12 stashcache.ldas.cit kernel: RIP: 0033:0x7f4270f6b91d
Nov 16 10:59:12 stashcache.ldas.cit kernel: Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 3b 55 38 00 f7 d8 64 89 01 48
Nov 16 10:59:12 stashcache.ldas.cit kernel: RSP: 002b:00007fff8ab90548 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
Nov 16 10:59:12 stashcache.ldas.cit kernel: RAX: ffffffffffffffda RBX: 000055a7230c2900 RCX: 00007f4270f6b91d
Nov 16 10:59:12 stashcache.ldas.cit kernel: RDX: 0000000000000000 RSI: 000055a7214a08b6 RDI: 0000000000000003
Nov 16 10:59:12 stashcache.ldas.cit kernel: RBP: 000055a7214a08b6 R08: 0000000000000000 R09: 0000000000000000
Nov 16 10:59:12 stashcache.ldas.cit kernel: R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000000
Nov 16 10:59:12 stashcache.ldas.cit kernel: R13: 000055a7230c28b0 R14: 0000000000040000 R15: 0000000000000000
Message from syslogd@stashcache.ldas.cit at Nov 16 10:59:40 ...
kernel:watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [modprobe:80625]
Nov 16 10:59:40 stashcache.ldas.cit kernel: watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [modprobe:80625]
Nov 16 10:59:40 stashcache.ldas.cit kernel: Modules linked in: spl(OE+) nfsv3 nfs_acl nfs lockd grace fscache 8021q garp mrp stp llc nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nf_tables_set nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink sunrpc intel_rapl_msr intel_rapl_common isst_if_common skx_edac iTCO_wdt iTCO_vendor_support x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl ipmi_ssif intel_cstate mlx5_ib ib_uverbs pcspkr ib_core intel_uncore joydev mei_me i2c_i801 lpc_ich mei ioatdma acpi_ipmi ipmi_si dax_pmem_compat ipmi_devintf device_dax ipmi_msghandler dax_pmem_core acpi_pad acpi_power_meter binfmt_misc xfs libcrc32c raid1 nd_pmem nd_btt sd_mod sg mlx5_core ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect mpt3sas sysimgblt fb_sys_fops drm_ttm_helper ttm ahci nvme raid_class libahci mlxfw pci_hyperv_intf
Nov 16 10:59:40 stashcache.ldas.cit kernel: drm ixgbe scsi_transport_sas nvme_core crc32c_intel libata tls t10_pi psample mdio dca nfit libnvdimm dm_mirror dm_region_hash dm_log dm_mod [last unloaded: zunicode]
Nov 16 10:59:40 stashcache.ldas.cit kernel: CPU: 0 PID: 80625 Comm: modprobe Kdump: loaded Tainted: P OEL --------- - - 4.18.0-425.3.1.el8.x86_64 #1
Nov 16 10:59:40 stashcache.ldas.cit kernel: Hardware name: Supermicro SYS-2029U-TN24R4T/X11DPU, BIOS 3.8 08/19/2022
Nov 16 10:59:40 stashcache.ldas.cit kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x17b/0x1c0
Nov 16 10:59:40 stashcache.ldas.cit kernel: Code: 74 22 48 89 c1 0f 0d 08 eb 20 f3 90 8b 07 85 c0 75 f8 f0 0f b1 17 75 f2 65 ff 0d 6c 5e 0d 75 e9 1b b6 aa 00 31 c9 eb 02 f3 90 <8b> 07 66 85 c0 75 f7 41 89 c0 66 45 31 c0 44 39 c6 74 20 c6 07 01
Nov 16 10:59:40 stashcache.ldas.cit kernel: RSP: 0018:ffff9b606e40fc30 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
Nov 16 10:59:40 stashcache.ldas.cit kernel: RAX: 0000000000040001 RBX: 0000000000000002 RCX: 0000000000000000
Nov 16 10:59:40 stashcache.ldas.cit kernel: RDX: ffff8ccd8082bcc0 RSI: 0000000000040000 RDI: ffff8c9f079b3cd0
Nov 16 10:59:40 stashcache.ldas.cit kernel: RBP: ffff8c9f079b3cd0 R08: ffff9b606e40fbd8 R09: ffff8c9f079b2000
Nov 16 10:59:40 stashcache.ldas.cit kernel: R10: ffff8c9f410a6b40 R11: 0000000000000246 R12: 0000000000400001
Nov 16 10:59:40 stashcache.ldas.cit kernel: R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000008000
Nov 16 10:59:40 stashcache.ldas.cit kernel: FS: 00007f4272058740(0000) GS:ffff8ccd80800000(0000) knlGS:0000000000000000
Nov 16 10:59:40 stashcache.ldas.cit kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 16 10:59:40 stashcache.ldas.cit kernel: CR2: 00007f4271034f80 CR3: 000000010f4be003 CR4: 00000000007706f0
Nov 16 10:59:40 stashcache.ldas.cit kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Nov 16 10:59:40 stashcache.ldas.cit kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Nov 16 10:59:40 stashcache.ldas.cit kernel: PKRU: 55555554
Nov 16 10:59:40 stashcache.ldas.cit kernel: Call Trace:
Nov 16 10:59:40 stashcache.ldas.cit kernel: tsd_hash_search.isra.4+0x7e/0x90 [spl]
Nov 16 10:59:40 stashcache.ldas.cit kernel: tsd_create+0x8b/0x160 [spl]
Nov 16 10:59:40 stashcache.ldas.cit kernel: ? 0xffffffffc0790000
Nov 16 10:59:40 stashcache.ldas.cit kernel: spl_taskq_init+0x2d/0x180 [spl]
Nov 16 10:59:40 stashcache.ldas.cit kernel: spl_init+0x193/0x1000 [spl]
Nov 16 10:59:40 stashcache.ldas.cit kernel: do_one_initcall+0x46/0x1d0
Nov 16 10:59:40 stashcache.ldas.cit kernel: ? do_init_module+0x22/0x230
Nov 16 10:59:40 stashcache.ldas.cit kernel: ? kmem_cache_alloc_trace+0x142/0x280
Nov 16 10:59:40 stashcache.ldas.cit kernel: do_init_module+0x5a/0x230
Nov 16 10:59:40 stashcache.ldas.cit kernel: load_module+0x14bf/0x17f0
Nov 16 10:59:40 stashcache.ldas.cit kernel: ? __do_sys_finit_module+0xb1/0x110
Nov 16 10:59:40 stashcache.ldas.cit kernel: __do_sys_finit_module+0xb1/0x110
Nov 16 10:59:40 stashcache.ldas.cit kernel: do_syscall_64+0x5b/0x1b0
Nov 16 10:59:40 stashcache.ldas.cit kernel: entry_SYSCALL_64_after_hwframe+0x61/0xc6
Nov 16 10:59:40 stashcache.ldas.cit kernel: RIP: 0033:0x7f4270f6b91d
Nov 16 10:59:40 stashcache.ldas.cit kernel: Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 3b 55 38 00 f7 d8 64 89 01 48
Nov 16 10:59:40 stashcache.ldas.cit kernel: RSP: 002b:00007fff8ab90548 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
Nov 16 10:59:40 stashcache.ldas.cit kernel: RAX: ffffffffffffffda RBX: 000055a7230c2900 RCX: 00007f4270f6b91d
Nov 16 10:59:40 stashcache.ldas.cit kernel: RDX: 0000000000000000 RSI: 000055a7214a08b6 RDI: 0000000000000003
Nov 16 10:59:40 stashcache.ldas.cit kernel: RBP: 000055a7214a08b6 R08: 0000000000000000 R09: 0000000000000000
Nov 16 10:59:40 stashcache.ldas.cit kernel: R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000000
Nov 16 10:59:40 stashcache.ldas.cit kernel: R13: 000055a7230c28b0 R14: 0000000000040000 R15: 0000000000000000
</pre>
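To see at a glance how often the watchdog fired and for which task, the log can be summarized rather than read trace by trace. A rough sketch, assuming a syslog-style file such as `/var/log/messages`; the function name is hypothetical:

```shell
# Sketch: count "soft lockup" watchdog reports per stuck task.
# Assumption: the file uses the syslog format shown in the traces above.

# Print one line per task, formatted as "<count> <task:pid>".
summarize_soft_lockups() {
    grep -o 'soft lockup - CPU#[0-9]* stuck for [0-9]*s! \[[^]]*]' "$1" \
        | sed 's/.*\[\(.*\)]$/\1/' \
        | sort | uniq -c | sed 's/^ *//'
}

# Usage (needs read access to the log):
#   summarize_soft_lockups /var/log/messages
```

A task that keeps reappearing with the same PID, as `modprobe:80625` does here, suggests one process stuck on the same lock rather than several separate failures.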
I’m not sure what the solution is yet.
Thanks!
tbuck
November 17, 2022, 6:58pm
3
Hey,
Thanks for getting back to me on this and looking into it. After I submitted this, I saw that report, which looks a lot like what I am seeing.
Thank you!
Is zfs supported on Rocky 8.x?
tbuck
November 17, 2022, 7:37pm
5
Yes, as far as I am aware it is. Here is a link for installing on Rocky. I am not positive about the 8.7 build though, as going from 8.5 → 8.6 was no problem.
https://docs.rockylinux.org/books/lxd_server/02-zfs_setup/
iwalker
November 17, 2022, 7:39pm
6
Well, the install comes from a third-party repo (zfsonlinux): 1 Install and Configuration - Documentation, which means you can install it. As for supported: if the ZFS module doesn’t work with the kernel you are trying to use it with, I guess that means asking zfsonlinux to address those issues. They would need to ensure that the module their repositories provide works on the system you are installing it on. Maybe that’s why it works on the earlier, pre-8.7 kernels: they haven’t prepared it for the newer ones yet.
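A quick way to tell which side of that line a given kernel falls on is to look at the build number in its release string. A minimal sketch based only on what is reported in this thread (build 372 loads the module, build 425 locks up); the threshold, the labels, and the function names are assumptions, not anything published by zfsonlinux:

```shell
# Sketch: classify a kernel release by its EL build number.
# Assumption: 425 (the 8.7 kernel) is the first build reported failing here.

# Extract the build number, e.g. 372 from 4.18.0-372.32.1.el8_6.x86_64.
kernel_build() {
    echo "$1" | sed -n 's/^[0-9.]*-\([0-9]*\)\..*/\1/p'
}

# Label a kernel according to the reports in this thread.
zfs_kmod_status() {
    build="$(kernel_build "$1")"
    if [ -z "$build" ]; then
        echo "unknown"
    elif [ "$build" -ge 425 ]; then
        echo "reported-failing-or-untested"
    else
        echo "reported-working"
    fi
}

# Check the running kernel.
zfs_kmod_status "$(uname -r)"
```

Comparing this against `rpm -q kmod-zfs` shows which module package is being paired with the kernel in question.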
dali
November 17, 2022, 7:58pm
7
tbuck
November 17, 2022, 8:07pm
8
That is kind of what I was thinking at first. It is very possible that the zfsonlinux repo was not ready for the 8.7 release. I am now having issues booting the prior kernel (4.18.0-372.32.1.el8_6.x86_64) with the JBOD powered up, but when the JBOD is powered down it boots with no problem.
Thanks for the links, I will review them when I have a chance…