NFS-Ganesha segfault in svc_rqst_rearm_events() on Rocky Linux 9 with Ceph

Hello,

I’m encountering a critical issue with NFS-Ganesha running natively on Rocky Linux 9. Specifically, I’m using NFS-Ganesha in conjunction with Ceph (v18.2.7), and I’m experiencing random segfaults when handling NFS client connections.

After analysis, the segfault consistently occurs inside the function:
svc_rqst_rearm_events()

Looking deeper, it seems related to the changes discussed here:

The patch in PR #322 appears to address this exact problem, but applying it on Rocky Linux 9 is proving quite difficult:

  • I tried patching the current ntirpc source, but the code layout has changed and the patch fails.
  • Building ntirpc from source on Rocky 9 is complicated due to missing development packages like liburcu-devel or libnsl2-devel.

So far, I haven’t found a precompiled package of ntirpc with this fix, and rebuilding the full Ceph NFS stack with a custom ntirpc is not a trivial option.

Has anyone successfully applied this patch or solved this issue on Rocky Linux 9?

  • Are there compatible updated ntirpc or ganesha RPMs available with this fix?
  • Any recommendations to work around or mitigate the segfault until an official fix is available?

Any help or guidance would be greatly appreciated.

Thanks in advance!

Here’s an excerpt of the crash log:

Program terminated with signal SIGSEGV, Segmentation fault.
#0 svc_rqst_rearm_events (xprt=0x7f3c0001f710, ev_flags=0x2) at svc_rqst.c:372
#1 0x00007f3d2b5adbc5 in some_function_triggering_event_rearm (…)
from /usr/lib64/libntirpc.so.4
#2 0x00007f3d2b5ac4e0 in svc_rqst_epoll_loop ()
from /usr/lib64/libntirpc.so.4
#3 0x00007f3d2b5ace99 in svc_rqst_run ()
from /usr/lib64/libntirpc.so.4
#4 0x00007f3d2c1c6b97 in nfs_init_and_run ()
from /usr/lib64/libganesha_nfsd.so

[Tue Aug 5 14:34:05 2025] ganesha.nfsd[2065277]: segfault at 50 ip 00007f3e8423032e sp 00007f3cd57f9210 error 4 in libntirpc.so.5.8[7f3e84215000+2c000] likely on CPU 27 (core 3, socket 0)
[Tue Aug 5 14:34:05 2025] Code: 47 20 66 41 89 86 f2 00 00 00 41 bf 01 00 00 00 b9 40 00 00 00 e9 af fd ff ff 66 90 48 8b 85 f8 00 00 00 48 8b 40 08 4c 8b 28 <45> 8b 65 50 49 8b 75 68 41 8b be 28 02 00 00 b9 40 00 00 00 e8 29
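
The kernel segfault line above can be decoded into an offset inside libntirpc, which helps when matching the crash against a specific build. The following is a minimal sketch, not an official tool: the function name `decode_segfault` and the regular expression are my own, and the meaning of the error bits assumes the standard x86 page-fault error code layout.

```python
import re

# Kernel log line copied from the excerpt above (timestamp and CPU info dropped).
LINE = ("ganesha.nfsd[2065277]: segfault at 50 ip 00007f3e8423032e "
        "sp 00007f3cd57f9210 error 4 in libntirpc.so.5.8[7f3e84215000+2c000]")

# Hypothetical pattern for the "segfault at ... in lib[base+size]" format.
PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<lib>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return the faulting address, library offset, and decoded error bits."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"], 16)
    return {
        "library": m["lib"],
        "fault_address": int(m["addr"], 16),
        # Offset of the faulting instruction relative to the mapping base.
        "offset_in_library": ip - base,
        # x86 page-fault error code bits (assumed standard layout):
        "page_not_present": not (err & 0x1),  # bit 0 clear: page not mapped
        "write_access": bool(err & 0x2),      # bit 1 set: write, clear: read
        "user_mode": bool(err & 0x4),         # bit 2 set: fault in user space
    }


if __name__ == "__main__":
    info = decode_segfault(LINE)
    print(info["library"], hex(info["offset_in_library"]))
    print("fault address:", hex(info["fault_address"]))
    print("read of unmapped page in user mode:",
          info["page_not_present"] and not info["write_access"]
          and info["user_mode"])
```

For this log the offset works out to 0x1b32e (0x7f3e8423032e minus 0x7f3e84215000), and error 4 with a fault address of 0x50 indicates a user-mode read at offset 0x50 from a NULL pointer. This agrees with the marked instruction bytes `<45> 8b 65 50`, which decode to a 32-bit load from `[r13+0x50]`. With matching debug symbols installed, the offset can be passed to a tool such as `addr2line -e` on the same libntirpc build to find the source line.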

Note: Although I initially mentioned a “native deployment”, I am actually running Ganesha inside a Docker container, not directly on the host system. The container is based on the default Ceph NFS-Ganesha 18.2.7 container image.

Hello,

We had similar issues. I sent a patch to the package maintainers, who incorporated it.
The fix should be included in the next release.

https://gitlab.com/CentOS/storage/libntirpc/-/blob/c9s-sig-storage-nfs-ganesha-5/0001-src-svc_vc.c.patch?ref_type=heads

Best
Lars
