Hi everyone,
for the past two months or so, some of my Rocky Linux 9.4 systems have occasionally crashed, frozen, or had processes OOM-killed during the night.
The logs show that dnf makecache started seconds before each crash (triggered by the systemd dnf-makecache.timer) but never reached the point where the completion message "Metadata cache created." is logged.
/var/log/messages
Sep 19 06:33:02 hostname systemd[1]: Starting dnf makecache...
Sep 19 06:33:02 hostname dnf[62376]: Amazon Corretto 41 kB/s | 2.9 kB 00:00
Sep 19 06:33:02 hostname dnf[62376]: Rocky Linux 9 - Base 1.4 MB/s | 4.1 kB 00:00
Sep 19 06:33:02 hostname dnf[62376]: Rocky Linux 9 - Base 44 MB/s | 2.3 MB 00:00
Sep 19 06:33:02 hostname dnf[62376]: Rocky Linux 9 - AppStream 1.4 MB/s | 4.5 kB 00:00
Sep 19 06:33:03 hostname dnf[62376]: Rocky Linux 9 - AppStream 33 MB/s | 8.0 MB 00:00
Sep 19 06:33:04 hostname dnf[62376]: Rocky Linux 9 - Extras 1.0 MB/s | 2.9 kB 00:00
Sep 19 06:33:04 hostname dnf[62376]: Rocky Linux 9 - HighAvailability 1.7 MB/s | 4.0 kB 00:00
Sep 19 06:33:04 hostname dnf[62376]: Extra Packages for Enterprise Linux 9 - x86_64 1.4 MB/s | 4.3 kB 00:00
Sep 19 06:33:04 hostname dnf[62376]: Extra Packages for Enterprise Linux 9 - x86_64 52 MB/s | 23 MB 00:00
/var/log/dnf.log
2024-09-19T06:33:04+0200 DEBUG reviving: failed for 'localmirror-epel', mismatched repomd.
2024-09-19T06:33:04+0200 DEBUG repo: downloading from remote: localmirror-epel
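To check other hosts for the same pattern, something like the following could help. It is only a rough sketch: it assumes the start line contains "Starting dnf makecache" and the completion line contains "Metadata cache created.", as in the excerpt above, and the function name find_unfinished_runs is made up for this example.

```python
START = "Starting dnf makecache"
DONE = "Metadata cache created."


def find_unfinished_runs(lines):
    """Return the start lines of makecache runs that never logged completion."""
    unfinished = []
    pending = None  # start line of the run currently in progress, if any
    for line in lines:
        line = line.rstrip("\n")
        if START in line:
            # A new run begins while the previous one never completed.
            if pending is not None:
                unfinished.append(pending)
            pending = line
        elif DONE in line:
            pending = None
    # A run still open at the end of the log is reported as well.
    if pending is not None:
        unfinished.append(pending)
    return unfinished


# Example usage (the path is the one from this post, adjust as needed):
# with open("/var/log/messages", errors="replace") as fh:
#     for start in find_unfinished_runs(fh):
#         print(start)
```

Note that a run which is still in progress when the log is read will also show up, so the last hit may be a false positive.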
So far I know that:
- This only affects Rocky Linux 9.4 systems.
- None of the affected systems has much free memory to spare (maybe 500-800 MB), but all of them ran fine in the months and years before (some since a few weeks after the initial release of Rocky 9.0).
This leads me to the conclusion that
- either there is a memory leak in the dnf makecache operation,
- or something has significantly changed in the way dnf makecache operates.
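To put a number on the memory question, the peak resident memory of a manual run can be compared against the 500-800 MB of headroom. A minimal sketch, assuming Linux (where ru_maxrss is reported in kilobytes); the helper name peak_child_rss_kib is hypothetical, and the value returned is the largest peak among all child processes this Python process has waited for so far, so it is best run in a fresh interpreter:

```python
import resource
import subprocess


def peak_child_rss_kib(cmd):
    """Run cmd to completion and return (exit_code, peak RSS of children in KiB)."""
    proc = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    # RUSAGE_CHILDREN covers children that have terminated and been waited for.
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return proc.returncode, usage.ru_maxrss


# Example usage (needs root, and may itself trigger the OOM situation):
# code, rss = peak_child_rss_kib(["dnf", "makecache", "--refresh"])
# print(f"exit={code} peak_rss={rss / 1024:.0f} MiB")
```

Running this on an affected 9.4 host and on an unaffected one would show whether the peak really differs, though it cannot tell a leak apart from a legitimately larger working set.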
Have some of you observed similar issues in the past months?
I’ve also found the thread "RL9 dnf OOMs with EPEL but only on one machine!" (#6 by iwalker), which looks like it could be the same issue.
Update: I’ve checked the dnf.rpm.log. Judging by the dates, this could be related to the following dnf updates from May:
2024-05-13T23:17:35+0200 SUBDEBUG Upgrade: libdnf-0.69.0-8.el9.x86_64
2024-05-13T23:18:10+0200 SUBDEBUG Upgrade: python3-libdnf-0.69.0-8.el9.x86_64
2024-05-13T23:18:11+0200 SUBDEBUG Upgrade: dnf-data-4.14.0-9.el9.noarch
2024-05-13T23:18:11+0200 SUBDEBUG Upgrade: python3-dnf-4.14.0-9.el9.noarch
2024-05-13T23:18:12+0200 SUBDEBUG Upgrade: dnf-4.14.0-9.el9.noarch
2024-05-13T23:18:12+0200 SUBDEBUG Upgrade: python3-dnf-plugins-core-4.3.0-13.el9.noarch
2024-05-13T23:18:12+0200 SUBDEBUG Upgrade: dnf-plugins-core-4.3.0-13.el9.noarch
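For comparing update dates across several machines, the relevant lines can be pulled out of dnf.rpm.log with a few lines of Python. This is a sketch that assumes the "timestamp SUBDEBUG Upgrade: package" layout shown above; the function name dnf_upgrades and the "dnf" substring filter are choices made for this example only.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2024-05-13T23:17:35+0200 SUBDEBUG Upgrade: libdnf-0.69.0-8.el9.x86_64
LINE = re.compile(r"^(\S+) SUBDEBUG Upgrade: (\S+)$")


def dnf_upgrades(lines, needle="dnf"):
    """Return (timestamp, package) pairs for upgrades whose name contains needle."""
    result = []
    for line in lines:
        match = LINE.match(line.strip())
        if not match:
            continue
        stamp, package = match.groups()
        if needle in package:
            when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S%z")
            result.append((when, package))
    return result


# Example usage (the location of dnf.rpm.log is an assumption):
# with open("/var/log/dnf.rpm.log", errors="replace") as fh:
#     for when, package in dnf_upgrades(fh):
#         print(when.date(), package)
```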