System clock wrong on boot

Hi.

We have a number of RL8 deployments that always log warnings similar to the following on boot:

Dec 07 16:55:14 localhost chronyd[875]: System clock wrong by 2.229643 seconds
Dec 07 16:55:16 localhost chronyd[875]: System clock was stepped by 2.229643 seconds
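To compare the size of the step across boots and hosts, the messages above can be pulled out of the journal. This is a minimal sketch: `chrony_steps` is a hypothetical helper name, and it assumes the log lines keep the syslog-style layout shown above.

```shell
# Print the timestamp and size of every clock step that chronyd reported.
# Reads syslog-style lines on stdin, so it works on journalctl output or a saved log.
chrony_steps() {
    awk '/chronyd\[[0-9]+\]: System clock was stepped by/ {
        # The step size is the second-to-last field ("... by 2.229643 seconds").
        print $1, $2, $3, $(NF-1)
    }'
}

# Typical use on a live system (assumes the unit is named chronyd):
#   journalctl -u chronyd --no-pager | chrony_steps
```

Run against the two lines above, this prints only the "was stepped" entry with its 2.229643 second offset.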

The anomaly is always around 2 seconds. The only other common factor I can find is that they all use kvm-clock as the clock source before switching to tsc, whereas the systems that don’t have this problem use tsc straight away:

[    0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
[    0.000000] kvm-clock: using sched offset of 34029268534 cycles
[    0.000000] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[    0.000000] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1910969940391419 ns
[    0.001000] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 30580167144 ns
[    0.001000] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x240933eba6e, max_idle_ns: 440795246008 ns
[    0.018057] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
[    0.108005] PTP clock support registered
[    0.127018] clocksource: Switched to clocksource kvm-clock
[    0.150980] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
[    0.604369] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x240933eba6e, max_idle_ns: 440795246008 ns
[    0.606334] clocksource: Switched to clocksource tsc
[    0.849162] sched_clock: Marking stable (849157167, 0)->(1717868158, -868710991)
[    0.912771] rtc_cmos 00:00: setting system clock to 2022-12-07 16:54:56 UTC (1670432096)
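A quick way to compare this across hosts is to list only the clocksource switches from the kernel log. The following is a sketch, with `clocksource_switches` as a hypothetical helper name; it assumes the dmesg line format shown above.

```shell
# List the clocksource switches found in kernel log text read from stdin,
# one per line as "<seconds since boot> <clocksource>".
clocksource_switches() {
    sed -n 's/^\[ *\([0-9.]*\)] clocksource: Switched to clocksource \(.*\)$/\1 \2/p'
}

# Typical use:
#   dmesg | clocksource_switches
# The clocksource in use right now can also be read from sysfs:
#   cat /sys/devices/system/clocksource/clocksource0/current_clocksource
```

For the log above, this should report kvm-clock at 0.127018 followed by tsc at 0.606334.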

Most of the systems are AWS EC2 instances, but it also includes other virtualised environments that use KVM. We don’t ever seem to get the warnings logged on systems that are built on dedicated hardware or other virtualised environments such as VMware ESX.

Does anyone know what the cause, and hopefully fix, is?

Thanks in advance.

Hello @invadetl, I’m no expert on this at all, so I’m hoping someone else will chime in, but did you take a look at this and verify that it is set up correctly? "Set the time for your Linux instance" - Amazon Elastic Compute Cloud

I realize that the issue isn’t ONLY on AWS, but it might lead you in the right direction. Not sure.

Hi. We are using the default chrony configuration, which uses the following pool:

pool 2.rocky.pool.ntp.org iburst

I may be wrong, but I don’t think NTP is the problem here. I believe the warnings are telling us that chrony had to correct the system clock because it was set incorrectly on boot from the hardware clock.

I did originally raise this with Amazon, thinking it was caused solely by their infrastructure. We went round in circles a bit, and then I discovered it was also happening on non-AWS instances.

I did find this Red Hat knowledge base article but we don’t have a subscription so I can’t tell if it is related or not:

Sign up for the free developer subscription. It will give you access to all the KB articles.

Hi. I asked Red Hat about this. The Developer subscription is only for individuals, and we are a commercial business. I also asked about paying for a subscription just to get access to their knowledge base, but they said this wasn’t possible if we weren’t going to be using RHEL.

The recommendation is to set up auditing (auditd), to track any process that makes a change to the system time.

but it also includes other virtualised environments that use KVM. We don’t ever seem to get the warnings logged on systems that are built on dedicated hardware or other virtualised environments such as VMware ESX

That sentence is not very clear. Which exact environments work, and which don’t?

Hi. I will set up auditing as follows and monitor:

auditctl -a exit,always -F arch=b64 -S clock_settime -S adjtimex -S settimeofday -k clock-adjusted
auditctl -a exit,always -F arch=b32 -S clock_settime -S adjtimex -S settimeofday -k clock-adjusted
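Rules added with auditctl at runtime do not survive a reboot, and the step being investigated happens at boot, so the rules may need to be made persistent. A minimal sketch follows, assuming the stock /etc/audit/rules.d/ layout; `clock_audit_rules` and the file name are hypothetical. Whether the rules are active before chronyd makes its first adjustment depends on service start order, which is not verified here.

```shell
# Emit the same two rules in the format used by files under /etc/audit/rules.d/.
clock_audit_rules() {
    local arch
    for arch in b64 b32; do
        printf '%s\n' "-a always,exit -F arch=$arch -S clock_settime -S adjtimex -S settimeofday -k clock-adjusted"
    done
}

# To install as root (the file name is an arbitrary choice):
#   clock_audit_rules > /etc/audit/rules.d/clock-adjusted.rules
#   augenrules --load
#   auditctl -l    # confirm the rules are loaded
```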

As far as I am aware, from the point chrony starts it keeps the system clock in sync. The warning only appears on boot, at the point chrony starts, even if the system has simply been rebooted.

I used the “virt-what” command to determine the underlying infrastructure. The following do log the warnings:
aws & kvm
kvm

The following don’t:
null (i.e. hardware)
vmware
nutanix_ahv

Hi. Finally got round to doing some auditing. The only thing that changes the system clock while the system is booted is chronyd:

time->Fri Jan 27 12:11:36 2023
type=PROCTITLE msg=audit(1674821496.856:1394): proctitle="/usr/sbin/chronyd"
type=TIME_INJOFFSET msg=audit(1674821496.856:1394): sec=3 nsec=574592853
type=TIME_ADJNTPVAL msg=audit(1674821496.856:1394): op=freq old=31722831872000 new=32425246720000
type=SYSCALL msg=audit(1674821496.856:1394): arch=c000003e syscall=159 success=yes exit=0 a0=7ffeb3675140 a1=0 a2=2710 a3=6330a items=0 ppid=1 pid=967 auid=4294967295 uid=996 gid=993 euid=996 suid=996 fsuid=996 egid=993 sgid=993 fsgid=993 tty=(none) ses=4294967295 comm="chronyd" exe="/usr/sbin/chronyd" subj=system_u:system_r:chronyd_t:s0 key="adjtime"
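To summarise which executables touched the clock, the SYSCALL records can be counted per `exe` value. This is a sketch: `clock_changers` is a hypothetical helper name, and it expects raw ausearch output in the format shown above (the interpreted `-i` format may quote fields differently).

```shell
# Count audited clock-changing syscalls per executable.
# Reads raw audit records on stdin and prints "<count> <exe path>" lines.
clock_changers() {
    awk '/type=SYSCALL/ && match($0, /exe="[^"]*"/) {
        # Strip the leading exe=" (5 chars) and the trailing quote.
        exe = substr($0, RSTART + 5, RLENGTH - 6)
        count[exe]++
    }
    END { for (e in count) print count[e], e }'
}

# Typical use, with the key that was given to auditctl -k:
#   ausearch -k clock-adjusted | clock_changers
```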

What I’ve noticed is that the RTC is not being kept in sync on a number of our systems. For example, from the timedatectl command:

physical
               Local time: Fri 2023-01-27 14:53:59 CET
           Universal time: Fri 2023-01-27 13:53:59 UTC
                 RTC time: Fri 2023-01-27 13:53:59
                Time zone: Europe/Brussels (CET, +0100)
System clock synchronized: yes
              NTP service: active
          RTC in local TZ: no

vmware
               Local time: Fri 2023-01-27 14:54:10 CET
           Universal time: Fri 2023-01-27 13:54:10 UTC
                 RTC time: Fri 2023-01-27 13:54:10
                Time zone: Europe/Copenhagen (CET, +0100)
System clock synchronized: yes
              NTP service: active
          RTC in local TZ: no

kvm
               Local time: Fri 2023-01-27 14:54:12 CET
           Universal time: Fri 2023-01-27 13:54:12 UTC
                 RTC time: Fri 2023-01-27 13:54:11
                Time zone: Europe/Oslo (CET, +0100)
System clock synchronized: yes
              NTP service: active
          RTC in local TZ: no

nutanix_ahv
               Local time: Fri 2023-01-27 14:54:13 CET
           Universal time: Fri 2023-01-27 13:54:13 UTC
                 RTC time: Fri 2023-01-27 14:54:11
                Time zone: Europe/Stockholm (CET, +0100)
System clock synchronized: yes
              NTP service: active
          RTC in local TZ: no

aws
               Local time: Fri 2023-01-27 13:51:29 GMT
           Universal time: Fri 2023-01-27 13:51:29 UTC
                 RTC time: Fri 2023-01-27 13:51:03
                Time zone: Europe/London (GMT, +0000)
System clock synchronized: yes
              NTP service: active
          RTC in local TZ: no

This explains why chronyd has to step the system clock on boot on the systems where the RTC is not accurate.
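The gap between the system clock and the RTC can be computed from the timedatectl output rather than read by eye. A minimal sketch, assuming the RTC is kept in UTC ("RTC in local TZ: no") and that `date` accepts `-d` with a "YYYY-MM-DD HH:MM:SS" string; `rtc_offset` is a hypothetical helper name.

```shell
# Print (Universal time - RTC time) in whole seconds from timedatectl output on stdin.
# A positive value means the RTC is behind the system clock.
rtc_offset() {
    local input utc rtc
    input=$(cat)
    # Keep only "YYYY-MM-DD HH:MM:SS" from each line (drop the weekday and the zone suffix).
    utc=$(printf '%s\n' "$input" | awk '/Universal time:/ {print $(NF-2), $(NF-1)}')
    rtc=$(printf '%s\n' "$input" | awk '/RTC time:/ {print $(NF-1), $NF}')
    echo $(( $(date -u -d "$utc" +%s) - $(date -u -d "$rtc" +%s) ))
}

# Typical use:
#   timedatectl | rtc_offset
```

For the aws output above, this gives 26 seconds. A difference of about one second may just be an artefact of the two values being sampled at slightly different moments.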

The chronyd configuration is identical on all these systems, so I’m not sure why the RTC is not being kept in sync when the virtualisation is kvm, nutanix_ahv or aws.