Slow SSH connection

Dear community,

I have a small HPC cluster running under Rocky Linux 8.5.

When I connect via ssh between the different nodes, the connection is sometimes extremely slow, especially when the node’s CPU is at 100% usage: it takes about 30-60 seconds to connect. Even when the node is completely idle, the connection takes several seconds.

I’m sure it’s not a network problem: we have a dedicated 10G LAN.

High CPU usage seems to be a factor, but our old cluster based on CentOS 6.5 does not have this problem: even with the CPU at 100%, the connection is instantaneous.

Does anyone have any ideas or suggestions on how to solve this?

Thank you.

Best regards,
rax

When you type ‘ssh’, quite a lot happens behind the scenes.

Try running the client in verbose mode (ssh -v, up to -vvv); you may see it “pause” at a specific point and then continue.
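
As a rough sketch of how to spot the pause: the helper below prefixes each debug line with the seconds elapsed, so a long gap between two consecutive lines shows where the client is waiting. stamp_lines is a made-up helper name and node01 is a placeholder host. date +%s only gives whole-second resolution, which should be enough for delays of several seconds.

```shell
# Prefix every line read from stdin with the seconds elapsed since start.
# (stamp_lines is a hypothetical helper, not part of OpenSSH.)
stamp_lines() {
    start=$(date +%s)
    while IFS= read -r line; do
        now=$(date +%s)
        printf '%4ds %s\n' "$((now - start))" "$line"
    done
}

# Usage (node01 is a placeholder; ssh writes debug output to stderr,
# hence the 2>&1):
#   ssh -vvv node01 true 2>&1 | stamp_lines
```

The last debug line printed before the gap usually names the step that is stalling.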

Run ping first to confirm that ping is fast while ssh is slow.
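
To put numbers on that comparison, something like the following could work. elapsed is a hypothetical helper and node01 a placeholder host name. BatchMode=yes only stops ssh from waiting at a password prompt, which would distort the timing.

```shell
# Print the wall-clock seconds a command takes (whole-second resolution).
# (elapsed is a hypothetical helper; the command's own output is discarded.)
elapsed() {
    t0=$(date +%s)
    "$@" >/dev/null 2>&1
    t1=$(date +%s)
    echo "$((t1 - t0))"
}

# Usage (node01 is a placeholder):
#   ping -c 3 node01
#   elapsed ssh -o BatchMode=yes node01 true
```

If ping round trips are well under a millisecond while the ssh login takes seconds, the delay is in session setup rather than on the wire.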

Is reverse IP lookup working for the nodes?

Make sure forward and reverse host lookups work on both the client and the server.

host -av $HOSTNAME
host -av $IP

Any inconsistency or delay here can add ssh startup latency.
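
A small sketch for checking the round trip on each node, assuming glibc's getent is installed. getent goes through the resolver order in /etc/nsswitch.conf (so /etc/hosts entries count), whereas host queries DNS directly. check_lookup is a made-up name and node01 a placeholder.

```shell
# Resolve a name to an address, then resolve that address back to a name.
# (check_lookup is a hypothetical helper; assumes getent is available.)
check_lookup() {
    name=$1
    # First address returned for the name.
    ip=$(getent hosts "$name" | awk '{print $1; exit}')
    if [ -z "$ip" ]; then
        echo "no forward record for $name"
        return 1
    fi
    # Canonical name returned for that address.
    back=$(getent hosts "$ip" | awk '{print $2; exit}')
    echo "$name -> $ip -> ${back:-<no reverse record>}"
}

# Usage (node01 is a placeholder):
#   time check_lookup node01
```

Run it on both the client and the server. A name that does not map back to itself, or a call that takes noticeably long, points at the lookup path.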

In /etc/ssh/sshd_config on the server, set:
UseDNS no

update-crypto-policies --show

Take a look at /usr/share/crypto-policies/opensshserver…