Performance or bottleneck analysis tools

Minseok · April 16, 2023, 8:08am

Hi,

My system has four 40GbE ports. Four threads are pinned to four cores using pthread_setaffinity_np(). Each thread has its own UDP socket, uses an independently allocated buffer, and sends data with sendto(). One thread on its own achieves a send throughput of around 5GB/s, but when all four threads run simultaneously the throughput drops to around 4GB/s per thread.
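For readers who want to reproduce the measurement, the setup described above can be sketched roughly as follows. This is only a minimal sketch, not the original program: the names `struct sender`, `sender_init`, `run_senders`, and `gbytes_per_sec` are hypothetical, and the 8192-byte payload and the batch of 256 sends between clock checks are assumptions, since the post states neither.

```c
#define _GNU_SOURCE            /* needed for pthread_setaffinity_np and CPU_SET */
#include <arpa/inet.h>
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

#define PAYLOAD     8192       /* assumed datagram size; not stated in the post */
#define BATCH       256        /* sends between clock checks (assumption) */
#define MAX_SENDERS 64

struct sender {
    int cpu;                   /* core this thread is pinned to */
    struct sockaddr_in dst;    /* destination of this thread's socket */
    double seconds;            /* how long to keep sending */
    uint64_t bytes;            /* result: bytes accepted by sendto() */
    double elapsed;            /* result: measured wall-clock time */
};

static double now_sec(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Convert a byte count and a duration to GB/s (10^9 bytes per second). */
double gbytes_per_sec(uint64_t bytes, double seconds) {
    return seconds > 0.0 ? (double)bytes / 1e9 / seconds : 0.0;
}

/* Fill in one sender; returns 0 on success, -1 if ip is not a valid IPv4 address. */
int sender_init(struct sender *s, int cpu, const char *ip, uint16_t port, double seconds) {
    memset(s, 0, sizeof *s);
    s->cpu = cpu;
    s->seconds = seconds;
    s->dst.sin_family = AF_INET;
    s->dst.sin_port = htons(port);
    return inet_pton(AF_INET, ip, &s->dst.sin_addr) == 1 ? 0 : -1;
}

static void *send_loop(void *arg) {
    struct sender *s = arg;
    assert(s != NULL);
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(s->cpu, &set);
    /* Pin this thread to one core; on failure it simply runs unpinned. */
    (void)pthread_setaffinity_np(pthread_self(), sizeof set, &set);

    char buf[PAYLOAD];         /* per-thread buffer, independent of other threads */
    memset(buf, 0, sizeof buf);
    int fd = socket(AF_INET, SOCK_DGRAM, 0);   /* one UDP socket per thread */
    if (fd < 0) return NULL;

    double start = now_sec(), now = start;
    while (now - start < s->seconds) {
        for (int i = 0; i < BATCH; i++) {
            ssize_t n = sendto(fd, buf, sizeof buf, 0,
                               (struct sockaddr *)&s->dst, sizeof s->dst);
            if (n > 0) s->bytes += (uint64_t)n;
        }
        now = now_sec();
    }
    s->elapsed = now - start;
    close(fd);
    return NULL;
}

/* Run n senders concurrently; returns 0 if all threads were started, else -1. */
int run_senders(struct sender *s, int n) {
    pthread_t tid[MAX_SENDERS];
    if (n < 1 || n > MAX_SENDERS) return -1;
    int started = 0;
    for (; started < n; started++) {
        s[started].bytes = 0;
        s[started].elapsed = 0.0;
        if (pthread_create(&tid[started], NULL, send_loop, &s[started]) != 0) break;
    }
    for (int i = 0; i < started; i++) pthread_join(tid[i], NULL);
    return started == n ? 0 : -1;
}
```

Calling `run_senders` once with one sender and once with four, then printing `gbytes_per_sec(s[i].bytes, s[i].elapsed)` for each, gives the single-thread versus four-thread comparison from the post. Build with `-pthread`.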

I’d like to identify the cause of this performance degradation.
Are there any tools or methods for this?

Thank you.

Ticapsoriginal · April 16, 2023, 11:57am

In my view this may be related to the type of cores and to the concurrent and parallel programming model used. Single-threaded (ST) versus multi-threaded (MT) processing, the available infrastructure, and its limits all play a part, and the processing architecture may behave differently from the default. I use Gunicorn, and this is its guidance:

Gunicorn should only need 4-12 worker processes to handle hundreds or thousands of requests per second. Gunicorn relies on the operating system to provide all of the load balancing when handling requests. Generally we recommend (2 x $num_cores) + 1 as the number of workers to start off with.

The number of workers is not the only factor, though; the processing demand on each core also matters.
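As a worked example of the quoted rule of thumb, here is a minimal sketch in C. The function names `suggested_workers` and `online_cores` are hypothetical, and the rule is Gunicorn's starting point for web worker processes, so whether it carries over to pinned UDP sender threads is an open assumption.

```c
#include <unistd.h>

/* Gunicorn's suggested starting point: (2 x cores) + 1 workers. */
long suggested_workers(long cores) {
    return cores > 0 ? 2 * cores + 1 : 1;
}

/* Number of online cores as reported by the OS, or 1 if unavailable. */
long online_cores(void) {
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;
}
```

For the four cores in the original question, `suggested_workers(4)` evaluates to 9.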
