I have a 32-core CPU and the following output from top and sar -q. How can I decide whether my system is healthy or overloaded?
I am looking at this from the perspective of load average. Which readings can be considered healthy, and which are worrying?

sar -q

10:00:01 AM runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15
10:10:01 AM 16 3354 9.58 7.32 8.29

top

top - 14:00:00 up 16 days, 14:00, 1 user, load average: 7.43, 7.27, 7.23



As a rule of thumb, if the load average stays above the number of CPU cores, more tasks are waiting than the CPUs can run at once, and the system is overloaded.

For a 32-core system, a load average of 9 is no problem. If the load average consistently stays over 32, then there is cause for concern.
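The rule of thumb above can be sketched in a few lines of Python. This is only an illustration, not an official health check: the function names and the 1.0 per-core threshold are assumptions taken from the "load greater than number of CPUs" rule in this thread.

```python
import os


def load_per_core(load_avg: float, cores: int) -> float:
    """Return the load average divided by the number of cores."""
    if cores <= 0:
        raise ValueError("cores must be a positive integer")
    return load_avg / cores


def assess(load_avg: float, cores: int) -> str:
    """Classify a load average using the rule of thumb from this thread.

    Assumption: a per-core ratio above 1.0 means "overloaded".
    """
    return "ok" if load_per_core(load_avg, cores) <= 1.0 else "overloaded"


def current_assessment() -> str:
    """Assess the live system using the 15-minute average (sustained load).

    os.getloadavg() is only available on Unix-like systems.
    """
    cores = os.cpu_count() or 1
    _one, _five, fifteen = os.getloadavg()
    return assess(fifteen, cores)
```

For the numbers in the question, `assess(9.58, 32)` gives a per-core ratio of about 0.3, well under the threshold. Calling `current_assessment()` on the server itself applies the same check to the live 15-minute value.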

Thanks, Jamie, for the quick reply.
Can we use either of these two commands (top or sar -q) to read the load average values?
At a given point in time both should show the same values, shouldn't they?

You can use either command; both report the same load average, but they sample it at different times:

  • "top" shows the current values, refreshed in real time
  • "sar -q" reads the results of a cron job that collects various system metrics every 10 minutes (by default)

If you only care about the current load average, uptime also reports it, though with fewer details than top.
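If you want to compare readings from top or uptime in a script, the three load averages can be pulled out of the header line. The sketch below is a hypothetical helper (the name `parse_load_averages` is not from any tool), and it assumes the usual "load average: a, b, c" format with a dot as the decimal separator, as in the top output quoted earlier.

```python
import re

# Matches the "load average: 7.43, 7.27, 7.23" part of a top/uptime header.
# Assumption: dot decimal separator, as in the output quoted in this thread.
LOAD_RE = re.compile(r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)")


def parse_load_averages(line: str) -> tuple:
    """Return the (1-minute, 5-minute, 15-minute) load averages as floats."""
    match = LOAD_RE.search(line)
    if match is None:
        raise ValueError("no load average found in line")
    return tuple(float(value) for value in match.groups())


header = "top - 14:00:00 up 16 days, 14:00, 1 user, load average: 7.43, 7.27, 7.23"
one_min, five_min, fifteen_min = parse_load_averages(header)
```

Here `one_min`, `five_min`, and `fifteen_min` correspond to the ldavg-1, ldavg-5, and ldavg-15 columns that sar -q prints.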

Hi, I have 48 CPUs and my server's 1-minute load average is constantly between 16.xx and 19.xx. How can I determine whether my server is healthy, and how can I find out which processes are queued waiting for CPU?

10:56:29 AM runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15
10:56:31 AM 20 1946 16.90 19.84 21.91

Is there any recommendation or documentation on load average for Linux servers?