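I am monitoring a cPanel (CentOS) server with a 2-core CPU (4 virtual CPU cores). It appears to be overloaded, because top reports these values:
load average: 11.80, 13.30, 13.02
Cpu(s): 42.2%us, 11.7%sy, 0.0%ni, 35.6%id, 10.1%wa, 0.1%hi, 0.3%si, 0.0%st
However, when I look at the process list (with top or ps), no process is using more than 1% CPU.
Moreover, the per-process CPU usage (%) values sum to 4. Even if I assume the 0% values are rounded figures and replace each with 0.04 (which rounds to 0 at one decimal digit), the sum is 11 (still less than 100%).
How should I interpret this data correctly? Is there some hidden process overloading my CPU?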
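Solution
The load average counts processes by their state, not by the CPU percentage they use. The full list of process states from the ps man page is: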
D    Uninterruptible sleep (usually IO)
R    Running or runnable (on run queue)
S    Interruptible sleep (waiting for an event to complete)
T    Stopped, either by a job control signal or because it is being traced.
W    paging (not valid since the 2.6.xx kernel)
X    dead (should never be seen)
Z    Defunct ("zombie") process, terminated but not reaped by its parent.
Sample output
F S UID PID PPID  C PRI NI ADDR   SZ WCHAN  TTY     TIME CMD
4 S   0   1    0  0  80  0 -    4906 poll_s ?   00:00:23 init
1 S   0   2    0  0  80  0 -       0 kthrea ?   00:00:02 kthreadd
1 R   0   3    0 99  80  0 -       0        ?   01:00:02 runner
1 D   0   4    0  1  80  0 -       0        ?   01:00:02 loader
If these were your only processes, we would see a load of about 2: 1 for the CPU-hogging "runner" and 1 for the "loader" that is waiting for disk.
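The counting in that example can be sketched in a few lines of Python. This is only an illustration: the function name `load_contribution` and the `SAMPLE` text (the listing above, with whitespace collapsed) are made up for this sketch. It gives an instantaneous count, whereas the figures top prints are averages over 1, 5 and 15 minutes.

```python
def load_contribution(ps_output):
    """Count processes per state in `ps -l` style output.

    Returns (load, counts), where load is the number of processes in
    state R (running/runnable) or D (uninterruptible sleep).
    """
    lines = ps_output.strip().splitlines()
    # Find the state column from the header instead of hard-coding it.
    state_col = lines[0].split().index("S")
    counts = {}
    for line in lines[1:]:
        fields = line.split()
        if len(fields) <= state_col:
            continue  # skip blank or truncated lines
        state = fields[state_col]
        counts[state] = counts.get(state, 0) + 1
    return counts.get("R", 0) + counts.get("D", 0), counts


# Hypothetical input: the sample listing from this answer.
SAMPLE = """\
F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD
4 S 0 1 0 0 80 0 - 4906 poll_s ? 00:00:23 init
1 S 0 2 0 0 80 0 - 0 kthrea ? 00:00:02 kthreadd
1 R 0 3 0 99 80 0 - 0 ? 01:00:02 runner
1 D 0 4 0 1 80 0 - 0 ? 01:00:02 loader
"""

if __name__ == "__main__":
    load, counts = load_contribution(SAMPLE)
    print("load contribution:", load)
    print("processes per state:", counts)
```

For the sample listing this yields a load contribution of 2, with the two sleeping (S) processes adding nothing.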
Wikipedia describes this very precisely:
An idle computer has a load number of 0. Each process using or waiting for CPU (the ready queue or run queue) increments the load number by 1. Most UNIX systems count only processes in the running (on CPU) or runnable (waiting for CPU) states. However, Linux also includes processes in uninterruptible sleep states (usually waiting for disk activity), which can lead to markedly different results if many processes remain blocked in I/O due to a busy or stalled I/O system. This, for example, includes processes blocking due to an NFS server failure or to slow media (e.g., USB 1.x storage devices). Such circumstances can result in an elevated load average, which does not reflect an actual increase in CPU use (but still gives an idea on how long users have to wait).
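To see which processes are adding to the load on a live system, one option is to look for the R and D states directly. The following is a rough sketch that assumes a Linux /proc filesystem; the function names are hypothetical, and it inspects only the main thread of each process, so it can miss load caused by other threads.

```python
import os


def parse_stat_line(text):
    """Split one /proc/<pid>/stat line into (pid, command, state)."""
    # The command is wrapped in parentheses and may itself contain
    # spaces or ')' characters, so use the first '(' and the last ')'.
    open_paren = text.index("(")
    close_paren = text.rindex(")")
    pid = int(text[:open_paren])
    comm = text[open_paren + 1:close_paren]
    state = text[close_paren + 1:].split()[0]
    return pid, comm, state


def processes_in_states(states=("R", "D"), proc_root="/proc"):
    """Return (pid, command, state) for processes in the given states."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat_line(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or the line is unreadable
        if state in states:
            found.append((pid, comm, state))
    return found


if __name__ == "__main__":
    # One snapshot; repeat a few times to see which processes keep showing up.
    for pid, comm, state in processes_in_states():
        print(state, pid, comm)
```

Many entries in state D, together with a noticeable %wa value in the top header, would point to processes waiting on I/O rather than to a hidden CPU consumer.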