我们在Infiniband总线(ipoib)上运行CentOS 6.3 iscsi服务器(16GB RAM).
Sep 3 23:22:20 stor4 kernel: tgtd: page allocation failure. order:2,mode:0x20 Sep 3 23:22:20 stor4 kernel: Pid: 3637,comm: tgtd Not tainted 2.6.32 #1 Sep 3 23:22:20 stor4 kernel: Call Trace: Sep 3 23:22:20 stor4 kernel: [] ? __alloc_pages_nodemask+0x77f/0x940 Sep 3 23:22:20 stor4 kernel: [] ? kmem_getpages+0x62/0x170 Sep 3 23:22:20 stor4 kernel: [] ? fallback_alloc+0x1ba/0x270 Sep 3 23:22:20 stor4 kernel: [] ? cache_grow+0x2cf/0x320 Sep 3 23:22:20 stor4 kernel: [] ? ____cache_alloc_node+0x99/0x160 Sep 3 23:22:20 stor4 kernel: [] ? pskb_expand_head+0x64/0x270 Sep 3 23:22:20 stor4 kernel: [] ? __kmalloc+0x189/0x220 Sep 3 23:22:20 stor4 kernel: [] ? pskb_expand_head+0x64/0x270 Sep 3 23:22:20 stor4 kernel: [] ? __pskb_pull_tail+0x2aa/0x360 Sep 3 23:22:20 stor4 kernel: [] ? tcp_init_tso_segs+0x37/0x50 Sep 3 23:22:20 stor4 kernel: [] ? dev_queue_xmit+0x4bb/0x6f0 Sep 3 23:22:20 stor4 kernel: [] ? neigh_connected_output+0xbd/0x100 Sep 3 23:22:20 stor4 kernel: [] ? ip_finish_output+0x237/0x310 Sep 3 23:22:20 stor4 kernel: [] ? ip_output+0xb8/0xc0 Sep 3 23:22:20 stor4 kernel: [] ? __ip_local_out+0x9f/0xb0 Sep 3 23:22:20 stor4 kernel: [] ? ip_local_out+0x25/0x30 Sep 3 23:22:20 stor4 kernel: [] ? ip_queue_xmit+0x190/0x420 Sep 3 23:22:20 stor4 kernel: [] ? sock_aio_write+0x167/0x180 Sep 3 23:22:20 stor4 kernel: [] ? tcp_transmit_skb+0x3fe/0x7b0 Sep 3 23:22:20 stor4 kernel: [] ? tcp_write_xmit+0x1fb/0xa20 Sep 3 23:22:20 stor4 kernel: [] ? __tcp_push_pending_frames+0x30/0xe0 Sep 3 23:22:20 stor4 kernel: [] ? tcp_push_pending_frames+0x33/0x40 Sep 3 23:22:20 stor4 kernel: [] ? do_tcp_setsockopt+0x3d6/0x480 Sep 3 23:22:20 stor4 kernel: [] ? tcp_setsockopt+0x2a/0x30 Sep 3 23:22:20 stor4 kernel: [] ? sock_common_setsockopt+0x14/0x20 Sep 3 23:22:20 stor4 kernel: [] ? sys_setsockopt+0x7f/0xe0 Sep 3 23:22:20 stor4 kernel: [] ? system_call_fastpath+0x16/0x1b Sep 3 23:22:20 stor4 kernel: Mem-Info: Sep 3 23:22:20 stor4 kernel: Node 0 DMA per-cpu: Sep 3 23:22:20 stor4 kernel: cpu 0: hi: 0,btch: 1 usd: 0 Sep 3 23:22:20 stor4 kernel: cpu 1: hi: 0,btch: 1 usd: 0 Sep 3 23:22:20 stor4 kernel: cpu 2: hi: 0,btch: 1 usd: 0 Sep 3 23:22:20 stor4 kernel: cpu 3: hi: 0,btch: 1 usd: 0 Sep 3 23:22:20 stor4 kernel: Node 0 DMA32 per-cpu: Sep 3 23:22:20 stor4 kernel: cpu 0: hi: 186,btch: 31 usd: 183 Sep 3 23:22:20 stor4 kernel: cpu 1: hi: 186,btch: 31 usd: 23 Sep 3 23:22:20 stor4 kernel: cpu 2: hi: 186,btch: 31 usd: 183 Sep 3 23:22:20 stor4 kernel: cpu 3: hi: 186,btch: 31 usd: 181 Sep 3 23:22:20 stor4 kernel: Node 0 Normal per-cpu: Sep 3 23:22:20 stor4 kernel: cpu 0: hi: 186,btch: 31 usd: 171 Sep 3 23:22:20 stor4 kernel: cpu 1: hi: 186,btch: 31 usd: 29 Sep 3 23:22:20 stor4 kernel: cpu 2: hi: 186,btch: 31 usd: 32 Sep 3 23:22:20 stor4 kernel: cpu 3: hi: 186,btch: 31 usd: 32 Sep 3 23:22:20 stor4 kernel: active_anon:1875 inactive_anon:2473 isolated_anon:0 Sep 3 23:22:20 stor4 kernel: active_file:1243637 inactive_file:2505055 isolated_file:0 Sep 3 23:22:20 stor4 kernel: unevictable:0 dirty:268338 writeback:0 unstable:0 Sep 3 23:22:20 stor4 kernel: free:86050 slab_reclaimable:132377 slab_unreclaimable:23744 Sep 3 23:22:20 stor4 kernel: mapped:1293 shmem:222 pagetables:720 bounce:0 Sep 3 23:22:20 stor4 kernel: Node 0 DMA free:15732kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15332kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes Sep 3 23:22:20 stor4 kernel: lowmem_reserve[]: 0 2172 16060 16060 Sep 3 23:22:20 stor4 kernel: Node 0 DMA32 free:107544kB min:18268kB low:22832kB high:27400kB active_anon:468kB inactive_anon:2364kB active_file:566208kB inactive_file:976112kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2224900kB mlocked:0kB dirty:96816kB writeback:0kB mapped:908kB shmem:12kB slab_reclaimable:176940kB slab_unreclaimable:968kB kernel_stack:64kB pagetables:192kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no Sep 3 23:22:20 stor4 kernel: lowmem_reserve[]: 0 0 13887 13887 Sep 3 23:22:20 stor4 kernel: Node 0 Normal free:220924kB min:116772kB low:145964kB high:175156kB active_anon:7032kB inactive_anon:7528kB active_file:4408340kB inactive_file:9044108kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:14220800kB mlocked:0kB dirty:976536kB writeback:0kB mapped:4264kB shmem:876kB slab_reclaimable:352568kB slab_unreclaimable:94008kB kernel_stack:2048kB pagetables:2688kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no Sep 3 23:22:20 stor4 kernel: lowmem_reserve[]: 0 0 0 0 Sep 3 23:22:20 stor4 kernel: Node 0 DMA: 1*4kB 0*8kB 1*16kB 1*32kB 1*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15732kB Sep 3 23:22:20 stor4 kernel: Node 0 DMA32: 16305*4kB 4381*8kB 353*16kB 8*32kB 1*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 107900kB Sep 3 23:22:20 stor4 kernel: Node 0 Normal: 14548*4kB 14808*8kB 2420*16kB 31*32kB 5*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 220784kB Sep 3 23:22:20 stor4 kernel: 3748822 total pagecache pages Sep 3 23:22:20 stor4 kernel: 0 pages in swap cache Sep 3 23:22:20 stor4 kernel: Swap cache stats: add 0,delete 0,find 0/0 Sep 3 23:22:20 stor4 kernel: Free swap = 975864kB Sep 3 23:22:20 stor4 kernel: Total swap = 975864kB Sep 3 23:22:20 stor4 kernel: 4194303 pages RAM Sep 3 23:22:20 stor4 kernel: 126915 pages reserved Sep 3 23:22:20 stor4 kernel: 3753534 pages shared Sep 3 23:22:20 stor4 kernel: 213500 pages non-shared
TCP堆栈和VM配置:
net.core.rmem_max = 83886080 net.core.wmem_max = 83886080 net.core.rmem_default = 65536 net.core.wmem_default = 65536 net.ipv4.tcp_rmem = 40960 1048560 4194304 net.ipv4.tcp_wmem = 40960 196608 4194304 net.ipv4.tcp_mem = 16388608 16388608 16388608 vm.min_free_kbytes=135168
其他调整:
/sbin/blockdev --setra 16384 /dev/sdb echo 2048 > /sys/block/sdb/queue/nr_requests
问题可能在哪里?谢谢.
解决方法
你可以尝试一些事情……但IPoIB上的iSCSI听起来有点混乱.显然,如果您使用Infiniband,性能必须至关重要.
>错误之外,性能如何?
>这是可重复的吗?你可以按需触发它还是只是在dmesg环形缓冲区堆积的消息?
>您在已安装的iSCSI设备上使用了哪些文件系统?这可能会对我的建议产生影响.
无论如何,既然你在CentOS 6.3上,我会认真考虑启用tuned-adm profile套装.对于您,如果尚未安装,请运行yum install tuned tuned-utils并尝试“企业存储”配置文件:
tuned-adm profile企业存储
这会将您的I / O电梯移动到deadline scheduler,将kernel.sched_min_ granularity_ns更改为10ms,对vm子系统进行一些调整,删除写入障碍,修改cpu调控器并提升磁盘预读.您还可以将sysctl和sysfs设置移动到自定义配置文件.
恢复原始设置可以通过tuned-adm关闭完成.这些命令可以安全地即时运行.你能测试并报告吗?