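Occasionally we get input/output errors on one of our disks.
Our server (DELL PowerEdge R720, Ubuntu 14.04) uses a PERC H710 RAID controller, and the disks producing the errors are Dell 600GB SAS 6Gbps 15k 3.5″ disks.
We can always repair the errors with fsck.ext4, but we don't know what causes them.
We have updated the server firmware to the latest version and run every test we could think of.
What else can we do to find the root of the problem?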
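Edit:
About a week ago we contacted DELL. After they walked me through several tests, they concluded that the server is fine and that nothing abnormal showed up in the tests.
I can't enable SMART support for the device: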
    $ sudo smartctl -a /dev/sda
    smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-55-generic] (local build)
    Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF INFORMATION SECTION ===
    Vendor:               DELL
    Product:              PERC H710
    Revision:             3.13
    User Capacity:        1,199,101,181,952 bytes [1.19 TB]
    Logical block size:   512 bytes
    Logical Unit id:      0x6b8ca3a0f210dc0019eead8c1111fb0a
    Serial number:        000afb11118cadee1900dc10f2a0a38c
    Device type:          disk
    Local Time is:        Wed Jul 8 10:47:35 2015 IDT
    SMART support is:     Unavailable - device lacks SMART capability.

    === START OF READ SMART DATA SECTION ===
    Error Counter logging not supported
    Device does not support Self Test logging
I tried:
    $ sudo smartctl -s on /dev/sda
    smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-55-generic] (local build)
    Copyright (C) 2002-13, www.smartmontools.org

    === START OF ENABLE/DISABLE COMMANDS SECTION ===
    unable to fetch IEC (SMART) mode page [unsupported field in scsi command]
    A mandatory SMART command Failed: exiting. To continue, add one or more '-T permissive' options.
Also, I don't know what to make of this (Googling didn't help):
    $ sudo hdparm -I /dev/sda

    /dev/sda:
    SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0d 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

    ATA device, with non-removable media
    Standards:
        Likely used: 1
    Configuration:
        Logical         max     current
        cylinders       0       0
        heads           0       0
        sectors/track   0       0
        --
        Logical/Physical Sector size: 512 bytes
        device size with M = 1024*1024: 0 MBytes
        device size with M = 1000*1000: 0 MBytes
        cache/buffer size = unknown
    Capabilities:
        IORDY not likely
        Cannot perform double-word IO
        R/W multiple sector transfer: not supported
        DMA: not supported
        PIO: pio0
Any suggestions are welcome!
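You have a drive in a RAID that is misbehaving and producing occasional errors? That sounds like a hardware problem, and it will probably get worse. You should consider replacing the drive. Yes, it's expensive, but how much is your time worth, and how bad would it be if the whole drive went south at an inopportune moment?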
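Before swapping hardware, it may help to work out which physical drive is at fault. /dev/sda is the virtual disk exported by the PERC, which is probably why smartctl reports no SMART capability and why hdparm (an ATA tool) prints nonsense for it. smartctl can address disks behind MegaRAID-family controllers with `-d megaraid,N`. The sketch below only prints the commands to try. It assumes the H710 is driven by megaraid_sas and that the physical disk IDs fall in the range 0..3; both are assumptions to check on your machine, and `print_probe_cmds` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: list the smartctl invocations for probing each physical disk
# behind a MegaRAID-family controller. The device path and the ID range
# used below are assumptions, not values taken from the question.

# print_probe_cmds DEV MAX_ID
# Prints one "smartctl -a -d megaraid,N DEV" line for N = 0..MAX_ID.
print_probe_cmds() {
    dev=$1
    max_id=$2
    id=0
    while [ "$id" -le "$max_id" ]; do
        printf 'smartctl -a -d megaraid,%s %s\n' "$id" "$dev"
        id=$((id + 1))
    done
}

# Dry run: show the commands for IDs 0..3 on /dev/sda without running them.
print_probe_cmds /dev/sda 3
```

To actually run the probes, pipe the output into a root shell, for example `print_probe_cmds /dev/sda 3 | sudo sh`. For SAS disks, the error counter log and the grown defect list in the output are the parts worth comparing between drives.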