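SMART is reporting one pending sector on my server's hard disk. I have read many articles suggesting that hdparm can "easily" force the disk to reallocate the bad sector, but I can't find the correct way to use it.
Some information from my smartctl output: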
Error 95 occurred at disk power-on lifetime: 20184 hours (841 days + 0 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 d7 55 dd 02  Error: UNC at LBA = 0x02dd55d7 = 48059863

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 d6 55 dd e2 00  18d+05:13:42.421  READ DMA
  27 00 00 00 00 00 e0 00  18d+05:13:42.392  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 02  18d+05:13:42.378  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 02  18d+05:13:42.355  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 00  18d+05:13:42.327  READ NATIVE MAX ADDRESS EXT

SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure        90%            20194  48059863
# 2  Short offline       Completed without error        00%            15161  -
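Given the "bad LBA" (48059863), how do I use hdparm? What kind of address should the --read-sector and --write-sector parameters take?
If I issue the command hdparm --read-sector 48059863 /dev/sda, it reads and dumps data. If this command were correct, I would expect an I/O error, right?
Instead, it dumps data: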
$ ./hdparm --read-sector 48059863 /dev/sda

/dev/sda:
reading sector 48059863: succeeded
4b50 5d1b 7563 a932 618d 1f81 4514 2343
8a16 3342 5e36 2591 3b4e 762a 4dd7 037f
6a32 6996 816f 573f eee1 bc24 eed4 206e
(...)
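Solution
If for whatever reason you would rather try to clear these bad sectors, and you don't care about the existing contents of the drive, the shell snippet below may help. I tested it on an older Seagate Barracuda drive that is far past its warranty. It may not work with other drive models or manufacturers, but if you must script something, it should put you on the right path. It will destroy everything on the drive.
You may prefer to just run badblocks, an hdparm Secure Erase (SE) (https://wiki.archlinux.org/index.php/Securely_wipe_disk), or some other tool that is actually designed for this. Manufacturers also provide tools such as SeaTools (there is a 32-bit Linux 'enterprise' version; search for it).
Make sure the drive in question is completely unused and unmounted before doing this. Also, I know: while loops, no excuse. It's a hack, and you can make it better...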
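baddrive=/dev/sdb
badsect=1
while true; do
    echo "Testing from LBA $badsect"
    # Start a selective self-test from the last known bad sector to the end of the disk.
    smartctl -t select,"${badsect}"-max "${baddrive}" > /dev/null 2>&1
    echo "Waiting for test to stop (each dot is 5 sec)"
    # See the correction after the script for this loop condition and the exit test.
    while [ "$(smartctl -l selective "${baddrive}" | awk '/^ *1/{print substr($4,1,9)}')" != "Completed" ]; do
        echo -n .
        sleep 5
    done
    echo
    # Field 10 of the most recent (# 1) log entry is the LBA of the first error.
    badsect=$(smartctl -l selective "${baddrive}" | awk '/^# *1 +Selective offline +Completed: read failure/ {print $10}')
    [ "$badsect" = "-" ] && exit 0
    echo "Attempting to fix sector $badsect on $baddrive"
    # Overwrites the sector so the firmware can remap it; its contents are lost.
    hdparm --repair-sector "${badsect}" --yes-i-know-what-i-am-doing "$baddrive"
    echo "Continuing test"
done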
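One advantage of the self-test method is that the work is handled by the drive firmware, so the PC the drive is attached to is not loaded the way it would be with dd or badblocks.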
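The earlier warning about the drive being unused can be partly checked from a script. The following is only a rough sketch: `is_in_use` is a hypothetical helper name, it assumes a Linux-style mounts table (`/proc/mounts` by default), and it only looks at mounted filesystems, so swap, LVM, or RAID membership would still need to be checked by hand.

```shell
# Hypothetical helper (not part of the original script).
# Succeeds (exit status 0) if the given disk, or one of its partitions,
# appears as a mount source in the mounts table.
# Usage: is_in_use /dev/sdb [mounts_file]
is_in_use() {
    dev=$1
    mounts=${2:-/proc/mounts}   # assumption: Linux-style mounts table
    awk -v d="$dev" '
        # Field 1 is the mounted device. Accept the disk itself or a
        # partition suffix such as "1" or "p1", but not e.g. /dev/sdaa.
        index($1, d) == 1 {
            rest = substr($1, length(d) + 1)
            if (rest ~ /^(p?[0-9]+)?$/) found = 1
        }
        END { exit !found }
    ' "$mounts"
}
```

A guard such as `is_in_use "$baddrive" && { echo "$baddrive is mounted, aborting"; exit 1; }` could then go before the loop starts.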
Note: sorry, I made a mistake. The correct loop condition is:
while [ "$(smartctl -l selective "${baddrive}" | awk '/^ *1/{print $4}')" = "Self_test_in_progress" ]; do
and the exit condition of the script becomes:
[ "$badsect" = "-" ] || [ "$badsect" = "" ] && exit 0
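The extraction of the failing LBA and the exit test can also be folded into one small function. This is a sketch rather than a tested replacement: `first_bad_lba` is a hypothetical name, and it assumes the log rows look like the self-test log excerpt in the question, with the LBA as the last field of the `# 1` row.

```shell
# Hypothetical helper (not part of the original script).
# Reads a smartctl self-test log listing on stdin and prints the LBA of
# the most recent entry (# 1) if that entry ended in a read failure.
# Prints nothing if the latest test completed without error.
first_bad_lba() {
    awk '$1 == "#" && $2 == "1" && /Completed: read failure/ { print $NF; exit }'
}
```

In the loop, `badsect=$(smartctl -l selective "$baddrive" | first_bad_lba)` followed by `[ -z "$badsect" ] && exit 0` would then cover both the "-" case and the empty case, since nothing is printed unless the latest entry is a read failure.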