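SMART is reporting one pending sector on my server's hard disk. I have read many articles recommending hdparm to "easily" force the disk to relocate the bad sector, but I can't find the correct way to use it.
Some information from my smartctl output: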
- Error 95 occurred at disk power-on lifetime: 20184 hours (841 days + 0 hours)
- When the command that caused the error occurred,the device was active or idle.
- After command completion occurred,registers were:
- ER ST SC SN CL CH DH
- -- -- -- -- -- -- --
- 40 51 00 d7 55 dd 02 Error: UNC at LBA = 0x02dd55d7 = 48059863
- Commands leading to the command that caused the error were:
- CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
- -- -- -- -- -- -- -- -- ---------------- --------------------
- c8 00 08 d6 55 dd e2 00 18d+05:13:42.421 READ DMA
- 27 00 00 00 00 00 e0 00 18d+05:13:42.392 READ NATIVE MAX ADDRESS EXT
- ec 00 00 00 00 00 a0 02 18d+05:13:42.378 IDENTIFY DEVICE
- ef 03 46 00 00 00 a0 02 18d+05:13:42.355 SET FEATURES [Set transfer mode]
- 27 00 00 00 00 00 e0 00 18d+05:13:42.327 READ NATIVE MAX ADDRESS EXT
- SMART Self-test log structure revision number 1
- Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
- # 1 Extended offline Completed: read failure 90% 20194 48059863
- # 2 Short offline Completed without error 00% 15161 -
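As a cross-check on the address, the LBA in the error line can be rebuilt from the register dump. This is a minimal sketch, assuming the 28-bit ATA layout in which SN, CL and CH carry LBA bits 0-23 and the low nibble of DH carries bits 24-27; the function name `lba_from_regs` is made up for illustration.

```shell
# lba_from_regs SN CL CH DH
# Assemble a 28-bit LBA from the hex register bytes shown by smartctl.
# Assumption: 28-bit addressing, DH low nibble = LBA bits 24-27.
lba_from_regs() {
    sn=$1 cl=$2 ch=$3 dh=$4
    echo $(( ((0x$dh & 0x0f) << 24) | (0x$ch << 16) | (0x$cl << 8) | 0x$sn ))
}

# Registers after the failed command: SN=d7 CL=55 CH=dd DH=02
lba_from_regs d7 55 dd 02
```

The result is a plain decimal sector number, which matches the value smartctl prints next to `UNC at LBA`.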
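With the "bad LBA" (48059863), how do I use hdparm? What kind of address should the --read-sector and --write-sector parameters take?
If I issue the command hdparm --read-sector 48059863 /dev/sda, it reads and dumps data. If this command is right, I should expect an I/O error, shouldn't I?
Instead, it dumps the data: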
- $./hdparm --read-sector 48059863 /dev/sda
- /dev/sda:
- reading sector 48059863: succeeded
- 4b50 5d1b 7563 a932 618d 1f81 4514 2343
- 8a16 3342 5e36 2591 3b4e 762a 4dd7 037f
- 6a32 6996 816f 573f eee1 bc24 eed4 206e
- (...)
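Solution
If for some reason you would rather try to clear those bad sectors, and you do not care about the existing contents of the drive, the shell snippet below might help. I tested it on an older Seagate Barracuda drive that is well past its warranty. It may not work with other drive models or manufacturers, but if you have to script something, it should put you on the right path. It will destroy everything on the drive.
You may prefer to run badblocks, an hdparm Secure Erase (SE) (https://wiki.archlinux.org/index.php/Securely_wipe_disk), or some other tool that is actually designed for this. Manufacturers also provide tools such as SeaTools (there is a 32-bit Linux "enterprise" version; search for it).
Before doing this, make sure the drive in question is completely unused and unmounted. Also, I know: a while loop, no excuses. It is a hack, and you can make it better...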
- baddrive=/dev/sdb
- badsect=1
- while true; do
- echo Testing from LBA $badsect
- smartctl -t select,${badsect}-max ${baddrive} > /dev/null 2>&1
- echo "Waiting for test to stop (each dot is 5 sec)"
- while [ "$(smartctl -l selective ${baddrive} | awk '/^ *1/{print substr($4,1,9)}')" != "Completed" ]; do
- echo -n .
- sleep 5
- done
- echo
- badsect=$(smartctl -l selective ${baddrive} | awk '/# 1 Selective offline Completed: read failure/ {print $10}')
- [ "$badsect" = "-" ] && exit 0
- echo Attempting to fix sector $badsect on $baddrive
- hdparm --repair-sector ${badsect} --yes-i-know-what-i-am-doing $baddrive
- echo Continuing test
- done
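One advantage of the self-test method is that the load is handled by the drive firmware, so the PC the drive is attached to is not loaded the way it would be with dd or badblocks.
Note: sorry, I made a mistake. The correct loop condition is: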
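Since the loop is destructive, it may be worth guarding it with a check that nothing on the target drive is mounted. This is only a sketch: `is_mounted` is a hypothetical helper, it does a simple prefix match against the first column of `/proc/mounts`, and it will not notice swap, LVM or md-raid use of the drive.

```shell
# is_mounted DEVICE [MOUNTS_FILE]
# Succeeds if DEVICE, or a partition whose name starts with DEVICE,
# appears in the first column of the mounts table.
# Assumption: a prefix match is good enough (so /dev/sdb also matches /dev/sdb1).
is_mounted() {
    awk -v dev="$1" 'index($1, dev) == 1 { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

baddrive=/dev/sdb
if is_mounted "$baddrive"; then
    echo "$baddrive (or one of its partitions) is mounted; unmount it first" >&2
fi
```

The optional second argument exists only so the check can be tried against a saved copy of the mounts table instead of the live one.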
- while [ "$(smartctl -l selective ${baddrive} | awk '/^ *1/{print $4}')" = "Self_test_in_progress" ]; do
and the script's exit condition becomes:
- [ "$badsect" = "-" ] || [ -z "$badsect" ] && exit 0
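The parsing and the corrected exit condition can be tried without touching a drive by feeding saved log text to small helpers. A minimal sketch follows; the names `first_error_lba` and `should_stop` are made up here, and the line layout is assumed to match the self-test log shown in the question.

```shell
# first_error_lba: read self-test log text on stdin and print the
# LBA_of_first_error of entry "# 1" if that entry ended in a read failure.
# Prints nothing when the newest entry did not fail.
first_error_lba() {
    awk '$1 == "#" && $2 == "1" && /Completed: read failure/ { print $NF; exit }'
}

# should_stop LBA: succeeds when there is nothing left to repair,
# i.e. the extracted value is empty or "-".
should_stop() {
    [ -z "$1" ] || [ "$1" = "-" ]
}

# Example with the failing entry from the question's log:
lba=$(echo '# 1 Extended offline Completed: read failure 90% 20194 48059863' | first_error_lba)
should_stop "$lba" || echo "would repair sector $lba"
```

Grouping both tests inside a function avoids relying on how the shell chains `||` and `&&` on a single line.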