  pool: pod2
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: resilver in progress for 0h6m, 0.05% done, 237h17m to go
config:

    NAME                                                  STATE     READ WRITE CKSUM
    pod2                                                  DEGRADED     0     0 29.3K
      raidz1-0                                            ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_W1F165XG     ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_W1F1660X     ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_W1F1678R     ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_W1F1689F     ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_W1F16AW9     ONLINE       0     0     0
      raidz1-1                                            ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_W1F16C6E     ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_W1F16C9F     ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_W1F16FCD     ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_W1F16JDQ     ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_W1F17M6V     ONLINE       0     0     0
      raidz1-2                                            ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_W1F17MSZ     ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_W1F17MXE     ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_W1F17XKB     ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_W1F17XMW     ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_W1F17ZHY     ONLINE       0     0     0
      raidz1-3                                            ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_W1F18BM4     ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_W1F18BRF     ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_W1F18XLP     ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_Z1F09880     ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_Z1F098BE     ONLINE       0     0     0
      raidz1-4                                            DEGRADED     0     0 58.7K
        disk/by-id/scsi-SATA_ST3000DM001-9YN_Z1F09B0M     ONLINE       0     0     0
        spare-1                                           DEGRADED     0     0     0
          disk/by-id/scsi-SATA_ST3000DM001-9YN_Z1F09BEN   UNAVAIL      0     0     0  cannot open
          disk/by-id/scsi-SATA_ST3000DM001-1CH_W1F49M01   ONLINE       0     0     0  837K resilvered
        disk/by-id/scsi-SATA_ST3000DM001-9YN_Z1F0D6LC     ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_Z1F0CWD1     ONLINE       0     0     0
        spare-4                                           DEGRADED     0     0     0
          disk/by-id/scsi-SATA_ST3000DM001-9YN_Z1F09C8G   UNAVAIL      0     0     0  cannot open
          disk/by-id/scsi-SATA_ST3000DM001-1CH_W1F4A7ZE   ONLINE       0     0     0  830K resilvered
      raidz1-5                                            ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-1CH_Z1F2KNQP     ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_Z1F0BML0     ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_Z1F0BPV4     ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_Z1F0BPZP     ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_Z1F0BQ78     ONLINE       0     0     0
      raidz1-6                                            ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_Z1F0BQ9G     ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_Z1F0BQDF     ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_Z1F0BQFQ     ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_Z1F0CW1A     ONLINE       0     0     0
        disk/by-id/scsi-SATA_ST3000DM001-9YN_Z1F0BV7M     ONLINE       0     0     0
      spares
        disk/by-id/scsi-SATA_ST3000DM001-1CH_W1F49M01     INUSE     currently in use
        disk/by-id/scsi-SATA_ST3000DM001-1CH_W1F4A7ZE     INUSE     currently in use
        disk/by-id/scsi-SATA_ST3000DM001-1CH_W1F49MB1     AVAIL
        disk/by-id/scsi-SATA_ST3000DM001-1ER_Z5001SS2     AVAIL
        disk/by-id/scsi-SATA_ST3000DM001-1ER_Z5001R0F     AVAIL

errors: 37062187 data errors, use '-v' for a list
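With this many devices it is easy to miss the rows that matter. The following is a minimal Python sketch, not part of any ZFS tooling, that picks out config rows whose state is not ONLINE or whose CKSUM counter is non-zero. The function name `unhealthy_devices` and the `SAMPLE` excerpt are illustrative assumptions; the parsing assumes the whitespace-separated NAME/STATE/READ/WRITE/CKSUM layout shown above, which can differ between ZFS versions.

```python
import re

# One row of the "config:" table: name, upper-case state, three counters.
# Counters look like "0" or "29.3K"; rows without counters (for example the
# "spares" entries) deliberately do not match.
COUNTER = r"\d+(?:\.\d+)?[KMGT]?"
ROW = re.compile(
    r"^\s*(?P<name>\S+)\s+(?P<state>[A-Z]+)\s+"
    rf"(?P<read>{COUNTER})\s+(?P<write>{COUNTER})\s+(?P<cksum>{COUNTER})(?:\s|$)"
)


def unhealthy_devices(status_text):
    """Return (name, state, cksum) for rows that are not ONLINE or have CKSUM != 0."""
    found = []
    for line in status_text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # headers, prose lines, and spare listings are skipped
        if m["state"] != "ONLINE" or m["cksum"] != "0":
            found.append((m["name"], m["state"], m["cksum"]))
    return found


# Excerpt of the status output above, used only as sample input.
SAMPLE = """
    NAME STATE READ WRITE CKSUM
    pod2 DEGRADED 0 0 29.3K
      raidz1-4 DEGRADED 0 0 58.7K
        disk/by-id/scsi-SATA_ST3000DM001-9YN_Z1F09B0M ONLINE 0 0 0
        spare-1 DEGRADED 0 0 0
          disk/by-id/scsi-SATA_ST3000DM001-9YN_Z1F09BEN UNAVAIL 0 0 0 cannot open
          disk/by-id/scsi-SATA_ST3000DM001-1CH_W1F49M01 ONLINE 0 0 0 837K resilvered
      spares
        disk/by-id/scsi-SATA_ST3000DM001-1CH_W1F49M01 INUSE currently in use
"""

for name, state, cksum in unhealthy_devices(SAMPLE):
    print(f"{state:9} cksum={cksum:6} {name}")
```

Run against the full output, this would narrow attention to the pool, raidz1-4, the two spare groups, and the two UNAVAIL disks.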
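When the first disk failed, I replaced it with a hot spare and it began to resilver. Before that resilver completed, a second disk failed, so I replaced the second disk with another hot spare. Since then, the resilver starts, gets about 50% done, and then begins gobbling memory until it has eaten all of it and the OS crashes.
Upgrading the RAM on the server is not a straightforward option at this point, and it is not clear to me that doing so would guarantee a fix. I understand that there will be data loss at this stage, but if I can sacrifice the contents of this one RAIDZ to preserve the rest of the pool, that is a perfectly acceptable outcome. I am in the process of backing up this server to another server, but the memory consumption issue forces a reboot (or a crash) every 48 hours or so. That interrupts my rsync backup, and restarting rsync takes time (it can resume once it works out where it left off, but that takes a very long time).
I think ZFS trying to handle two spare replacement operations at once is the root of the memory consumption issue, so I want to remove one of the hot spares so that ZFS can work on them one at a time. However, when I try to detach one of the spares, I get "cannot detach /dev/disk/by-id/scsi-SATA_ST3000DM001-1CH_W1F49M01: no valid replicas". Perhaps I could use the -f option to force the operation, but it is not clear to me exactly what the result would be, so I wanted to see whether anyone has any input before going forward.
If I can get the system into a stable state where it stays up long enough for the backup to complete, I plan to take it down for an overhaul, but under current conditions it is stuck in a bit of a recovery loop.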
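One way to reduce the manual babysitting is to wrap the transfer in a retry loop. The sketch below is only that, a sketch: the helper name `run_until_success`, the host name, and the paths are hypothetical. It assumes rsync is launched from the backup server, which stays up, since a loop running on the crashing machine would die with it. It does not shorten the time rsync needs to work out where it left off.

```python
import subprocess
import time


def run_until_success(cmd, max_attempts=5, delay_seconds=60.0):
    """Run cmd until it exits with status 0.

    Returns the number of attempts used, or None if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return attempt
        if attempt < max_attempts:
            # Give the source server time to come back after a crash or reboot.
            time.sleep(delay_seconds)
    return None


# Hypothetical invocation (placeholder host and paths, not from the original post):
# run_until_success(
#     ["rsync", "-a", "--partial", "pod2-server:/pod2/", "/backup/pod2/"],
#     max_attempts=100,
#     delay_seconds=600,
# )
```

The `--partial` flag keeps partially transferred files so a large file does not have to start from scratch after each interruption.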
Solution
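You have two failed disks in a RAIDZ-1 setup, and RAIDZ-1 can only tolerate the loss of one disk per vdev. You are very likely looking at some data loss and should be prepared to restore from backup.
As a side note, RAIDZ has proven very flaky in my experience with OpenSolaris/Solaris 11. I would advise against using it in any kind of production workload.
Also, to reinforce what ewwhite said, FUSE is not your best option. I would take this opportunity to migrate to something more stable (perhaps FreeBSD 10).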