I have a problem: a few hours after I mount my partition, it turns read-only. At the moment I run:
- fuser -m -k /dev/sdb1
- umount /dev/sdb1
- fsck -y /dev/sdb1
- mount /dev/sdb1
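For anyone repeating this sequence by hand, here is a minimal sketch of the four steps as a shell function. The names `run`, `recheck_and_remount` and the `DRY_RUN` variable are made up for illustration. The one addition is that it refuses to remount when fsck exits with a status above 1, since fsck(8) documents 0 as "no errors" and 1 as "errors corrected".

```shell
# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Sketch of the four steps above. Stops before mounting if fsck
# reports anything beyond "errors corrected".
recheck_and_remount() {
    dev=$1
    # Kill processes using the filesystem; a non-zero status here is ignored.
    run fuser -m -k "$dev"
    run umount "$dev" || return 1
    run fsck -y "$dev"
    rc=$?
    # fsck(8): 0 = no errors, 1 = errors corrected; higher values need attention.
    if [ "$rc" -gt 1 ]; then
        echo "fsck exited with status $rc; not remounting $dev" >&2
        return "$rc"
    fi
    # Relies on an /etc/fstab entry for the device, like the original command.
    run mount "$dev"
}
```

`DRY_RUN=1 recheck_and_remount /dev/sdb1` only prints the commands, which is a safe way to check the order before running it for real as root.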
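This lets me remount it read/write, but the problem keeps coming back. My question is how to fix it. Does it look like a hardware problem or a software problem?
This is currently on a 64-bit CentOS box.
When I run dmesg, I see this: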
- EXT3-fs error (device sdb1): ext3_lookup: unlinked inode 36127046 in dir #36126721
- Aborting journal on device sdb1.
- __journal_remove_journal_head: freeing b_committed_data
- __journal_remove_journal_head: freeing b_committed_data
- ext3_abort called.
- EXT3-fs error (device sdb1): ext3_journal_start_sb: Detected aborted journal
- Remounting filesystem read-only
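To pull just these lines out of a long kernel log, a small filter like the following may help. It is only a sketch: the function name `fs_error_lines` is made up, and the patterns are taken from the messages shown above rather than from any complete list of ext3 messages.

```shell
# Keep only lines that show an ext2/3/4 error, a journal abort,
# or the read-only remount. Reads kernel log text on stdin.
fs_error_lines() {
    grep -E 'EXT[234]-fs error|Aborting journal|Remounting filesystem read-only'
}
```

Typical use would be `dmesg | fs_error_lines`.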
"smartctl -a /dev/sdb1" returns:
- smartctl version 5.38 [x86_64-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
- Home page is http://smartmontools.sourceforge.net/
- === START OF INFORMATION SECTION ===
- Device Model: ST31500341AS
- Serial Number: 9VS2RH1M
- Firmware Version: CC1H
- User Capacity: 1,500,301,910,016 bytes
- Device is: Not in smartctl database [for details use: -P showall]
- ATA Version is: 8
- ATA Standard is: ATA-8-ACS revision 4
- Local Time is: Fri Aug 13 16:50:33 2010 PDT
- SMART support is: Available - device has SMART capability.
- SMART support is: Enabled
- === START OF READ SMART DATA SECTION ===
- SMART overall-health self-assessment test result: PASSED
- General SMART Values:
- Offline data collection status: (0x82) Offline data collection activity
- was completed without error.
- Auto Offline Data Collection: Enabled.
- Self-test execution status: ( 246) Self-test routine in progress...
- 60% of test remaining.
- Total time to complete Offline
- data collection: ( 609) seconds.
- Offline data collection
- capabilities: (0x7b) SMART execute Offline immediate.
- Auto Offline data collection on/off support.
- Suspend Offline collection upon new
- command.
- Offline surface scan supported.
- Self-test supported.
- Conveyance Self-test supported.
- Selective Self-test supported.
- SMART capabilities: (0x0003) Saves SMART data before entering
- power-saving mode.
- Supports SMART auto save timer.
- Error logging capability: (0x01) Error logging supported.
- General Purpose Logging supported.
- Short self-test routine
- recommended polling time: ( 1) minutes.
- Extended self-test routine
- recommended polling time: ( 255) minutes.
- Conveyance self-test routine
- recommended polling time: ( 2) minutes.
- SCT capabilities: (0x103f) SCT Status supported.
- SCT Feature Control supported.
- SCT Data Table supported.
- SMART Attributes Data Structure revision number: 10
- Vendor Specific SMART Attributes with Thresholds:
- ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
- 1 Raw_Read_Error_Rate 0x000f 113 099 006 Pre-fail Always - 51209038
- 3 Spin_Up_Time 0x0003 100 100 000 Pre-fail Always - 0
- 4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 16
- 5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 6
- 7 Seek_Error_Rate 0x000f 078 060 030 Pre-fail Always - 78095697
- 9 Power_On_Hours 0x0032 094 094 000 Old_age Always - 5685
- 10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
- 12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 16
- 184 Unknown_Attribute 0x0032 100 100 099 Old_age Always - 0
- 187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
- 188 Unknown_Attribute 0x0032 100 099 000 Old_age Always - 4295032833
- 189 High_Fly_Writes 0x003a 093 093 000 Old_age Always - 7
- 190 Airflow_Temperature_Cel 0x0022 068 057 045 Old_age Always - 32 (Lifetime Min/Max 31/35)
- 194 Temperature_Celsius 0x0022 032 043 000 Old_age Always - 32 (0 19 0 0)
- 195 Hardware_ECC_Recovered 0x001a 038 022 000 Old_age Always - 51209038
- 197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
- 198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
- 199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
- 240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 243150983534133
- 241 Unknown_Attribute 0x0000 100 253 000 Old_age Offline - 230593735
- 242 Unknown_Attribute 0x0000 100 253 000 Old_age Offline - 2219959893
- SMART Error Log Version: 1
- No Errors Logged
- SMART Self-test log structure revision number 1
- Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
- # 1 Extended offline Self-test routine in progress 60% 5685 -
- SMART Selective self-test log data structure revision number 1
- SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
- 1 0 0 Not_testing
- 2 0 0 Not_testing
- 3 0 0 Not_testing
- 4 0 0 Not_testing
- 5 0 0 Not_testing
- Selective self-test flags (0x0):
- After scanning selected spans, do NOT read-scan remainder of disk.
- If Selective self-test is pending on power-up, resume after 0 minute delay.
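The attribute table is long, so here is a rough sketch for extracting a few raw values from it. The function name `smart_key_attrs` and the choice of IDs (5, 187, 197, 198, 199) are illustrative assumptions, not an official shortlist. It assumes the ten-column layout shown above, without the leading "- ".

```shell
# Print NAME=RAW_VALUE for a handful of attribute IDs.
# Reads smartctl attribute output on stdin; the ID list is an illustrative choice.
# NF >= 10 skips short rows such as the selective self-test spans.
smart_key_attrs() {
    awk '$1 ~ /^(5|187|197|198|199)$/ && NF >= 10 { print $2 "=" $10 }'
}
```

Typical use would be `smartctl -A /dev/sdb | smart_key_attrs`. On the output above it would show Reallocated_Sector_Ct=6, with the pending, uncorrectable and CRC error counts all at 0.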
I ended up running "e2fsck -fy /dev/sdb1", which gave me this:
- e2fsck 1.39 (29-May-2006)
- /dev/sdb1: recovering journal
- Pass 1: Checking inodes, blocks, and sizes
- Pass 2: Checking directory structure
- Entry 'primary.sqlite' in /cache/yum/updates (36094008) has deleted/unused inode 36094073. Clear? yes
- Entry 'repomd.xml' in /cache/yum/c5-testing (36126721) has deleted/unused inode 36127046. Clear? yes
- Pass 3: Checking directory connectivity
- Pass 4: Checking reference counts
- Unattached inode 20005231
- Connect to /lost+found? yes
- Inode 20005231 ref count is 2, should be 1. Fix? yes
- Inode 36094015 ref count is 1, should be 2. Fix? yes
- Inode 36094017 ref count is 1, should be 2. Fix? yes
- Unattached inode 36094024
- Connect to /lost+found? yes
- Inode 36094024 ref count is 2, should be 1. Fix? yes
- Unattached inode 36094068
- Connect to /lost+found? yes
- Inode 36094068 ref count is 2, should be 1. Fix? yes
- Unattached inode 36094076
- Connect to /lost+found? yes
- Inode 36094076 ref count is 2, should be 1. Fix? yes
- Unattached inode 36110353
- Connect to /lost+found? yes
- Inode 36110353 ref count is 2, should be 1. Fix? yes
- Inode 36110357 ref count is 1, should be 2. Fix? yes
- Unattached inode 36127047
- Connect to /lost+found? yes
- Inode 36127047 ref count is 2, should be 1. Fix? yes
- Unattached inode 73007330
- Connect to /lost+found? yes
- Inode 73007330 ref count is 2, should be 1. Fix? yes
- Unattached inode 73007331
- Connect to /lost+found? yes
- Inode 73007331 ref count is 2, should be 1. Fix? yes
- Pass 5: Checking group summary information
- Block bitmap differences: +161978373 -161984512 +161986577 -(161990662--161990663) +(161990668--161990669) -161992704 +161992715 -161994753 +161994778 -162000900
- Fix? yes
- Free blocks count wrong for group #356 (242, counted=240).
- Fix? yes
- Free blocks count wrong for group #375 (2086, counted=2064).
- Fix? yes
- Free blocks count wrong for group #2203 (3224, counted=3223).
- Fix? yes
- Free blocks count wrong for group #3564 (1, counted=3).
- Fix? yes
- Free blocks count wrong for group #3571 (2820, counted=2824).
- Fix? yes
- Free blocks count wrong (311466471, counted=311466452).
- Fix? yes
- Free inodes count wrong for group #2203 (16060, counted=16059).
- Fix? yes
- Free inodes count wrong (181859841, counted=181859840).
- Fix? yes
- Fix? yes
- /dev/sdb1: ***** FILE SYSTEM WAS MODIFIED *****
- /dev/sdb1: 1296896/183156736 files (1.5% non-contiguous),54817548/366284000 blocks
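The inodes that e2fsck reconnected end up in the filesystem's lost+found directory, named "#" followed by the inode number. If the e2fsck output was saved to a file, a sketch like this lists the names to look for. The function name `reconnected_names` is made up, and it assumes the "Unattached inode N" wording shown above, without the leading "- ".

```shell
# List the lost+found entry names expected for inodes e2fsck reconnected.
# Reads saved e2fsck output on stdin; assumes "Unattached inode N" lines.
reconnected_names() {
    awk '/^Unattached inode [0-9]+/ { print "#" $3 }'
}
```

For example, `reconnected_names < fsck.log` (with fsck.log being wherever the output was saved) would print "#20005231" for the first unattached inode in the run above.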
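I assume that is what was causing the drive to go read-only. I'll keep you posted and let you know whether the problem happens again. (Either way, I am replacing the hard drive.)
Solution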
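The filesystem is probably mounted with the option errors=remount-ro, which, as the name indicates, means the filesystem is set read-only as soon as an error is detected, to avoid further corruption.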
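One way to confirm this is to look at the options field for the device in /etc/fstab. The sketch below assumes the entry is keyed by device name (not LABEL= or UUID=), and the function names `mount_opts` and `has_remount_ro` are made up for illustration.

```shell
# Print the options field (4th column) for a device from fstab-style input on stdin.
mount_opts() {
    awk -v dev="$1" '$1 == dev { print $4 }'
}

# Succeed if errors=remount-ro is among those options.
has_remount_ro() {
    mount_opts "$1" | tr ',' '\n' | grep -qx 'errors=remount-ro'
}
```

Usage would be something like `has_remount_ro /dev/sdb1 < /etc/fstab && echo "errors=remount-ro is set"`. If the option is not listed there, the default stored in the superblock applies, which `tune2fs -l /dev/sdb1` reports as "Errors behavior".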
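The kernel log will have details (/var/log/kern.log on Debian-style systems; on CentOS, kernel messages normally land in /var/log/messages).
What to do next depends on the cause. Here are the most likely ones: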
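- It could be a failing disk. Usually you will see I/O errors reported in the kernel log. smartctl -a /dev/sdb may tell you more. Back up your data and replace the disk as soon as possible.
- It could be a problem with your RAM. Run memtest just to be sure.
- It could be a kernel bug. This is hard for mere mortals to diagnose. Make sure you are running the latest kernel released for your distribution.
- The filesystem may have been corrupted earlier for a reason that no longer applies (for example, a kernel bug that has since been fixed). Running fsck should take care of that, so unfortunately this scenario does not apply to you, since the problem returns after fsck.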
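For the first cause, a quick check is to count I/O error lines in the kernel log. This is a rough sketch: the function name `count_io_errors` is made up, and it matches on the literal text "I/O error", which is how the kernel usually words block-layer failures.

```shell
# Count kernel log lines that mention an I/O error (input on stdin).
# grep -c exits non-zero when the count is 0, so that status is ignored.
count_io_errors() {
    grep -c 'I/O error' || true
}
```

Usage would be `dmesg | count_io_errors`. The dmesg excerpt in the question contains only EXT3-fs and journal messages, so it would give 0 there, though the excerpt may not be the complete log.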