SMART and bad block reallocation fun
I spent last night trying to find out how to handle some bad blocks reported by smartd on the hard disk of an old laptop. The Bad block HOWTO for smartmontools has lots of useful information, but it requires too many calculations to be done by hand. To make the task easier, I’ve written a script that simply does the calculations described in the HOWTO and shows you the result.
This is a sample run:
    # ./lba2fs.py /dev/sda 11475078    # SMART says LBA 11475078 is bad
    lba2fs, by Eduardo Habkost <ehabkost@raisama.net>
    If you want to know what to do with the output of this program, check:
    http://smartmontools.sourceforge.net/badblockhowto.html
    [/dev/sda sector 11475078]
    Press Return to automatically probe, or enter command:
    cmd>
    I think I've found: partition table
    Checking the partition list...
    /dev/sda2 sector 11073453
    [/dev/sda2 sector 11073453]
    Press Return to automatically probe, or enter command:
    cmd>
    I think I've found: LVM volume
    Checking the PE where the block is located
    pe_start: 384 sectors
    PE: 168
    Checking on which LV the PE is located
    Found PE range on map (LV)
    Checking maps of LV...
    /dev/VolGroup00/LogVol00 sector 11073069
    [/dev/VolGroup00/LogVol00 sector 11073069]
    Press Return to automatically probe, or enter command:
    cmd>
    I think I've found: ext3 filesystem
    Checking ext2 fs block...
    Block size: 4096
    Block: 1384133
    debugfs 1.41.4 (27-Jan-2009)
    GOOD: ext2fs block 1384133 at /dev/VolGroup00/LogVol00, not in use
    You can zero the block running the following command, but:
    1) Don't do that if the device is in use (e.g. filesystem mounted)
    2) *You will lose data* that is stored on the block.
    It looks like It is safe to do that on this block, but be careful.
    Are your backups up to date? 8)
    dd if=/dev/zero of=/dev/VolGroup00/LogVol00 bs=4096 count=1 seek=1384133
    I recommend doing a read-test on the block first, to see if an
    I/O error is returned. Use the 'read' command for that.
    [ext2fs block 1384133 at /dev/VolGroup00/LogVol00, not in use]
    Press Return to automatically probe, or enter command:
    cmd>
As you can see above, fortunately in my case the bad block was not in use by the ext3 filesystem, so I could safely write to it to force the drive to reallocate the bad sector. After that, the I/O errors were gone.
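For readers who want to check the numbers by hand, the arithmetic behind the sample run can be sketched roughly as below. This is not the code of lba2fs.py, only a minimal illustration. The function names are made up, the partition start (401625) is inferred from the two sector numbers in the run rather than printed by it, and the PE size of 65536 sectors (32 MiB) and the LV starting at PE 0 are assumptions that happen to be consistent with the output shown.

```python
SECTOR_SIZE = 512  # bytes per sector (assumed)

def disk_to_partition(lba, part_start):
    """Sector offset inside a partition that starts at sector part_start."""
    return lba - part_start

def pv_to_pe(pv_sector, pe_start, pe_size):
    """Return (PE index, sector offset inside that PE) for a sector of an LVM PV."""
    return divmod(pv_sector - pe_start, pe_size)

def pe_to_lv_sector(pe, offset, pe_size, lv_first_pe=0):
    """Sector inside an LV made of one contiguous PE range starting at lv_first_pe."""
    return (pe - lv_first_pe) * pe_size + offset

def sector_to_fs_block(sector, block_size):
    """Filesystem block that contains the given sector."""
    return sector * SECTOR_SIZE // block_size

# Numbers taken from (or inferred from) the sample run
lba = 11475078       # bad LBA reported by SMART
part_start = 401625  # inferred: 11475078 - 11073453
pe_start = 384       # printed by the run
pe_size = 65536      # assumed: 32 MiB extents

part_sector = disk_to_partition(lba, part_start)
pe, offset = pv_to_pe(part_sector, pe_start, pe_size)
lv_sector = pe_to_lv_sector(pe, offset, pe_size)
fs_block = sector_to_fs_block(lv_sector, 4096)

print(part_sector, pe, lv_sector, fs_block)
```

With these inputs the sketch gives the same partition sector, PE number, LV sector and filesystem block as the run. On a real system the partition start and the PE layout have to come from the partition table and the LVM metadata, and an LV with several segments needs a lookup per segment instead of the single-range shortcut used here.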
The lba2fs script only reads from the devices and has no code to write to them, so it should always be safe to run. However, be very careful when using the numbers it calculates to write to a disk sector. Always keep your backups up to date.
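The read-test recommended in the output can also be approximated outside the script. The following is only a sketch of the idea, not what the script's 'read' command actually does: the function name `read_test` is hypothetical, and a read served from the page cache may succeed without touching the disk, so a clean result is not proof that the sector is healthy.

```python
import os

def read_test(device, block, block_size=4096):
    """Try to read one block; return True on success, False on error or short read."""
    fd = os.open(device, os.O_RDONLY)  # read-only: nothing is ever written
    try:
        data = os.pread(fd, block_size, block * block_size)
        return len(data) == block_size
    except OSError:
        # A bad sector typically shows up here as an I/O error (EIO)
        return False
    finally:
        os.close(fd)
```

For the block from the sample run, the call would be something like `read_test("/dev/VolGroup00/LogVol00", 1384133)`, run as root.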
Syndicated 2009-03-22 20:01:20 from Eduardo Habkost / diary » In English