Thursday, September 4, 2014

Fix bad sectors in Linux with hdparm

Kernel messages like these are the beginning of the end for a hard drive:

[4248398.645517] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[4248398.645522] ata2.00: BMDMA stat 0x24
[4248398.645527] ata2.00: failed command: READ DMA EXT
[4248398.645535] ata2.00: cmd 25/00:08:07:24:23/00:00:55:00:00/e0 tag 0 dma 4096 in
[4248398.645536]          res 51/40:00:0d:24:23/40:00:55:00:00/00 Emask 0x9 (media error)
[4248398.645540] ata2.00: status: { DRDY ERR }
[4248398.645543] ata2.00: error: { UNC }
[4248398.784319] ata2.00: configured for UDMA/133
[4248398.784340] sd 1:0:0:0: [sdb] Unhandled sense code
[4248398.784343] sd 1:0:0:0: [sdb]  Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[4248398.784349] sd 1:0:0:0: [sdb]  Sense Key : Medium Error [current] [descriptor]
[4248398.784354] Descriptor sense data with sense descriptors (in hex):
[4248398.784357]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
[4248398.784369]         55 23 24 0d
[4248398.784374] sd 1:0:0:0: [sdb]  Add. Sense: Unrecovered read error - auto reallocate failed
[4248398.784380] sd 1:0:0:0: [sdb] CDB: Read(10): 28 00 55 23 24 07 00 00 08 00
[4248398.784392] end_request: I/O error, dev sdb, sector 1428366349
[4248398.784419] ata2: EH complete
[4249453.881503] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[4249453.881510] ata2.00: failed command: READ SECTOR(S) EXT
[4249453.881519] ata2.00: cmd 24/00:01:0d:24:23/00:00:55:00:00/e0 tag 0 pio 512 in
[4249453.881520]          res 51/40:00:0d:24:23/40:00:55:00:00/00 Emask 0x9 (media error)
[4249453.881524] ata2.00: status: { DRDY ERR }
[4249453.881527] ata2.00: error: { UNC }
[4249454.020324] ata2.00: configured for UDMA/133
[4249454.020350] ata2: EH complete

What does it all mean?  The basic message is that a read failed and the drive couldn't automatically move that data to another, presumably good, sector of the hard drive:
[4248398.784374] sd 1:0:0:0: [sdb]  Add. Sense: Unrecovered read error - auto reallocate failed
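
The hex dumps in the log carry the same sector number as the plain-text line that follows them.  As a rough sketch (the helper name be32_to_dec is made up for this illustration and is not part of any tool), the last four bytes of the sense data, 55 23 24 0d, and bytes 2 through 5 of the Read(10) CDB can be turned from big-endian hex into decimal with plain shell arithmetic:

```shell
# Convert four big-endian hex bytes (e.g. "55 23 24 0d") to a decimal LBA.
# be32_to_dec is a hypothetical helper name used only for this illustration.
be32_to_dec() {
    echo $(( (0x$1 << 24) | (0x$2 << 16) | (0x$3 << 8) | 0x$4 ))
}

# Last four bytes of the sense data: the sector that could not be read.
be32_to_dec 55 23 24 0d    # prints 1428366349

# LBA field of the Read(10) CDB: the first sector of the read request.
be32_to_dec 55 23 24 07    # prints 1428366343
```

The CDB's transfer length field (00 08) says the request covered 8 sectors, which matches the "dma 4096 in" of the first failed command.  So the read started at sector 1428366343 and it was the seventh sector of the request, 1428366349, that came back bad.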

Skip two lines ahead and the kernel tells us the drive and the sector where the error occurred: in this case, /dev/sdb and sector 1428366349.  You can confirm this by running the following hdparm command (as root or with sudo):
root@tv:/home/khanh# hdparm --read-sector 1428366349 /dev/sdb

The output should look similar to the following, confirming that our sector has a read error:
/dev/sdb:
reading sector 1428366349: FAILED: Input/output error
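
If the log is long, or you suspect more than one sector has gone bad, picking the numbers out by eye gets old quickly.  Here is a minimal sketch of a filter, assuming the messages contain the "I/O error, dev sdb, sector N" text shown above (the prefix before "I/O error" differs between kernel versions, so only that part is matched).  The function name bad_sectors is my own invention:

```shell
# Print the unique sector numbers that failed on one device, reading kernel
# log text from stdin. bad_sectors is a hypothetical helper name.
# Usage: dmesg | bad_sectors sdb
bad_sectors() {
    sed -n "s|.*I/O error, dev $1, sector \([0-9][0-9]*\).*|\1|p" | sort -n -u
}
```

Each number it prints can then be checked one at a time with hdparm --read-sector, exactly as above.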

Most of the time, we're able to clear the error by overwriting the sector with zeros.  ***WARNING*** DOING THIS DESTROYS WHATEVER WAS STORED IN THE SECTOR AND COULD/WILL IRREPARABLY DAMAGE THE FILE IT BELONGS TO!!!
Of course, in my case, this drive is used as a DVR (hence the tv hostname) and just has a bunch of MPEG2 files for my TV recordings.  Zeroing a single 512-byte sector somewhere in one of those files doesn't ruin it beyond use.  So, we're going to write the zeros and remind hdparm that we know what we're doing.
root@tv:/home/khanh# hdparm --yes-i-know-what-i-am-doing --write-sector 1428366349 /dev/sdb
/dev/sdb:
re-writing sector 1428366349: succeeded
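
Since a typo in the sector number would wipe good data, it may be worth wrapping the two steps so the write only happens when the read really fails.  The following is just a sketch under a couple of assumptions: the function name zero_if_unreadable is hypothetical, and it decides based on the words "FAILED" and "succeeded" in hdparm's output as shown in this post rather than on hdparm's exit status.

```shell
# Zero a sector only if reading it currently fails.
# zero_if_unreadable is a hypothetical wrapper, not an hdparm feature.
# Usage (as root): zero_if_unreadable /dev/sdb 1428366349
zero_if_unreadable() {
    dev=$1
    sector=$2

    # Refuse anything that is not a plain decimal sector number.
    case $sector in
        ''|*[!0-9]*) echo "sector must be a decimal number" >&2; return 2 ;;
    esac

    # Capture the read attempt; the exit status is ignored on purpose
    # because the decision below is made from the printed text.
    out=$(hdparm --read-sector "$sector" "$dev" 2>&1) || true

    case $out in
        *FAILED*)
            # The read failed, so overwrite the sector with zeros.
            hdparm --yes-i-know-what-i-am-doing --write-sector "$sector" "$dev"
            ;;
        *succeeded*)
            echo "sector $sector on $dev reads fine, leaving it alone"
            ;;
        *)
            echo "unrecognised hdparm output, not writing" >&2
            return 1
            ;;
    esac
}
```

The same warning applies: whatever was in that sector is gone once the write succeeds.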

The device sdb reports success in writing to the sector, and now we should be able to read a nice clean run of zeros from the sector with hdparm:
root@tv:/home/khanh# hdparm --read-sector 1428366349 /dev/sdb
/dev/sdb:
reading sector 1428366349: succeeded
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
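
Eyeballing a wall of zeros is fine for one sector, but if you are checking several it can be handy to let the shell do it.  This is only a sketch, assuming the output layout shown in this post: header lines contain a colon and the data lines contain nothing but hex words.  The function name is_all_zero is hypothetical:

```shell
# Succeed (exit 0) if hdparm --read-sector output on stdin contains at least
# one data line and every hex word in the data is zero.
# is_all_zero is a hypothetical helper name.
# Usage: hdparm --read-sector 1428366349 /dev/sdb | is_all_zero && echo clean
is_all_zero() {
    awk '
        # Data lines: non-blank and without a colon (headers contain one).
        !/:/ && NF {
            n++
            if (/[1-9a-fA-F]/) bad = 1   # any non-zero hex digit
        }
        END { exit (bad || n == 0) }
    '
}
```

A failed read produces no data lines at all, so it is reported as "not all zero" rather than slipping through.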
