LVM volumes have corrupt ext3 fs after replacing a failing drive in an underlying software raid 5 array


by Dustin Minnich

Hi all. 

We have a server running CentOS 4.7 with kernel 2.6.9-78.0.5.ELsmp.

It has six 300 GB SAS drives that are configured as follows:
  • /boot and / are in software raid 1 arrays with ext3 on them. 
  • A small software raid 0 array for speed with ext3 on it.
  • Swap is in a raid 0 array.  This is a leftover from a previous admin and it will be changed eventually. 
  • And a large software raid 5 array which houses LVM volumes that are formatted with ext3. 

We got some errors in the messages log sometime last week about "bad segments" on one of the drives. Running smartctl -t long /dev/sda confirmed that the drive wasn't happy.  We decided to replace it before things got worse.


Here is the procedure we followed to replace the drive:

1) back up the failing disk's partition table:
sfdisk -d /dev/sda > /etc/partitions.sda

2) fail and remove the disk from all of its arrays:
mdadm /dev/md0 --fail /dev/sda1 --remove /dev/sda1
...

3) turn off swap, since it's on a raid 0 array, and shut down the swap array:
swapoff -a
mdadm -S /dev/md4

4) tell the OS to kill the drive for hot swap and pull the bad drive:
echo "scsi remove-single-device 0 0 0 0" > /proc/scsi/scsi

5) insert new drive and tell the OS to find it:
echo "scsi add-single-device 0 0 0 0" > /proc/scsi/scsi

6) create the partition table on the new disk:
sfdisk /dev/sda < /etc/partitions.sda

7) add the drive back to all of its arrays:
mdadm /dev/md0 --add /dev/sda1
...

8) re-create the swap array, format it, turn swap back on:
mdadm --create --verbose /dev/md4 --level=0 --raid-devices=2 /dev/sda1 /dev/sdb1
mkswap /dev/md4
swapon -a
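
One way to double-check the state of the arrays after a procedure like the one above is to scan /proc/mdstat for arrays that are still degraded or still rebuilding. What follows is only a minimal sketch, not part of the procedure we actually ran. The function name md_not_clean is made up for illustration, and it assumes the usual mdstat layout, where a missing member shows up as "_" in the [UU_] status map.

```shell
# Print the name of every md array that either has a missing member
# ("_" in the [UU_] status map) or is still running a resync/recovery.
# $1: path to an mdstat-format file (defaults to /proc/mdstat)
md_not_clean() {
    awk '
        /^md[0-9]+ :/               { name = $1 }        # remember current array
        /\[[U_]*_[U_]*\]/           { bad[name] = 1 }    # degraded status map
        /(recovery|resync) *=/      { bad[name] = 1 }    # rebuild in progress
        END { for (n in bad) print n }
    ' "${1:-/proc/mdstat}" | sort
}

# Example: run with no argument to read /proc/mdstat on the live system:
#   md_not_clean
```

An empty result only means that md itself considers every array complete and in sync. It says nothing about whether the data on them is consistent.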



Everything seemed fine.  All arrays synced, data was accessible, so we went home and went to bed.

Sadly, things were not fine a few hours later.  One of our LVM volumes remounted itself read-only, and our attempt to remount it read-write hard-locked the machine.  Rebooting the machine gave us nothing but fsck complaining about tons of inodes on all but two of the LVM volumes.  We ended up having to wipe the raid 5 array clean and restore from backups.  Here are the errors from when the kernel remounted the LVM volumes read-only.  Fsck complained in a very similar manner on the reboot.

Dec 13 02:37:40 farrell kernel: EXT3-fs error (device dm-0): ext3_readdir: bad entry in directory #16335409: rec_len % 4 != 0 - offset=0, inode=3325888793, rec_len=36875, name_len=128
Dec 13 02:37:40 farrell kernel: Aborting journal on device dm-0.
Dec 13 02:37:42 farrell kernel: ext3_abort called.
Dec 13 02:37:42 farrell kernel: EXT3-fs error (device dm-0): ext3_journal_start_sb: Detected aborted journal
Dec 13 02:37:42 farrell kernel: Remounting filesystem read-only
Dec 13 02:45:08 farrell kernel: EXT3-fs error (device dm-0): ext3_readdir: bad entry in directory #10028449: rec_len % 4 != 0 - offset=0, inode=3453369297, rec_len=43883, name_len=176
Dec 13 02:45:57 farrell kernel: EXT3-fs error (device dm-0): ext3_readdir: bad entry in directory #17728222: rec_len % 4 != 0 - offset=0, inode=4248137906, rec_len=41903, name_len=120
Dec 13 02:48:01 farrell kernel: EXT3-fs error (device dm-0): ext3_readdir: bad entry in directory #37765142: rec_len % 4 != 0 - offset=0, inode=189228823, rec_len=2857, name_len=49
Dec 13 02:50:17 farrell kernel: EXT3-fs error (device dm-0): ext3_readdir: bad entry in directory #10027806: rec_len % 4 != 0 - offset=0, inode=2494885026, rec_len=6898, name_len=164
Dec 13 02:51:37 farrell kernel: EXT3-fs error (device dm-0): ext3_readdir: bad entry in directory #17089522: rec_len % 4 != 0 - offset=0, inode=3224714177, rec_len=24635, name_len=56
Dec 13 03:22:44 farrell kernel: EXT3-fs error (device dm-0): ext3_readdir: bad entry in directory #31687011: rec_len % 4 != 0 - offset=0, inode=934687742, rec_len=46451, name_len=57
Dec 13 03:37:43 farrell kernel: EXT3-fs error (device dm-3): ext3_readdir: bad entry in directory #1033375: directory entry across blocks - offset=0, inode=2770636924, rec_len=62712, name_len=4
Dec 13 03:37:43 farrell kernel: Aborting journal on device dm-3.
Dec 13 03:37:43 farrell kernel: ext3_abort called.
Dec 13 03:37:43 farrell kernel: EXT3-fs error (device dm-3): ext3_journal_start_sb: Detected aborted journal
Dec 13 03:37:43 farrell kernel: Remounting filesystem read-only
Dec 13 03:52:21 farrell kernel: EXT3-fs error (device dm-7): ext3_readdir: bad entry in directory #1442857: rec_len % 4 != 0 - offset=0, inode=732481902, rec_len=52555, name_len=181
Dec 13 03:52:21 farrell kernel: Aborting journal on device dm-7.
Dec 13 03:52:21 farrell kernel: ext3_abort called.
Dec 13 03:52:21 farrell kernel: EXT3-fs error (device dm-7): ext3_journal_start_sb: Detected aborted journal
Dec 13 03:52:21 farrell kernel: EXT3-fs error (device dm-7): ext3_readdir: bad entry in directory #1442857: rec_len % 4 != 0 - offset=0, inode=732481902, rec_len=52555, name_len=181
Dec 13 03:52:21 farrell kernel: Aborting journal on device dm-7.
Dec 13 03:52:21 farrell kernel: ext3_abort called.
Dec 13 03:52:21 farrell kernel: EXT3-fs error (device dm-7): ext3_journal_start_sb: Detected aborted journal
Dec 13 03:52:21 farrell kernel: Remounting filesystem read-only
Dec 13 03:53:08 farrell kernel: EXT3-fs error (device dm-7): ext3_readdir: bad entry in directory #1606426: rec_len % 4 != 0 - offset=0, inode=3306786808, rec_len=1430, name_len=20
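
To make a log like the one above easier to line up against other events (cron jobs, backups), the errors can be boiled down to one line per device: how many "EXT3-fs error" lines it logged and when the first one appeared. This is a rough sketch that assumes the syslog line layout shown above. The function name ext3_error_summary is hypothetical.

```shell
# Summarise "EXT3-fs error" lines per device.
# Output: one line per device: <device> <error count> <timestamp of first error>
# $1: path to a syslog-format file such as /var/log/messages
ext3_error_summary() {
    awk '
        /EXT3-fs error \(device [^)]+\)/ {
            dev = $0
            sub(/.*EXT3-fs error \(device /, "", dev)   # strip everything before the name
            sub(/\).*/, "", dev)                        # strip everything after it
            if (!(dev in count)) first[dev] = $1 " " $2 " " $3
            count[dev]++
        }
        END { for (d in count) print d, count[d], first[d] }
    ' "$1" | sort
}

# Example:
#   ext3_error_summary /var/log/messages
```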


Does anybody else run a similar setup, and if so, what do your disk replacement procedures look like?  Or has anybody ever run into similar errors for any other reason (people on Google mention a possible kernel bug)?
Basically, I'm just trying to figure out whether something I did (or didn't do) caused the filesystems on the LVM volumes to become corrupt, and if not, what did.



-- 
Dustin Minnich
Nicholas IT
613-8148

_______________________________________________
Dulug mailing list
Dulug@...
https://lists.dulug.duke.edu/mailman/listinfo/dulug

Re: LVM volumes have corrupt ext3 fs after replacing a failing drive in an underlying software raid 5 array

by seth vidal-3



On Tue, 16 Dec 2008, Dustin Minnich wrote:

> Everything seemed fine.  All arrays synced, data was accessible, so we went home and went to bed.
Did anything happen right before the new errors occurred? Specifically,
did your backups start, or anything else that would start exercising the
disks, or more likely the cables/backplane?

Anything in the logs before the journal errors started?

-sv

Re: LVM volumes have corrupt ext3 fs after replacing a failing drive in an underlying software raid 5 array

by Dustin Minnich

Actually, yeah, rdiff should have started running at 2:30 and the first
error was at 2:37.

I think the arrays finished rebuilding around 11 or 12, though.
How do you think the two incidents could be related?  Was the array not
actually rebuilt successfully, with the backup job being the first to
notice when it couldn't find a specific file?  Or do you think the
constant heavy load had something to do with it?

Dustin Minnich
Nicholas IT
613-8148



Re: LVM volumes have corrupt ext3 fs after replacing a failing drive in an underlying software raid 5 array

by seth vidal-3



On Tue, 16 Dec 2008, Dustin Minnich wrote:

> Actually, yeah, rdiff should have started running at 2:30 and the first error
> was at 2:37.
>
> I think the arrays finished rebuilding around 11 or 12 though.
> How do you think the two incidents could be related?  Array wasn't actually
> rebuilt successfully and the backup job was the first to notice that when it
> couldn't find a specific file?  Or do you think the constant heavy load had
> something to do with it?
>

The array rebuild (unless you increased the resync rate) won't necessarily
hit the array very hard.
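
For reference, the resync rate on Linux md is normally bounded by two tunables, speed_limit_min and speed_limit_max, which usually live under /proc/sys/dev/raid. A small sketch for printing them, assuming your kernel exposes them at that path (the function name md_speed_limits is invented here):

```shell
# Print the md resync speed limits found in a directory.
# $1: directory holding speed_limit_min and speed_limit_max
#     (defaults to /proc/sys/dev/raid)
md_speed_limits() {
    dir="${1:-/proc/sys/dev/raid}"
    for f in speed_limit_min speed_limit_max; do
        # Each file holds a single number
        printf '%s=%s\n' "$f" "$(cat "$dir/$f")"
    done
}

# Example: run with no argument to read the live values:
#   md_speed_limits
```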

However, if you have a less-than-great backplane or adapter it could be
overheating or being overtaxed when in heavy use and simply losing its
mind.

Test it if you can: run a fast, hard bonnie++ or tiobench test on it and
see if it goes bonkers. If it does, call your hw rep and get them to swap
the backplane and any/all cables.
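
A rough way to tell whether such a test made things go bonkers is to count the suspicious kernel log lines before and after the run. This is only a sketch: the function name count_storage_errors, the match patterns, and the paths and sizes in the usage comment are all assumptions to adapt to your own box.

```shell
# Count kernel log lines that suggest storage trouble.
# $1: path to a syslog-format file such as /var/log/messages
count_storage_errors() {
    # n+0 forces a numeric 0 when nothing matched
    awk '/EXT3-fs error|I\/O error|SCSI error/ { n++ } END { print n + 0 }' "$1"
}

# Intended use around a load test (scratch directory and size are placeholders;
# the bonnie++ file size is in MB and should comfortably exceed RAM):
#   before=$(count_storage_errors /var/log/messages)
#   bonnie++ -d /mnt/scratch -s 16384 -u nobody
#   after=$(count_storage_errors /var/log/messages)
#   echo "new error lines: $((after - before))"
```

If the count climbs during the run on a freshly made filesystem, that points at the hardware path rather than at the replacement procedure.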

-sv
