RAID1 file corruption(?)

RAID1 file corruption(?)

by steves-6

Hello,

I've configured a software RAID 1 on two SATA drives, using a single
partition on each drive to create the array.  When transferring large
files (100 to 1500 MB) to the mount point using scp, the transfer almost
always results in a corrupt file, by which I mean that the md5sum of the
file doesn't match the source.
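
The comparison described above can be sketched in Python as follows. This is
a minimal sketch, assuming Python 3 is available on both machines; the names
`file_md5` and `matches` are made up for illustration and are not part of any
tool mentioned in this thread. It reads the file in chunks so that a 1500 MB
file does not have to fit in memory.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """True if the file's MD5 equals the digest recorded on the source side."""
    return file_md5(path) == expected_hex.strip().lower()
```

Running `file_md5` on the source and passing the result to `matches` on the
target gives the same answer as comparing two `md5sum` outputs by eye.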

I'm able to rm the file and transfer it again and it then seems to work
fine.

I've run fsck.ext3 on the md device and the raid itself shows clean.
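
A "clean" array state does not by itself show that both mirrors hold
identical data. The md driver exposes a counter for that in sysfs,
`mismatch_cnt`, which is updated after a "check" pass has been requested
through the array's `sync_action` file. The sketch below only reads that
counter; it assumes the array is `md1` as in the output that follows, and the
function name `read_mismatch_count` is hypothetical. On RAID1 a nonzero value
is not always a sign of trouble, so treat the number as a hint.

```python
from pathlib import Path


def read_mismatch_count(md_name="md1", sysfs_root="/sys/block"):
    """Return the mismatch counter md recorded during its last check pass.

    `sysfs_root` is a parameter only so the function can be tried against
    a scratch directory; on a real system the default should be used.
    """
    path = Path(sysfs_root) / md_name / "md" / "mismatch_cnt"
    return int(path.read_text().strip())
```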

This machine is just being built and tested so I can delete files, kill
the raid, etc as needed to troubleshoot the problem.

Any ideas?


/dev/md1:
        Version : 00.90
  Creation Time : Fri Jun 26 13:00:21 2009
     Raid Level : raid1
     Array Size : 488383936 (465.76 GiB 500.11 GB)
  Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Wed Jul  8 06:25:34 2009
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : ad2cec31:2053319c:eb15f7e1:187cf839 (local to host ord)
         Events : 0.26

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1


--
To UNSUBSCRIBE, email to debian-user-REQUEST@...
with a subject of "unsubscribe". Trouble? Contact listmaster@...


Re: RAID1 file corruption(?)

by Georgi Naplatanov

Hello Steve,

Years ago I had a similar problem. Check the hard disks on the target
machine. Maybe one of them is failing.

Best regards
Georgi




Re: RAID1 file corruption(?)

by lee-25

On Wed, Jul 08, 2009 at 08:28:16AM -0500, Steve wrote:

> When transferring large files (100 to 1500mb) to the mount point
> using scp the transfer almost always results in a corrupt file, by
> which I mean that the md5sum of the file doesn't match the source.

Is it possible that scp is the problem?

You could try to disable the RAID and use a single disk as the target
instead. If you still have corrupted files, try the other disk. If
still corrupted, I would strongly suspect scp. In that case, you could
put a source file onto one disk and copy it to the other with cp and
check; maybe you can do the cp test with a RAID device as target.
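
The local copy test can be sketched as below. It is a rough sketch under
stated assumptions: the function names `differing_blocks` and
`copy_and_check` are invented for illustration, and the 4096-byte block size
is an arbitrary choice. Reporting the offsets of the damaged blocks, rather
than just a checksum mismatch, may help tell scattered small errors from
whole damaged regions. Be aware that a file read back right after copying may
be served from the page cache rather than from the disk.

```python
import shutil


def differing_blocks(path_a, path_b, block_size=4096):
    """Yield the byte offset of every block in which the two files differ."""
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        offset = 0
        while True:
            block_a = a.read(block_size)
            block_b = b.read(block_size)
            if not block_a and not block_b:
                break  # both files exhausted
            if block_a != block_b:
                yield offset  # also covers one file being shorter
            offset += block_size


def copy_and_check(src, dst, block_size=4096):
    """Copy src to dst locally (no network, no scp) and list damaged blocks."""
    shutil.copyfile(src, dst)
    return list(differing_blocks(src, dst, block_size))
```

An empty list from `copy_and_check` means the copy read back identical to
the source.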

If you have a Realtek network card/chip involved somewhere, first get
a decent network card to replace it with (replace all of them if there
are several). If you don't want to replace them, try to copy your
files over NFS and see what happens ...


I'd really like to know if there's a problem like that with
RAID-1. I've been using it for about three years and it has always
worked fine, though I never did a test like that.




Re: RAID1 file corruption(?)

by Andreas Juch-8

On Wed, 8 Jul 2009 21:38:06 -0600,
lee <lee@...> wrote:

> If you have a Realtek network card/chip involved somewhere, first get
> a decent network card to replace it with (replace all of them if there
> are several). If you don't want to replace them, try to copy your
> files over NFS and see what happens ...

I think SSH should guarantee the integrity of the transferred files, so
the Realtek cards shouldn't be a problem. But replacing Realtek NICs
seems to be a good idea in general; I had one that delivered only ~25%
of an Intel e100's throughput.
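
For reference, that integrity guarantee comes from the message
authentication code (MAC) that the SSH transport layer attaches to each
packet, computed over a sequence number and the packet contents. The sketch
below illustrates the idea only, using HMAC-SHA-256 from the Python standard
library; the key, the function names `packet_tag` and `packet_ok`, and the
framing are illustrative assumptions, not the SSH wire format.

```python
import hashlib
import hmac


def packet_tag(key, seq, payload):
    """Compute a MAC over a 32-bit sequence number followed by the payload."""
    message = seq.to_bytes(4, "big") + payload
    return hmac.new(key, message, hashlib.sha256).digest()


def packet_ok(key, seq, payload, received_tag):
    """Recompute the MAC on the receiving side and compare in constant time."""
    return hmac.compare_digest(packet_tag(key, seq, payload), received_tag)
```

A check like this covers data in transit. Once the receiving side has
written the bytes out, corruption in RAM, the controller, or the disk would
not be caught by it.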

Andreas



Re: RAID1 file corruption(?)

by steves-6

On Wed, Jul 08, 2009 at 09:38:06PM -0600, lee wrote:
> On Wed, Jul 08, 2009 at 08:28:16AM -0500, Steve wrote:
>
> > When transferring large files (100 to 1500mb) to the mount point
> > using scp the transfer almost always results in a corrupt file, by
> > which I mean that the md5sum of the file doesn't match the source.
>
> Is it possible that scp is the problem?

It's possible.  But it seems odd to me that TCP wouldn't catch
packet-level errors and retransmit before the data ever reached the
application layer (scp).  I'm able to transfer files to a non-RAID system
without corruption; however, this system has only RAID1 drives in it.  I
suppose I could break the RAID and partition one of the drives
separately to eliminate that variable.
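
On the TCP point: the TCP checksum is a 16-bit ones' complement sum, which
catches many errors but not all of them. The sketch below implements only the
summing algorithm (as described in RFC 1071), not a full TCP segment checksum
with its pseudo-header; the function name `internet_checksum` is chosen for
illustration. Swapping two 16-bit words, for example, leaves the sum
unchanged, while a single flipped bit changes it.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum over `data` (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


original = b"\x12\x34\x56\x78"
swapped = b"\x56\x78\x12\x34"   # same words, different order
bit_flip = b"\x12\x35\x56\x78"  # one bit changed

same_sum = internet_checksum(original) == internet_checksum(swapped)
flip_detected = internet_checksum(original) != internet_checksum(bit_flip)
```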

I'm hoping the problem will be solved with two new hard drives arriving
tomorrow.  This machine isn't in production yet so I'll swap out the
hard drives and try again.

Steve




Re: RAID1 file corruption(?)

by lee-25

On Thu, Jul 09, 2009 at 06:05:16PM -0500, Steve wrote:

> I'm hoping the problem will be solved with two new hard drives arriving
> tomorrow.  This machine isn't in production yet so I'll swap out the
> hard drives and try again.

How did it turn out?




Re: RAID1 file corruption(?)

by steves-6


So far so good with the Samsung F1 drives.  Admittedly, though, I haven't
transferred files as large as before with them yet.

Steve

On Sat, Jul 11, 2009 at 10:50:24AM -0600, lee wrote:
> On Thu, Jul 09, 2009 at 06:05:16PM -0500, Steve wrote:
>
> > I'm hoping the problem will be solved with two new hard drives arriving
> > tomorrow.  This machine isn't in production yet so I'll swap out the
> > hard drives and try again.
>
> How did it turn out?

