Re: 3w-xxxx / 3ware 8006-2LP corruption issues using Xen kernel

by Bas Verhoeven-3

Holm Kapschitzki wrote:
> Bas Verhoeven wrote:
Hi Holm,

>
> look at my earlier post, I describe the same problem.
>
> http://www.nabble.com/dom0---tar:-Skipping-to-next-header-td16558409.html
>
> So I get the error with etch 32 bit / 64 bit, xen 3.1 / 3.2, with
> 2.6.18 kernel xen, with gentoo kernel 2.6.20r6 xen. It wasn't all the
> time. But I have to reboot the machine to get it solved for a while.
> So I tested it with ca. 5 machines, set up in different ways with other
> kernels.
In a way I'm happy I'm not the only one experiencing this problem. Are
you using the exact same controller as I am?

I did experience some issues when I removed most of the memory, leaving
the system with 1GB. At that point, running my script caused several
errors, ending with the partition being remounted read-only:

    PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:02:01.0
    3w-xxxx: tw_map_scsi_sg_data(): pci_map_sg() failed.
    PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:02:01.0
    3w-xxxx: tw_map_scsi_sg_data(): pci_map_sg() failed.
    ...
    sd 0:0:0:0: SCSI error: return code = 0x00070000
    end_request: I/O error, dev sda, sector 3068774
    Buffer I/O error on device dm-0, logical block 321289
    lost page write due to I/O error on dm-0
    Buffer I/O error on device dm-0, logical block 321290
    lost page write due to I/O error on dm-0
    ...
    end_request: I/O error, dev sda, sector 11794406
    Aborting journal on device dm-0.
    ext3_abort called.
    EXT3-fs error (device dm-0): ext3_journal_start_sb: Detected aborted journal
    Remounting filesystem read-only
    __journal_remove_journal_head: freeing b_frozen_data
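
When a log contains many of these messages, it can help to tally them per device. The following is only a minimal sketch: the regular expression is derived from the two SW-IOMMU lines quoted above, and the function name `summarize_swiommu_failures` is made up for illustration.

```python
import re
from collections import Counter

# Pattern derived from the kernel messages quoted above; other kernel
# versions may word the message differently.
SWIOMMU_RE = re.compile(
    r"PCI-DMA: Out of SW-IOMMU space for (\d+) bytes at device (\S+)"
)


def summarize_swiommu_failures(log_text):
    """Return (failures per device, largest failed request per device)."""
    counts = Counter()
    largest = {}
    for line in log_text.splitlines():
        match = SWIOMMU_RE.search(line)
        if match is None:
            continue
        size, device = int(match.group(1)), match.group(2)
        counts[device] += 1
        largest[device] = max(largest.get(device, 0), size)
    return dict(counts), largest


if __name__ == "__main__":
    sample = (
        "PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:02:01.0\n"
        "3w-xxxx: tw_map_scsi_sg_data(): pci_map_sg() failed.\n"
        "PCI-DMA: Out of SW-IOMMU space for 65536 bytes at device 0000:02:01.0\n"
    )
    print(summarize_swiommu_failures(sample))
```

Feeding it saved `dmesg` output shows which device runs out of bounce-buffer space and how large the failing requests are.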


This problem seems to be unrelated though, and some googling pointed me
to the 'swiotlb' kernel parameter
(https://bugzilla.novell.com/show_bug.cgi?id=299641), which I set to 32M;
it seems to run OK for now.
Data is still being written to disk corrupted though.
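
The test script itself is not shown in this post. For anyone who wants to check for the same kind of corruption, here is a minimal sketch of a write-and-verify test. The function names, file names and sizes are all assumptions chosen for illustration, not the script used here.

```python
import hashlib
import os
import sys
import tempfile


def write_test_files(directory, count=4, size=1 << 20):
    """Write `count` files of random data; return {path: expected SHA-1}."""
    expected = {}
    for i in range(count):
        data = os.urandom(size)
        path = os.path.join(directory, "corruption-test-%03d.bin" % i)
        with open(path, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # ask the kernel to push data to the device
        expected[path] = hashlib.sha1(data).hexdigest()
    return expected


def find_corrupted(expected):
    """Re-read every file and return the paths whose SHA-1 no longer matches."""
    bad = []
    for path, digest in sorted(expected.items()):
        with open(path, "rb") as handle:
            actual = hashlib.sha1(handle.read()).hexdigest()
        if actual != digest:
            bad.append(path)
    return bad


if __name__ == "__main__":
    # Pass a directory on the affected array; defaults to a temporary directory.
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.mkdtemp()
    checksums = write_test_files(target)
    print("corrupted files:", find_corrupted(checksums))
```

Reading the files back straight away may be served from the page cache instead of the disk, so a real test should save the checksums and run the verify step after a reboot or after dropping caches.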
>
> I think it could be a kernel compile parameter? or via chipset in
> relation to 3ware raid controller?
Well, I highly doubt it's something in the hardware itself. That would
not explain why everything works fine under a non-Xen kernel.
All kernels I tried have the 3ware driver loaded as a module. The
drivers under both kernels appear to be the same:

    p-dom0:/usr/src/xen-3.2.0/linux-2.6.18-xen.hg# sha1sum drivers/scsi/3w-x*
    d9da8960f6e98b783b4893cde51a303d97ce98d8  drivers/scsi/3w-xxxx.c
    2610261f86b4eb05a5d08c1f90f09410f1eb7c98  drivers/scsi/3w-xxxx.h

    p-dom0:/usr/src/linux-2.6.18.8# sha1sum drivers/scsi/3w-x*
    d9da8960f6e98b783b4893cde51a303d97ce98d8  drivers/scsi/3w-xxxx.c
    2610261f86b4eb05a5d08c1f90f09410f1eb7c98  drivers/scsi/3w-xxxx.h
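
The same comparison can be scripted when more than a couple of files need checking. This is a rough sketch only: the tree paths and file names are the ones from the shell output above, while the helper names `sha1_of_file` and `differing_files` are invented for the example.

```python
import hashlib
import os


def sha1_of_file(path, chunk_size=65536):
    """Return the SHA-1 hex digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def differing_files(tree_a, tree_b, relative_paths):
    """Return the relative paths whose contents differ between two trees."""
    return [
        rel
        for rel in relative_paths
        if sha1_of_file(os.path.join(tree_a, rel))
        != sha1_of_file(os.path.join(tree_b, rel))
    ]


if __name__ == "__main__":
    # Locations as they appear in the shell session above.
    xen_tree = "/usr/src/xen-3.2.0/linux-2.6.18-xen.hg"
    vanilla_tree = "/usr/src/linux-2.6.18.8"
    driver_files = ["drivers/scsi/3w-xxxx.c", "drivers/scsi/3w-xxxx.h"]
    if os.path.isdir(xen_tree) and os.path.isdir(vanilla_tree):
        print(differing_files(xen_tree, vanilla_tree, driver_files))
```

An empty list only shows that the driver source is identical. The DMA mapping code the driver calls (the `pci_map_sg()` seen failing in the log above) lives outside these two files.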

So whatever is breaking things must be something in the Xen code? I'm
going to build a kernel with the 3w-xxxx driver compiled in, but I
doubt that will help.

Is there even anyone that uses the same controller and has no problems
at all?

Cheers,

Bas Verhoeven

>
> Greets Holm
>
>
> _______________________________________________
> Xen-users mailing list
> Xen-users@...
> http://lists.xensource.com/xen-users

