Card/cross map alignment broken?

View: New views
5 Messages — Rating Filter:   Alert me  

Card/cross map alignment broken?

by Wesley W. Terpstra :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

I've noticed that the alpha port sometimes 'hangs'. The 'hang' is caused by a sudden burst of unaligned memory accesses which bring the kernel to its knees. I suppose if I waited long enough it would work, so it isn't really a hang. Obviously, I already use -align 8.

Due to the nature of the hang I have a strong suspicion that the GC is responsible. Things run smoothly until suddenly the program 'hangs', but changing GC parameters will move the location of the 'hang'. Looking through the GC code I strongly suspect that the crossMap is ending up unaligned. Placing it after the cardMap (a 1-byte aligned array) seems quite risky, but I lack the expertise in this area of the code to be certain this is a problem.

Can someone familiar with this code please check that
  s->generationalMaps.crossMap =
    (GC_crossMap) (s->heap.start + s->heap.size + cardMapSize);
indeed must be >8 bytes aligned?

Thank you!


_______________________________________________
MLton mailing list
MLton@...
http://mlton.org/mailman/listinfo/mlton

Re: Card/cross map alignment broken?

by Matthew Fluet-5 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Tue, 13 Oct 2009, Wesley W. Terpstra wrote:

> Due to the nature of the hang I have a strong suspicion that the GC is
> responsible. Things run smoothly until suddenly the program 'hangs', but
> changing GC parameters will move the location of the 'hang'. Looking through
> the GC code I strongly suspect that the crossMap is ending up unaligned.
> Placing it after the cardMap (a 1-byte aligned array) seems quite risky, but
> I lack the expertise in this area of the code to be certain this is a
> problem.
>
> Can someone familiar with this code please check that
>  s->generationalMaps.crossMap =
>    (GC_crossMap) (s->heap.start + s->heap.size + cardMapSize);
> indeed must be >8 bytes aligned?

All of heap.size, cardMapSize, and crossMapSize are aligned according to
the system page size (s->sysvals.pagesize).  See sizeofCardMap and
sizeofCrossMap in <src>/runtime/gc/generational.c; under the current
parameters, they should always be the same size.  And heap.start is a
large mmap with a page aligned size, so should be page aligned.

If you run with '@MLton gc-messages', you'll see the address at which the
heap is allocated and the size of the card/cross map.

_______________________________________________
MLton mailing list
MLton@...
http://mlton.org/mailman/listinfo/mlton

Re: Card/cross map alignment broken?

by Wesley W. Terpstra :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Tue, Oct 13, 2009 at 9:14 PM, Matthew Fluet <mtf@...> wrote:
All of heap.size, cardMapSize, and crossMapSize are aligned according to the system page size (s->sysvals.pagesize).

Ok, but it is definitely updateCrossMap that is doing the unaligned accesses. The moment the GC switches to generational behaviour, the performance of the system dies. I'm trying to narrow the problem down further, but it's painful... 30 minutes per binary search step. :p

gdb says the address of the trap is...
(gdb) break *(void*)0x0000000120d843ac
Breakpoint 1 at 0x120d843ac: file gc/generational.c, line 373.
=> s->generationalMaps.crossMap[cardIndex] = (GC_crossMapElem)offset;
... but we all know how reliable gcc is in this regard.. o.O


_______________________________________________
MLton mailing list
MLton@...
http://mlton.org/mailman/listinfo/mlton

Re: Card/cross map alignment broken?

by Wesley W. Terpstra :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wed, Oct 14, 2009 at 2:10 AM, Wesley W. Terpstra <wesley@...> wrote:
Ok, but it is definitely updateCrossMap that is doing the unaligned accesses.

gdb says the address of the trap is...
(gdb) break *(void*)0x0000000120d843ac
Breakpoint 1 at 0x120d843ac: file gc/generational.c, line 373.
=> s->generationalMaps.crossMap[cardIndex] = (GC_crossMapElem)offset;

... shockingly, gdb was correct. It took me some time to wrap my mind around this, but there is indeed a bug in that line. Or rather, that line became buggy when r4683 tried to quell a warning. The warning was harmless, but the "fix" wasn't.

Here's what is going on:
  typedef uint8_t GC_crossMapStart __attribute__ ((aligned(4));
defines a type that is a uint8_t, which happens to be aligned on a 4-byte boundary. That's all fine and dandy. Make a pointer type to the object, well that pointer is 4-byte aligned. Swell.

Now deference that pointer at [1].

Uhm.

Well, the object type pointed to is aligned(4), so the (pointer+1) must be 4-byte aligned. I'm an alpha and loading a byte is expensive, but no problem, it's word aligned! Let's just load it using a word32 memory access. BAM! unaligned access.

Morale of the story? Don't mess around with __attribute__ ((aligned))! Two of the three places it is used in MLton are wrong and I'm not 100% sure the remaining use (pointer) is safe either. I assume that "pointer" is used only to refer to objects on the heap. That means that on an alpha it is actually 8-byte aligned. Telling the alpha it's less aligned than it actually is will either degrade performance or have no effect at all.

When you cast pointers you might get a warning, but gcc is also doing the 'right thing' for MLton's purposes. If I cast a char* to an int* gcc assumes that I know what I'm doing and that the char* pointer was really aligned to an int after all. That's why it issues a warning, to let you know it just increased its alignment assumptions. Telling gcc that the char* was 4-byte aligned quells the warning (on a 32-bit system) but doesn't improve the assembler. All it does is introduce potential bugs if you ever derefence that char* pointer at say [1] [2] or [3]. So there is absolutely nothing to gain (performance-wise) by declaring these types as aligned and everything to lose (correctness-wise).

As for the warnings, they aren't even warnings that are enabled by default in C! We turned them on with -Wcast-align. They are off by default because lots of correct C code (including ours) do casts like this. The align(4) hack also doesn't help on 64-bit ports...  I get flooded with these warnings on ia64 and alpha.

I am going to unilaterally remove all __attribute__ ((align))s and quell this class of warning by removing -Wcast-align. After my rant I assume no one dares to object. :)


_______________________________________________
MLton mailing list
MLton@...
http://mlton.org/mailman/listinfo/mlton

Re: Card/cross map alignment broken?

by Matthew Fluet-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Thanks for the explanation.  The use of aligning the "pointer" type
was meant for heap objects that are (at least) 4 byte aligned.

On Tue, Oct 13, 2009 at 9:43 PM, Wesley W. Terpstra <wesley@...> wrote:

> On Wed, Oct 14, 2009 at 2:10 AM, Wesley W. Terpstra <wesley@...>
> wrote:
>>
>> Ok, but it is definitely updateCrossMap that is doing the unaligned
>> accesses.
>> gdb says the address of the trap is...
>> (gdb) break *(void*)0x0000000120d843ac
>> Breakpoint 1 at 0x120d843ac: file gc/generational.c, line 373.
>> => s->generationalMaps.crossMap[cardIndex] = (GC_crossMapElem)offset;
>
> ... shockingly, gdb was correct. It took me some time to wrap my mind around
> this, but there is indeed a bug in that line. Or rather, that line became
> buggy when r4683 tried to quell a warning. The warning was harmless, but the
> "fix" wasn't.
>
> Here's what is going on:
>   typedef uint8_t GC_crossMapStart __attribute__ ((aligned(4));
> defines a type that is a uint8_t, which happens to be aligned on a 4-byte
> boundary. That's all fine and dandy. Make a pointer type to the object, well
> that pointer is 4-byte aligned. Swell.
>
> Now deference that pointer at [1].
>
> Uhm.
>
> Well, the object type pointed to is aligned(4), so the (pointer+1) must be
> 4-byte aligned. I'm an alpha and loading a byte is expensive, but no
> problem, it's word aligned! Let's just load it using a word32 memory access.
> BAM! unaligned access.
>
> Morale of the story? Don't mess around with __attribute__ ((aligned))! Two
> of the three places it is used in MLton are wrong and I'm not 100% sure the
> remaining use (pointer) is safe either. I assume that "pointer" is used only
> to refer to objects on the heap. That means that on an alpha it is actually
> 8-byte aligned. Telling the alpha it's less aligned than it actually is will
> either degrade performance or have no effect at all.
>
> When you cast pointers you might get a warning, but gcc is also doing the
> 'right thing' for MLton's purposes. If I cast a char* to an int* gcc assumes
> that I know what I'm doing and that the char* pointer was really aligned to
> an int after all. That's why it issues a warning, to let you know it just
> increased its alignment assumptions. Telling gcc that the char* was 4-byte
> aligned quells the warning (on a 32-bit system) but doesn't improve the
> assembler. All it does is introduce potential bugs if you ever derefence
> that char* pointer at say [1] [2] or [3]. So there is absolutely nothing to
> gain (performance-wise) by declaring these types as aligned and everything
> to lose (correctness-wise).
>
> As for the warnings, they aren't even warnings that are enabled by default
> in C! We turned them on with -Wcast-align. They are off by default because
> lots of correct C code (including ours) do casts like this. The align(4)
> hack also doesn't help on 64-bit ports...  I get flooded with these warnings
> on ia64 and alpha.
>
> I am going to unilaterally remove all __attribute__ ((align))s and quell
> this class of warning by removing -Wcast-align. After my rant I assume no
> one dares to object. :)
>
>
> _______________________________________________
> MLton mailing list
> MLton@...
> http://mlton.org/mailman/listinfo/mlton
>

_______________________________________________
MLton mailing list
MLton@...
http://mlton.org/mailman/listinfo/mlton