latest MLton segfault in gmp

latest MLton segfault in gmp

by Henry Cejtin

Running under Debian testing, I grabbed the latest MLton from SVN and
compiled it (using the last official MLton: 20070826) on an AMD-64
machine.

The make couldn't finish because the resulting mlton-compile segfaults
compiling mllex.mlb. In fact, that mlton-compile always segfaults, even
compiling a hello-world.sml. The gdb traceback shows that it died in
__gmpz_mul_2exp().

The version of gmp that I have is 2:4.3.1+dfsg-3.

Any ideas or suggestions?

_______________________________________________
MLton mailing list
MLton@...
http://mlton.org/mailman/listinfo/mlton

Re: latest MLton segfault in gmp

by Matthew Fluet-5

On Thu, 8 Oct 2009, Henry Cejtin wrote:

> Running  under  Debian  testing, I grabbed the latest MLton from SVN and
> compiled it (using the last  official  MLton:  20070826)  on  an  AMD-64
> machine.
>
> The  make  couldn't finish because the resulting mlton-compile segfaults
> compiling mllex.mlb.  In fact, that mlton-compile always segfaults, even
> compiling  a  hello-world.sml.   The gdb traceback shows that it died in
> __gmpz_mul_2exp().
>
> The version of gmp that I have is 2:4.3.1+dfsg-3.
>
> Any ideas or suggestions?

Can you run through the regression suite with mlton-20070826?  That would
determine if it is a pervasive problem with gmp on all programs with
IntInf.  There was a bug with the bytes-needed calculation for
IntInf.~>>, identified and fixed by Wesley in r7083,r7084,r7085:

--- a/basis-library/integer/int-inf0.sml
+++ b/basis-library/integer/int-inf0.sml
@@ -1203,7 +1203,7 @@ structure IntInf =
              if shift = 0wx0
                 then arg
                 else Prim.~>> (arg, shift,
-                              reserve (S.max (1, S.- (numLimbs arg, shiftSize shift)), 0))
+                              reserve (S.max (0, S.- (numLimbs arg, shiftSize shift)), 1))
        end

        fun mkBigCvt {base: Int32.int,

Perhaps there is a similar bug for IntInf.<<, though it looks o.k. to me.



Re: latest MLton segfault in gmp

by Wesley W. Terpstra

On Fri, Oct 9, 2009 at 4:21 AM, Matthew Fluet <mtf@...> wrote:
> On Thu, 8 Oct 2009, Henry Cejtin wrote:
>> The make couldn't finish because the resulting mlton-compile segfaults
>> compiling mllex.mlb. In fact, that mlton-compile always segfaults, even
>> compiling a hello-world.sml. The gdb traceback shows that it died in
>> __gmpz_mul_2exp().
>>
>> The version of gmp that I have is 2:4.3.1+dfsg-3.

I have reproduced this problem. I then copied my svn/HEAD compiler from sarge to squeeze, and it segfaulted as well, so something in squeeze has changed. The MinGW32 port self-compiles using gmp 4.3.1, and Debian hasn't patched gmp in any relevant way.

> Can you run through the regression suite with mlton-20070826?

Here are the highlights:

testing flexrecord.2:
Type error: actual and formal not of same type
actual: ('a_4068 * nat) * (nat * nat)
formal: (nat * nat) * (nat * nat)
expression: ZZZ_f x_0
unhandled exception: TypeError
compilation of flexrecord.2 failed with -type-check true

real-algsimp (which appears to be a fix added around r6241).
1a2
> true
3,4c4
< false
< false
---
> true
... another new regression

Error: test-create.sml 42.53.
  Function applied to incorrect argument.
    expects: [Unix.exit_status]
    but got: [?.PosixProcess.exit_status]
    in: statusToString status
... also a fix since 2007.

testing thread-switch-share
1,2c1,2
< size1 >= size2 = true
< sum1 = sum2 = true
---
> ./bin/regression: line 28: 13263 Segmentation fault      "./$f"
> Nonzero exit status.
testing thread-switch-size
1c1,2
< !rs > 0 = true
---
> ./bin/regression: line 28: 13301 Segmentation fault      "./$f"
> Nonzero exit status.
... more bugs fixed since 2007.

testing weak.2
DeepFlatten.replaceVar global_0
compilation of weak.2 failed with -type-check true
... not sure about this one?


Re: latest MLton segfault in gmp

by Matthew Fluet-5

On Fri, 9 Oct 2009, Wesley W. Terpstra wrote:

> On Fri, Oct 9, 2009 at 4:21 AM, Matthew Fluet <mtf@...> wrote:
>
>> On Thu, 8 Oct 2009, Henry Cejtin wrote:
>>
>>> The  make  couldn't finish because the resulting mlton-compile segfaults
>>> compiling mllex.mlb.  In fact, that mlton-compile always segfaults, even
>>> compiling  a  hello-world.sml.   The gdb traceback shows that it died in
>>> __gmpz_mul_2exp().
>>>
>>> The version of gmp that I have is 2:4.3.1+dfsg-3.
>>>
>>
> I have reproduced this problem. Then I copied my svn/HEAD compiler from
> sarge to squeeze and it segfaulted as well. Something in squeeze has
> changed. The MinGW32 port self-compiles using 4.3.1 and debian hasn't
> patched gmp in any relevant way.

My x86-darwin (using MacPorts gmp) also has 4.3.1, without problems.

> Can you run through the regression suite with mlton-20070826?
>>
>
> Here are the highlights:
>
> testing flexrecord.2:
> Type error: actual and formal not of same type
> actual: ('a_4068 * nat) * (nat * nat)
> formal: (nat * nat) * (nat * nat)
> expression: ZZZ_f x_0
> unhandled exception: TypeError
> compilation of flexrecord.2 failed with -type-check true

Test introduced by r6216 (20071127) and subsequently fixed.

> testing weak.2
> DeepFlatten.replaceVar global_0
> compilation of weak.2 failed with -type-check true

Test introduced by r6189 (20071120) and subsequently fixed.

So, no additional gmp bugs.  But, gmp reallocation bugs aren't always so
obvious, because we always give it to the end of the heap (even if we
limit checked for a smaller amount), so there is no error unless we really
are running into the end of the heap.  Could you try the regressions again
with '-debug true'?  That should throw an assertion if there really isn't
enough room in the heap.  You could also try to compile HEAD with 20070826
and '-debug true', which might be more revealing, since the resulting
mlton-compile always segfaults.

Of course, I'm surprised that you and Henry see the same behavior.  If it
is a bad limit check (due to a miscalculation of needed bytes for an
IntInf result), then it is usually so dependent upon the exact sequence of
GCs, which in turn is very sensitive to available memory and the exact
contents of the heap (which includes things like the string representation
of the path to the current executable, likely to be of different sizes on
different machines), that it is nearly impossible to recreate on another
machine.


Re: latest MLton segfault in gmp

by Henry Cejtin

I tried the stock AMD-64 version of MLton 20070826 on all the current
int-inf.* things and got no errors.

I'll compile the latest SVN with -debug true and see if that is any more
revealing.

Re: latest MLton segfault in gmp

by Wesley W. Terpstra

On Fri, Oct 9, 2009 at 5:16 PM, Matthew Fluet <mtf@...> wrote:
> But, gmp reallocation bugs aren't always so obvious

It's not a gmp reallocation bug. Here's what I know so far:
  * The input argument is '1' and the shift is by 128
  * We have 67 reserve bytes indicated
  * There is enough room for 33037511 more limbs, so not at the heap end.
  * The argument is allocated on the stack
  * The same parameters work several times before the segfault
  * gdb shows that the target IntInf has been filled correctly
  * /proc/*/maps show the memory address is in a valid range
  * It is dying on the MPN_ZERO line in mpz/mul_2exp.c
  * The memory is only 4-byte aligned at the point of failure

I've tried compiling with -align 8 and then it works... I'm not sure this is a solution, though; it may have just masked the problem.

Can you see if adding -align 8 to mlton/Makefile fixes it for you as well, Henry?


Re: latest MLton segfault in gmp

by Wesley W. Terpstra

On Sat, Oct 10, 2009 at 10:27 PM, Wesley W. Terpstra <wesley@...> wrote:
> I've tried compiling with -align 8 and then it works... I'm not sure this is a solution, though; it may have just masked the problem.

Found the smoking gun! Debian builds gmp with -O3 whereas I used -O2 for MinGW32. If you look at the assembler output of mpz/mul_2exp.c with the two options you will notice a difference... the introduction of a 'movdqa' instruction, which is an SSE2 instruction that expects 16-byte alignment.

From what I've read, an array of 64-bit words should be 64-bit aligned. MLton IntInfs are such arrays and must thus be 8-byte aligned. They aren't.

Here's the problematic vectorized assembler from gcc with -O3 (I've marked the problem code):

.LVL16:
        andl    $15, %eax
        shrq    $3, %rax
^^^^^^^^^^^ This ignores the 4-byte alignment of the array, only caring about its 8-byte alignment before it moves on to doing 16-byte aligned moves.
        cmpq    %r12, %rax
        cmova   %r12, %rax
        testq   %rax, %rax
        je      .L10
.LBB2:
        cmpq    %rax, %r12
        movq    $0, (%r14)
        leaq    8(%r14), %rdi
        leaq    -1(%r12), %rsi
        je      .L8
.L10:
        movq    %r12, %rbx
        subq    %rax, %rbx
        movq    %rbx, %rcx
        shrq    %rcx
        movq    %rcx, %r9
        addq    %r9, %r9
        je      .L16
        pxor    %xmm0, %xmm0
        leaq    (%r14,%rax,8), %r8
        xorl    %edx, %edx
        .p2align 4,,10
        .p2align 3
.L12:
        .loc 1 64 0
        movq    %rdx, %rax
        addq    $1, %rdx
        salq    $4, %rax
        cmpq    %rcx, %rdx
        movdqa  %xmm0, (%r8,%rax)
^^^^^^^^^^^^^^^^^^^^^^^^^ At this point the memory MUST be 16-byte aligned, but it isn't when the array is only 4-byte aligned (4 + 8 = 12 != 0 mod 16). This causes our segfault.
        jb      .L12
        subq    %r9, %rsi
        cmpq    %r9, %rbx
        leaq    (%rdi,%r9,8), %rdi
        je      .L8

What's the plan going forward? align(AMD64) == 8?


Re: latest MLton segfault in gmp

by Matthew Fluet-5

On Sat, 10 Oct 2009, Wesley W. Terpstra wrote:
> On Sat, Oct 10, 2009 at 10:27 PM, Wesley W. Terpstra <wesley@...>wrote:
>
>> I've tried compiling with -align 8 and then it works... I'm not sure this
>> is a solution, though; it may have just masked the problem.
>
> Found the smoking gun! Debian builds gmp with -O3 whereas I used -O2 for
> MinGW32. If you look at the assembler output of mpz/mul_2exp.c with the two
> options you will notice a difference... the introduction of a 'movdqa'
> instruction, which is an SSE2 instruction that expects 16-byte alignment.
...
> What's the plan going forward? align(AMD64) == 8?

Nice detective work.  However, from what you have described, it doesn't
seem as though defaulting to '-align 8' on amd64{-linux?} is sufficient.
That will guarantee that all objects are aligned on 8-byte boundaries, but
it won't guarantee that IntInf arrays are aligned on 16-byte boundaries.
Going to '-align 16' would waste quite a bit more space.

On the other hand, I don't see how the gmp.h header guarantees that a
mp_limb_t* is 16-byte aligned.  It is simply a pointer to a 64-bit
integer, so it seems that gcc can only assume that the allocated object
pointed to by a mp_limb_t* is 8-byte aligned.  Am I misunderstanding the
assembly?  Is it dynamically checking the alignment and using the SSE2
instructions only if it happens to be 16-byte aligned?  In which case,
then '-align 8' would be reasonable.


Re: latest MLton segfault in gmp

by Wesley W. Terpstra

On Sun, Oct 11, 2009 at 3:59 PM, Matthew Fluet <mtf@...> wrote:
> On Sat, 10 Oct 2009, Wesley W. Terpstra wrote:
>> On Sat, Oct 10, 2009 at 10:27 PM, Wesley W. Terpstra <wesley@...> wrote:
>>
>>> I've tried compiling with -align 8 and then it works... I'm not sure this
>>> is a solution, though; it may have just masked the problem.
>>
>> Found the smoking gun! Debian builds gmp with -O3 whereas I used -O2 for
>> MinGW32. If you look at the assembler output of mpz/mul_2exp.c with the two
>> options you will notice a difference... the introduction of a 'movdqa'
>> instruction, which is an SSE2 instruction that expects 16-byte alignment.
>> ...
>>
>> What's the plan going forward? align(AMD64) == 8?
>
> However, from what you have described, it doesn't seem as though defaulting
> to '-align 8' on amd64{-linux?} is sufficient.

No, 8-byte alignment is sufficient. The assembler from gcc assumes the pointer is 8-byte aligned and then promotes it to 16-byte alignment by testing bit 3 of the address (the 8s place) and, if needed, doing a single 8-byte operation before moving on to 16-byte operations.

> ... it won't guarantee that IntInf arrays are aligned on 16-byte boundaries

Unnecessary.

> On the other hand, I don't see how the gmp.h header guarantees that a
> mp_limb_t* is 16-byte aligned.

It doesn't.

> It is simply a pointer to a 64-bit integer, so it seems that gcc can only
> assume that the allocated object pointed to by a mp_limb_t* is 8-byte aligned.

That's exactly what it assumes.

> Am I misunderstanding the assembly?  Is it dynamically checking the alignment
> and using the SSE2 instructions only if it happens to be 16-byte aligned?

It dynamically checks the alignment and does a single 8-byte operation if necessary to move forward onto 16-byte aligned values.
 


Re: latest MLton segfault in gmp

by Wesley W. Terpstra

Before we change MLton's default alignment, I think there is another question that needs to be answered: is gcc's behaviour correct? I assume you (Matthew) read the amd64 ABI. What exactly are the rules for alignment of pointers/word64s? I assumed that they can be 4-byte aligned on amd64, but perhaps being inside an array requires they be 8-byte aligned?

If gcc is correct to assume that arrays containing word64s should be 8-byte aligned, then we should definitely move to -align 8 on all amd64 targets. However, if gcc is mistaken, then we should probably file a bug against gcc and leave -align 4 as is.


Re: latest MLton segfault in gmp

by Florian Weimer

* Wesley W. Terpstra:

> Before we change MLton's default alignment, I think there is another
> question that needs to be answered: is gcc's behaviour correct? I assume you
> (Matthew) read the amd64 ABI. What exactly are the rules for alignment of
> pointers/word64s? I assumed that they can be 4-byte aligned on amd64, but
> perhaps being inside an array requires they be 8-byte aligned?

8-byte alignment is preferable for performance reasons (so GCC follows
that), but it's not enforced by the hardware.

Re: latest MLton segfault in gmp

by Wesley W. Terpstra

On Tue, Oct 13, 2009 at 8:20 PM, Florian Weimer <fw@...> wrote:
> 8-byte alignment is preferable for performance reasons (so GCC follows
> that)

Sure, that's clear. gcc is welcome to emit aligned output. The question is if it is justified to assume that all other modules likewise align their output. That's what it did and that's why we crash.


Re: latest MLton segfault in gmp

by Florian Weimer

* Wesley W. Terpstra:

> Sure, that's clear. gcc is welcome to emit aligned output. The question is
> if it is justified to assume that all other modules likewise align their
> output. That's what it did and that's why we crash.

The AMD64 psABI supplement requires 8-byte alignment.  And the end of
the input area on the stack must be aligned to a 16 byte boundary upon
function entry. See <http://www.x86-64.org/documentation/abi-0.99.pdf>.

Re: latest MLton segfault in gmp

by Wesley W. Terpstra

On Wed, Oct 14, 2009 at 7:23 AM, Florian Weimer <fw@...> wrote:
> The AMD64 psABI supplement requires 8-byte alignment.  And the end of
> the input area on the stack must be aligned to a 16 byte boundary upon
> function entry. See <http://www.x86-64.org/documentation/abi-0.99.pdf>.

The stack was actually already aligned, just not the GMP limb array in MLton's heap.

Still, if requiring this 8-byte alignment is sanctioned, we should just go with -align 8 and be safe.

Since Matthew is the AMD64 expert I'm mostly waiting to hear his opinion before I commit a fix.


Re: latest MLton segfault in gmp

by Matthew Fluet-4

I'm hardly an expert.  I used the www.x86-64.org document to implement
the C calling convention in the native codegen, but didn't peruse it
much otherwise.

Searching for "align" in the document, though, reveals that on page
12, it declares that  {,signed,unsigned} {,long} long  all have 8-byte
alignment.  However, on the next page it states:

 Like the Intel386 architecture, the AMD64 architecture in general does not
 require all data accesses to be properly aligned. Misaligned data accesses
 are slower than aligned accesses but otherwise behave identically. The only
 exceptions are that __m128 and __m256 must always be aligned properly.

So, it isn't clear to me that one really needs to 8-align 64-bit integers.

In the next subsection (p. 13), on aggregates and unions, it states:

 An array uses the same alignment as its elements, except that a local or
 global array variable of length at least 16 bytes or a C99 variable-length
 array variable always has alignment of at least 16 bytes.

This seems to suggest that gcc is within its rights to assume that an
array of {,long} long-s is 16-byte aligned (if it has at least 2
elements).  MLton would have a hard time supporting that short of
implementing '-align 16'.

I suggest that going with '-align 8' (as Wesley committed earlier
today) is a reasonable way to go.

I did want to also point out that there is a legacy issue, I would
assume, on Debian.  Since mlton-20070826 is dynamically linked against
libgmp, isn't it just incredible luck of the draw that a
self-compile with mlton-20070826 didn't happen to produce a
non-16-byte aligned IntInf array?

On Wed, Oct 14, 2009 at 5:57 AM, Wesley W. Terpstra <wesley@...> wrote:

> On Wed, Oct 14, 2009 at 7:23 AM, Florian Weimer <fw@...> wrote:
>>
>> The AMD64 psABI supplement requires 8-byte alignment.  And the end of
>> the input area on the stack must be aligned to a 16 byte boundary upon
>> function entry. See <http://www.x86-64.org/documentation/abi-0.99.pdf>.
>
> The stack was actually already aligned, just not the GMP limb array in
> MLton's heap.
>
> Still, if requiring this 8-byte alignment is sanctioned, we should just go
> with -align 8 and be safe.
>
> Since Matthew is the AMD64 expert I'm mostly waiting to hear his opinion
> before I commit a fix.

Re: latest MLton segfault in gmp

by Wesley W. Terpstra

On Wed, Oct 14, 2009 at 11:56 PM, Matthew Fluet <matthew.fluet@...> wrote:
> I'm hardly an expert.  I used the www.x86-64.org document to implement
> the C calling convention in the native codegen, but didn't peruse it
> much otherwise.

Nice link, thanks.

> Searching for "align" in the document, though, reveals that on page
> 12, it declares that  {,signed,unsigned} {,long} long  all have 8-byte
> alignment.

Ok, that table is pretty clear. The ABI defines that Word64s must be 8-byte aligned. Therefore gcc was within its rights to assume that the pointer was 8-byte aligned and the bug was ours.

> However, on the next page it states:
>
>  Like the Intel386 architecture, the AMD64 architecture in general does not
>  require all data accesses to be properly aligned. Misaligned data accesses
>  are slower than aligned accesses but otherwise behave identically. The only
>  exceptions are that __m128 and __m256 must always be aligned properly.

This is not a contradiction. Architecture != ABI. The machine can do it, but the ABI forbids it.

> So, it isn't clear to me that one really needs to 8-align 64-bit integers.

If we want to link with any other application code ... libc, libgmp, ffi, ... then it's 100% clear we need to do 8-byte alignment. We have just been lucky that no other software actually made use of the 8-byte alignment guarantee until now (since few architectural limitations actually trip over an ABI violation).

> In the next subsection (p. 13), on aggregates and unions, it states:
>
>  An array uses the same alignment as its elements, except that a local or
>  global array variable of length at least 16 bytes or a C99 variable-length
>  array variable always has alignment of at least 16 bytes.

I think by global/local arrays they mean arrays not in the heap but the data segment. (local = static int64_t foo[4];, global = extern int64_t foo[4];)

At any rate, this sounds like we don't need to worry because MLton only passes arrays as pointers (both FFI and GMP limb structure).

> I did want to also point out that there is a legacy issue, I would
> assume, on Debian.  Since mlton-20070826 is dynamically linked against
> libgmp, isn't it just an incredible luck of the draw that a
> self-compile with mlton-20070826 didn't happen to produce a
> non-16-byte aligned IntInf array.

Yes, I was surprised too. However, there are a couple of reasons this worked out. First, the only code gcc managed to vectorize in the gmp C is the MPN_ZERO method. Second, the only place MPN_ZERO gets called (for us) is to clear the low bits of a left-shifted intinf. Third, it won't use 16-byte writes unless there are 16 bytes to write, so it had to be a >=128-bit left shift. I wonder if these maybe didn't happen in 20070826?

I imagine that as gcc gets smarter, vectorizing more code, this will become a more serious legacy issue.


Re: latest MLton segfault in gmp

by Matthew Fluet-4

On Wed, Oct 14, 2009 at 6:30 PM, Wesley W. Terpstra <wesley@...> wrote:

> On Wed, Oct 14, 2009 at 11:56 PM, Matthew Fluet <matthew.fluet@...>
> wrote:
>> In the next subsection (p. 13), on aggregates and unions, it states:
>>
>>  An array uses the same alignment as its elements, except that a local or
>>  global array variable of length at least 16 bytes or a C99 variable-length
>>  array variable always has alignment of at least 16 bytes.
>
> I think by global/local arrays they mean arrays not in the heap but the data
> segment. (local = static int64_t foo[4];, global = extern int64_t foo[4];)
>
> At any rate, this sounds like we don't need to worry because MLton only
> passes arrays as pointers (both FFI and GMP limb structure).

Agreed.  I missed that it was "array *variable* of length at least 16
bytes", not any array (memory object).

>> I did want to also point out that there is a legacy issue, I would
>> assume, on Debian.  Since mlton-20070826 is dynamically linked against
>> libgmp, isn't it just an incredible luck of the draw that a
>> self-compile with mlton-20070826 didn't happen to produce a
>> non-16-byte aligned IntInf array.
>
> Yes, I was surprised too. However there are a couple reasons this worked
> out. First, the only code gcc managed to vectorize in the gmp C is the
> MPN_ZERO method. Second, the only place MPN_ZERO gets called (for us) is to
> clear the low bits of a left-shifted intinf. Third, it won't use 16-byte
> writes unless there are 16-bytes to write, so it had to be a >=128-bit left
> shift. I wonder if these maybe didn't happen in 20070826?

I'll bet it is r6253 (20071209):
  http://mlton.org/cgi-bin/viewsvn.cgi?rev=6253&view=rev
That limits constant folding to IntInfs in the range -2^128 -- 2^128,
and introduces an explicit calculation of 2^128 as IntInf.<<(1,0w128).
