Crashes with 64-bit native code generator on Windows

Hi all,

We are still trying to figure out why our code crashes (Windows brings up an error message box saying that the application was terminated) when compiled with the native 64-bit codegen on Windows. We were able to break down the code a bit, but unfortunately not enough to produce an example small enough to share here. Removing more (mostly unrelated) code makes the crash go away.

In our testing we have confirmed the following:

- The crashes NEVER occur before the first FFI call. Removing all FFI calls makes the application work without crashes (as well as possible without the functionality that would be provided by the FFI calls).
- Sometimes the crashes occur on the first FFI call, sometimes some time after the call (within ML code), sometimes on the second or third FFI call. This changes randomly depending on which code we include.
- The crashes are not caused by our user code in the FFI functions. We have removed all code from the bodies of those functions, leaving only a simple return statement.
- Our FFI DLLs do not have entry functions that would be called when the DLL is loaded.
- The crashes do not occur if MLton's C code generator is used.

We were able to create an example that only uses a single FFI call and crashes on the first call to that function. I have consolidated the (partially ml-nlffigen generated) code and listed it below. Please let me know if you find any problem in the code. Please don't mind the useless conversions in the "cp" function within "foo". In the real code they are partially within some compatibility wrapper code, and removing them completely makes the crash go away. I cannot see why these conversions should cause a crash.

The code below does cause the crash when called from within our (large) code. It does not produce a crash when called within a small example. As mentioned, this is the only FFI call that is actually called by the code. We do have to include another function making an FFI call in order to make the crash happen. However, that call is never executed before the crash. It could be executed some time later. If that second FFI code is not present, the crash does not happen.

My question basically is this: do you have any suggestions on how to debug this any further? Any MLton command-line options for debugging? Are there any optimization passes that we should try to disable? Do you know of any caveats that we might have missed when creating our DLLs?

Any suggestions are welcome.

Best regards,

David

C code:
-------
__declspec(dllexport) int __stdcall foo(const char *p, int f, const void *buf, int sz)
{
  return 1;
}

ML code:
--------
structure F_foo = struct
  val lib = DynLinkage.open_lib {name = "bar.dll", lazy=true, global=false}
  val h = DynLinkage.lib_symbol (lib, "foo")
  val callop =
    _import * : CMemory.addr -> CMemory.cc_addr * CMemory.cc_sint *
                CMemory.cc_addr * CMemory.cc_sint -> CMemory.cc_sint;
  fun mkcall a (x1, x2, x3, x4) =
    C_Int.Cvt.c_sint (CMemory.unwrap_sint
      (callop a (CMemory.wrap_addr (C_Int.reveal (C_Int.Ptr.inject' x1)),
                 CMemory.wrap_sint (C_Int.Cvt.ml_sint x2),
                 CMemory.wrap_addr (C_Int.reveal x3),
                 CMemory.wrap_sint (C_Int.Cvt.ml_sint x4))))
  fun f' (x1 : C_Int.ro C_Int.uchar_obj C_Int.ptr',
          x2 : MLRep.Int.Signed.int,
          x3 : C_Int.voidptr,
          x4 : MLRep.Int.Signed.int) : MLRep.Int.Signed.int =
    C_Int.Cvt.ml_sint
      (C_Int.call (C_Int.mk_fptr (mkcall, DynLinkage.addr h),
                   (x1, C_Int.Cvt.c_sint x2, x3, C_Int.Cvt.c_sint x4)))
end

fun foo (pp : string, f : bool, c : Word8.word vector) : bool =
  let
    val _ = print "foo-start\n"
    val sz = Vector.length c
    val buf = C.alloc' C.S.uchar (Word.fromInt sz)
    fun cp (i, p) =
      if i >= sz then ()
      else (C.Set.uchar' (C.Ptr.|*! p,
                          Word8.fromLargeInt (Word32.toLargeInt
                            (Word32.fromLargeInt (Word8.toLargeInt
                              (Vector.sub (c, i))))));
            cp (i+1, C.Ptr.|+! C.S.uchar (p, 1)))
  in
    cp (0, buf);
    F_foo.f' (C.Ptr.null', if f then 1 else 0, C.Ptr.inject' buf, Int32.fromInt sz);
    C.free' buf;
    print "foo-end\n";
    true
  end

--
----------------------------------------------------------
David Hansel
http://www.reactive-systems.com/
OpenPGP (GnuPG) public key file:
http://www.reactive-systems.com/~hansel/pgp_public_key.txt
----------------------------------------------------------

_______________________________________________
MLton mailing list
MLton@...
http://mlton.org/mailman/listinfo/mlton

Re: Crashes with 64-bit native code generator on Windows

Sorry for the slow reply.

On Wed, Nov 11, 2009 at 5:10 AM, David Hansel <hansel@...> wrote:
> The code below does cause the crash when called from within our (large)
> code. It does not produce a crash when called within a small example.
> As mentioned, this is the only FFI call that is actually called by
> the code. We do have to include another function making an FFI call
> in order to make the crash happen. However, that call is never executed
> before the crash. It could be executed some time later. If that second
> FFI code is not present, the crash does not happen.

I went ahead and tried to build it. I compiled the C program (bar.c) using:

  x86_64-w64-mingw32-gcc -Wall -O2 -o bar.dll -shared -Wl,--out-implib,bar.a -Wl,--output-def,bar.def bar.c

I wrote the following baz.mlb file:

  $(SML_LIB)/basis/basis.mlb
  $(SML_LIB)/mlnlffi-lib/mlnlffi-lib.mlb
  $(SML_LIB)/mlnlffi-lib/memory/memory.mlb
  $(SML_LIB)/mlnlffi-lib/internals/c-int.mlb
  ann "allowFFI true" in
    baz.sml
  end

Then I compiled with:

  mlton -target x86_64-w64-mingw32 -link-opt -ldl -verbose 1 baz.mlb

The resulting program worked. Are you using similar compile options? In the time since your last post have you perhaps found a more complete crash example?

> My question basically is this: do you have any suggestions on how to
> debug this any further? Any MLton command-line options for debugging?

Well, there's -debug true, but gdb under 64-bit Windows is so flaky I wouldn't bother trying that. In fact, the MLton.msi doesn't include the debug version of the runtime (it is over 200MB due to the Windows debugging format), so you would need to build MLton from source to get the debug library. I doubt it would help you, though.

> Are there any optimization passes that we should try to disable?

I doubt this is an optimization problem.

> Do you know of any caveats that we might have missed when creating our
> DLLs?

OK, here are the things I can think of off the top of my head:

0) You're loading a 32-bit DLL instead of a 64-bit one. Double check.

1) Windows might require a stack alignment that doesn't match the amd64 FFI codegen. Your program happens to end up with bad alignment, and my programs have just never been unlucky. You could declare a volatile local 64-bit variable and printf its address in the C code. See if the offset of this variable fails to be 64-bit aligned (only) in the failing programs.

2) The __stdcall is confusing gcc. There is only one calling convention under win64. Try specifying nothing.

However, I am guessing blind! Without a way to reproduce this I can't really help. I've used the FFI quite heavily under win64 in one of our recent projects without problems, so FFI definitely works most of the time. It's possible you've found a corner case, which can often be an alignment problem.

Is the program really too secret to release the buggy part of its source code? MLton is free. ;)
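
The alignment probe suggested in point 1 above can be sketched in C roughly as follows. This is a minimal sketch, not code from the thread: the names `is_aligned` and `foo_probe` are hypothetical, and `foo_probe` only mirrors the argument list of the `foo` function posted earlier. It prints the address of a volatile 64-bit local and whether that address is a multiple of 8.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Returns 1 if address p is a multiple of align, else 0.
 * align must be a non-zero power of two. */
static int is_aligned(const void *p, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Hypothetical stand-in for the DLL's foo: same arguments, but it
 * reports where one of its own 64-bit locals ended up on the stack. */
int foo_probe(const char *p, int f, const void *buf, int sz)
{
    volatile int64_t probe = 0;   /* volatile so it is not optimized away */

    (void)p; (void)f; (void)buf; (void)sz;   /* arguments unused here */

    printf("probe at %p, 8-byte aligned: %d\n",
           (void *)&probe, is_aligned((const void *)&probe, 8));
    fflush(stdout);   /* flush so the line survives a later crash */
    return 1;
}
```

Comparing the printed result between a crashing and a non-crashing build would show whether the two differ in stack alignment. The probe only helps if execution actually reaches the function body.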

Re: Crashes with 64-bit native code generator on Windows

Hello Wesley,

Wesley W. Terpstra wrote:
> Sorry for the slow reply.
>
> On Wed, Nov 11, 2009 at 5:10 AM, David Hansel <hansel@...> wrote:
>> The code below does cause the crash when called from within our (large)
>> code. It does not produce a crash when called within a small example.
>> As mentioned, this is the only FFI call that is actually called by
>> the code. We do have to include another function making an FFI call
>> in order to make the crash happen. However, that call is never executed
>> before the crash. It could be executed some time later. If that second
>> FFI code is not present, the crash does not happen.
>
> I went ahead and tried to build it.
> [...]
> The resulting program worked. Are you using similar compile options?

We are using Microsoft Visual C++ to create our DLLs so (just in case there was some obscure compiler setting that we were missing) I gave it a try and compiled our DLL with gcc -- which didn't change anything.

Our MLton command line is:

  mlton @MLton gc-summary hash-cons 1.0 -- -target x86_64-w64-mingw32 -codegen native -profile no -profile-stack false -const 'Exn.keepHistory false' -drop-pass deepFlatten -link-opt -ldl -output foo.exe -verbose 2 foo.mlb

I do not know exactly what the '-drop-pass deepFlatten' does, but it was put in by Stephen Weeks back in 2006 when he assisted us in making our code compile with MLton. If I remember correctly there was a compiler performance issue. However, as you said before, optimization settings are probably not the problem here.

> In the time since your last post have you perhaps found a more
> complete crash example?

Unfortunately no. I have been trying, but the crash goes away any time I cut down the code some more to produce a smaller example.

>> My question basically is this: do you have any suggestions on how to
>> debug this any further? Any MLton command-line options for debugging?
>
> Well, there's -debug true, but gdb under 64-bit windows is so flakey I
> wouldn't bother trying that. In fact, the MLton.msi doesn't include
> the debug version of the runtime (it is over 200MB due to the windows
> debugging format), so you would need to build MLton from source to get
> the debug library. I doubt it would help you, though.

That's unfortunate.

>> Are there any optimization passes that we should try to disable?
>
> I doubt this is an optimization problem.
>
>> Do you know of any caveats that we might have missed when creating our
>> DLLs?
>
> Ok, here are the things I can think of from the top of my head:
> 0) You're loading a 32-bit dll instead of a 64-bit one. Double check.

Double- and triple-checked that.

> 1) Windows might require a stack alignment that doesn't match the
> amd64 FFI codegen. Your program happens to end up with bad alignment,
> and my programs have just never been unlucky. You could declare a
> volatile local 64-bit variable and printf it's address in the C code.
> See if the offset of this variable fails to be 64-bit aligned (only)
> in the failing programs.

An alignment problem or something similar is what I suspect, too. Creating a local variable won't help because the process dies even before the first time it enters the code in the DLL, so any printf in there will not happen before the crash.

> 2) The __stdcall is confusing gcc. There is only one calling
> convention under win64. Try specifying nothing.

I've tried with and without. No difference.

> However, I am guessing blind! Without a way to reproduce this I can't
> really help. I've used the FFI quite heavily under win64 in one of our
> recent projects without problems, so FFI definitely works most of the
> time. It's possible you've found a corner case, which can often be an
> alignment problem.
> Is the program really too secret to release the buggy part of its
> source code? MLton is free. ;)

It's good to hear that the FFI has been tested in win64. I completely understand about the guessing; we do have the same problems with our customer (our product not working with their code, can't send the code). Unfortunately, since this is a commercial application and we do have to include a large part of our code to make the crash happen, I can definitely not post the code to the list. If we can't figure this out otherwise we might be able to set up an NDA with you so we could send the code to you in private.

One thing I can make available is the executable that actually experiences the crash as well as the MLton-produced assembly code. I don't know what kind of debugging tools you have available and whether that would be any help. Please let me know.

There are two observations that I have made since my last post. They may or may not be related to the actual problem but I thought I'd mention them anyway:

I was looking into what could be causing the problem and came across file MLton/lib/mlton/sml/mlnlffi-lib/memory/linkage-libdl.sml, which is of course used by the FFI. I wasn't completely sure what the "era" deal in that code is, so I changed the body of function "get" to just "f()", resolving the FFI function's address before every call. After that change, all crashes were gone. Furthermore, changing the body of "get" to just "a" does NOT fix the crashes. That looked good, so I added some "print" statements in "get" to see whether there is a problem with the address not being resolved properly. Unfortunately, just adding the "print" statements also made the crashes go away. In fact, just adding 'print "";' at the beginning of "get" eliminates the crashes. Interestingly, this eliminates the crashes completely. With other changes in our code I was able to eliminate some instances of the crashes but new ones would pop up at other places. I suspect that the proximity of this code to the actual FFI calls might play a role in that.

I gave the "Debugging Tools for Windows" debugger a try and loaded the crashing executable there. With that, I was able to track the crash in our simplest example to the following assembly code:

00000000`0054b2c8 4c897df8        mov  qword ptr [rbp-8],r15
00000000`0054b2cc 48892d3d408000  mov  qword ptr [rsim4c_mlton!MLton_main+0x402901 (00000000`0080403d)],rbp
00000000`0054b2d3 4c892526408000  mov  qword ptr [rsim4c_mlton!MLton_main+0x4028ea (00000000`00804026)],r12
00000000`0054b2da ff15683d8000    call qword ptr [rsim4c_mlton!MLton_main+0x40262c (00000000`00803d68)] ds:00000000`00d4f048=0000000000000000

Note the "=0" address at the end. The crash happens because the target address of the indirect call is 0, which could be some hint, but I don't know how to look into this any further. Do you have a suggestion how to track this back to MLton's assembly output or even to the original ML code?

Best regards,

David

Re: Crashes with 64-bit native code generator on Windows

Hi again,

I have a few more observations regarding the crashes that might ring a bell for some of the MLton developers involved in the native code generators and/or FFI interface:

1) In my previous post I included a disassembly of the location where the crash happens. With some creative grep-ing I was able to find the location of that code within the assembly code that MLton produces for our program:

_L_176132:
        movq (_c_stackP+0x0)(%rip),%rsp
        movq 0x40(%rbp),%r14
        movq 0x0(%r14),%r13
        movq 0x0(%r13),%r11
        movl %r15d,%r9d
        xorq %r8,%r8
        movl $0x0,%r15d
        movl %r15d,%edx
        xorq %rcx,%rcx
        subq $0x20,%rsp
        addq $0x40,%rbp
        leaq (_L_176133+0x0)(%rip),%r15
        movq %r15,0xFFFFFFFFFFFFFFF8(%rbp)
        movq %rbp,(_gcState+0x10)(%rip)
        movq %r12,(_gcState+0x0)(%rip)
        call *(_applyFFTempFun+0x0)(%rip)     <<------- CRASH
        addq $0x20,%rsp
        movq (_gcState+0x0)(%rip),%r12
        movq (_gcState+0x10)(%rip),%rbp
        jmp _L_176133

I was able to reproduce this with several examples for which the crash occurs (all of which unfortunately include a large part of our code so I cannot make them available here). The crash always occurs in the "*applyFFTempFun" call and always because applyFFTempFun is NULL.

I am reasonably sure that this actually is the location of the crash and not just some similar looking code because if I comment out the "call" statement in the assembly code and then compile the executable from the assembly code, the crash goes away (the target of the FFI call in question is an empty function that does not do anything, so it is not surprising that skipping the call does not produce other problems). Also, if I introduce an infinite loop right before the call, the compiled program hangs right before where I would usually observe the crash.

2) As I mentioned before, if I compile the program from the SML code and just insert a "print" statement in function "get" within MLton/lib/mlton/sml/mlnlffi-lib/memory/linkage-libdl.sml, the crash also does not occur. Interestingly, the MLton-produced assembly code for that version (the only change is the "print" statement) does not contain ANY calls to "applyFFTempFun".

3) Looking at the MLton source code (amd64-generate-transfers.fun), I can see that calls to "applyFFTempFun" seem to be inserted for "Indirect" FFI calls. I do not know enough about the code generator or the FFI interface to make much sense out of this. However, I can see that the MLton-produced code with the crash only contains a call to "applyFFTempFun" (which I assume is created in line 1566 of file amd64-generate-transfers.fun) but never any code that would set the value of "applyFFTempFun" (which I assume should be created in line 1183 of file amd64-generate-transfers.fun).

Given these observations, does anyone have any suggestions about MLton debugging options or other ways to shed more light on what might be going wrong here?

Thanks,

David

Re: Crashes with 64-bit native code generator on Windows

On Mon, Nov 23, 2009 at 8:39 PM, David Hansel <hansel@...> wrote:
> I was looking into what could be causing the problem and came across
> [...]

The "era" is supposed to invalidate dynamically loaded library addresses if the executable is started up again after saving the world (MLton.World.save). Because it is a new executable invocation, the dynamically linked library needs to be reloaded and it might end up at a different address. This isn't actually done in the linkage-libdl.sml code; see the commented out "Cleaner.addNew" application. I don't recall why it is disabled. In any case, unless you are saving and loading worlds, it shouldn't affect your code.

> After [...]

This, and your next email, suggest that it is a bug with the native codegen. The probable role that the proximity of the "print" call plays is that there will be a C function call invoked by the "print", which "resets" the register allocator. Without the "print" call, there is a wider scope over which the register allocator is able to work, and, apparently, it is mistakenly dropping a def.
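
The "era" idea described above can be modelled in a few lines of C. This is a minimal sketch under stated assumptions, not a translation of linkage-libdl.sml (whose source is not shown in the thread): all names (`cached_addr`, `cached_get`, `new_era`, `demo_resolve`) are hypothetical, and the resolver is a dummy standing in for a real symbol lookup. A cached address is reused only while the era in which it was resolved is still current.

```c
#include <assert.h>
#include <stddef.h>

typedef void *(*resolver_fn)(void);

/* Bumped whenever previously resolved addresses may have become stale,
 * for example after the program is restarted from a saved world. */
static unsigned long current_era = 0;

struct cached_addr {
    unsigned long era;     /* era in which addr was resolved */
    void *addr;            /* NULL until the first resolution */
    resolver_fn resolve;   /* performs the (expensive) symbol lookup */
};

/* Return the cached address, re-resolving it if it was never resolved
 * or was resolved in an earlier era. */
static void *cached_get(struct cached_addr *c)
{
    assert(c != NULL && c->resolve != NULL);
    if (c->addr == NULL || c->era != current_era) {
        c->addr = c->resolve();
        c->era = current_era;
    }
    return c->addr;
}

/* Invalidate every cached address at once. */
static void new_era(void)
{
    current_era++;
}

/* Dummy resolver for demonstration: counts how often it is called. */
static int resolve_calls = 0;
static int dummy_target;

static void *demo_resolve(void)
{
    resolve_calls++;
    return &dummy_target;
}
```

In this model, initializing a `struct cached_addr` with `demo_resolve` and calling `cached_get` twice triggers a single lookup; after `new_era()`, the next `cached_get` looks the address up again.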

Re: Crashes with 64-bit native code generator on Windows

On Mon, Nov 30, 2009 at 11:00 AM, David Hansel <hansel@...> wrote:
> 1) In my previous post I included a disassembly of the location where the
> [...]

I agree that that seems to pinpoint the source of the crash.

> 2) As I mentioned before, if I compile the program from the SML
> [...]

More evidence. Although, in this case, I suspect that there is still an indirect call in the assembly code. It simply doesn't go through the temporary variable; the address gets allocated to a register and stays there.

> 3) Looking at the MLton source code (amd64-generate-transfers.fun), I can see
> [...]

Sounds like a bug in the amd64 codegen simplifier and/or register allocator. It seems that somewhere along the line, the definition of the applyFFTempFun variable is being dropped, but the use in the indirect call is being retained. When the register allocator comes along and doesn't locally find the def point of applyFFTempFun, it has to fetch the value from the (uninitialized) variable.

Could you compile with "-native-commented 3 -native-split 0 -keep g" and post the basic block that has the call through applyFFTempFun? It will be pretty noisy, but should shed some light on what the native codegen is doing (wrong).

Re: Crashes with 64-bit native code generator on Windows

Hi Matthew,

Matthew Fluet wrote:
> [...]
> Sounds like a bug in the amd64 codegen simplifier and/or register
> allocator. It seems that somewhere along the line, the definition of
> the applyFFTempFun variable is being dropped, but the use in the
> indirect call is being retained. When the register allocator comes
> along, when it doesn't locally find the def point of applyFFTempFun, it
> has to fetch the value from the (uninitialized) variable.
>
> Could you compile with "-native-commented 3 -native-split 0 -keep g" and
> post the basic block that has the call through applyFFTempFun? It will
> be pretty noisy, but should shed some light on what the native codegen
> is doing (wrong).

See the code below. It should match up with the code I posted before. From what I can tell it does look like MLton puts the target address for applyFFTempFun into a register but then later does the indirect call via the memory location.

Please let me know if you need any more context or other debugging information. It does seem like you are on the right track.

Thanks!

David

/* Live: (SW64(24): ExnStack, SW32(40): Word32, SP(64): Objptr (opt_1516), SP(48): Objptr (opt_36)) */
/* begin: RP(0): Objptr (opt_22) = OP (SP(64): Objptr (opt_1516), 0): Objptr (opt_22) */
/* end: RP(0): Objptr (opt_22) = OP (SP(64): Objptr (opt_1516), 0): Objptr (opt_22) */
/* begin: RQ(0): CPointer = OQ (RP(0): Objptr (opt_22), 0): CPointer */
/* end: RQ(0): CPointer = OQ (RP(0): Objptr (opt_22), 0): CPointer */
/* CCall {args = (RQ(0): CPointer, NULL, 0x0, NULL, SW32(40): Word32), frameInfo = Some {frameLayoutsIndex = 1072}, func = {args = (CPointer, CPointer, Word32, CPointer, Word32), bytesNeeded = None, convention = cdecl, ensuresBytesFree = false, mayGC = true, maySwitchThreads = false, modifiesFrontier = true, prototype = {args = (CPointer, Int32, CPointer, Int32), res = Some Int32}, readsStackTop = true, return = Word32, symbolScope = external, target = <*>, writesStackTop = true}, return = Some L_176133} */
/* begin ccall: cdecl <*> */
/* CCALL cdecl <*>(MEM<q>{Heap}[(MEM<q>{Heap}[(MEM<q>{Stack}[(MEM<q>{GCStateHold}[((_gcState+0x10))+(0x0)])+(0x40)])+(0x0)])+(0x0)], $0x0, $0x0, $0x0, MEM<l>{Stack}[(MEM<q>{GCStateHold}[((_gcState+0x10))+(0x0)])+(0x28)]) <Some _L_176133> */
/* ************************************************************ */
/* Cache: caches: MEM<q>{StaticNonTemp}[(_c_stackP)+(0x0)] -> %rsp (reserved) */
        movq (_c_stackP+0x0)(%rip),%rsp
/* ************************************************************ */
/* movq MEM<q>{Heap}[(MEM<q>{Heap}[(MEM<q>{Stack}[(MEM<q>{GCStateHold}[((_gcState+0x10))+(0x0)])+(0x40)])+(0x0)])+(0x0)],MEM<q>{CArg}[(_applyFFTempFun)+(0x0)] */
        movq 0x40(%rbp),%r14
        movq 0x0(%r14),%r13
        movq 0x0(%r13),%r11
/* ************************************************************ */
/* movzlq MEM<l>{Stack}[(MEM<q>{GCStateHold}[((_gcState+0x10))+(0x0)])+(0x28)],MEM<q>{CArg}[(_applyFFTempRegArg)+(0x0)] */
        movl %r15d,%r9d
/* ************************************************************ */
/* Cache: caches: MEM<q>{CArg}[(_applyFFTempRegArg)+(0x0)] -> %r9 (reserved) */
/* ************************************************************ */
/* movq $0x0,MEM<q>{CArg}[(_applyFFTempRegArg)+(0x8)] */
        xorq %r8,%r8
/* ************************************************************ */
/* Cache: caches: MEM<q>{CArg}[(_applyFFTempRegArg)+(0x8)] -> %r8 (reserved) */
/* ************************************************************ */
/* movzlq $0x0,MEM<q>{CArg}[(_applyFFTempRegArg)+(0x10)] */
        movl $0x0,%r15d
        movl %r15d,%edx
/* ************************************************************ */
/* Cache: caches: MEM<q>{CArg}[(_applyFFTempRegArg)+(0x10)] -> %rdx (reserved) */
/* ************************************************************ */
/* movq $0x0,MEM<q>{CArg}[(_applyFFTempRegArg)+(0x18)] */
        xorq %rcx,%rcx
/* ************************************************************ */
/* Cache: caches: MEM<q>{CArg}[(_applyFFTempRegArg)+(0x18)] -> %rcx (reserved) */
/* ************************************************************ */
/* subq $0x20,MEM<q>{StaticNonTemp}[(_c_stackP)+(0x0)] */
        subq $0x20,%rsp
/* ************************************************************ */
/* Force: commit_memlocs: commit_classes: remove_memlocs: remove_classes: dead_memlocs: dead_classes: */
/* ************************************************************ */
/* addq $0x40,MEM<q>{GCStateHold}[((_gcState+0x10))+(0x0)] */
        addq $0x40,%rbp
/* ************************************************************ */
/* leaq MEM<q>{Code}[(_L_176133)+(0x0)],MEM<q>{Stack}[(MEM<q>{GCStateHold}[((_gcState+0x10))+(0x0)])+(0xFFFFFFFFFFFFFFF8)] */
        leaq (_L_176133+0x0)(%rip),%r15
        movq %r15,0xFFFFFFFFFFFFFFF8(%rbp)
        movq %rbp,(_gcState+0x10)(%rip)
/* ************************************************************ */
/* Force: commit_memlocs: MEM<q>{Stack}[(MEM<q>{GCStateHold}[((_gcState+0x10))+(0x0)])+(0xFFFFFFFFFFFFFFF8)] commit_classes: remove_memlocs: remove_classes: dead_memlocs: dead_classes: */
/* ************************************************************ */
/* Force: commit_memlocs: commit_classes: GCStateVolatile GCState CStatic Globals Stack Heap Code CStack remove_memlocs: remove_classes: dead_memlocs: dead_classes: */
/* ************************************************************ */
/* Force: commit_memlocs: commit_classes: GCStateVolatile GCStateHold GCState Globals Stack Heap remove_memlocs: remove_classes: dead_memlocs: dead_classes: */
        movq %r12,(_gcState+0x0)(%rip)
/* ************************************************************ */
/* CCall */
/* ************************************************************ */
/* call *MEM<q>{CArg}[(_applyFFTempFun)+(0x0)] */
        call *(_applyFFTempFun+0x0)(%rip)
/* ************************************************************ */
/* XmmUnreserve: registers: */
/* ************************************************************ */
/* Unreserve: registers: %rcx %rdx %r8 %r9 */
/* ************************************************************ */
/* Force: commit_memlocs: commit_classes: remove_memlocs: remove_classes: dead_memlocs: dead_classes: GCStateVolatile GCStateHold GCState Globals Stack Heap */
/* ************************************************************ */
/* Return: [(%eax,MEM<l>{StaticTemp}[(_cReturnTemp)+(0x0)])] */
/* ************************************************************ */
/* addq $0x20,MEM<q>{StaticNonTemp}[(_c_stackP)+(0x0)] */
        addq $0x20,%rsp
/* ************************************************************ */
/* Unreserve: registers: %rsp */
/* ************************************************************ */
/* Cache: caches: MEM<q>{GCStateHold}[((_gcState+0x0))+(0x0)] -> %r12 (reserved) MEM<q>{GCStateHold}[((_gcState+0x10))+(0x0)] -> %rbp (reserved) */
        movq (_gcState+0x0)(%rip),%r12
        movq (_gcState+0x10)(%rip),%rbp
/* ************************************************************ */
/* XmmCache: caches: */
/* ************************************************************ */
/* Cache: caches: MEM<l>{StaticTemp}[(_cReturnTemp)+(0x0)] -> %eax (reserved) */
/* ************************************************************ */
/* Force: commit_memlocs: commit_classes: GCStateVolatile GCState CStatic Globals Stack Heap Code CStack remove_memlocs: remove_classes: dead_memlocs: dead_classes: */
/* ************************************************************ */
/* jmp _L_176133 */
        jmp _L_176133

Re: Crashes with 64-bit native code generator on Windows

On Mon, Nov 30, 2009 at 12:45 PM, David Hansel <hansel@...> wrote:
> See the code below. It should match up with the code I posted before.
> [...]

Yes, it looks like it gets dropped into %r11, but not used from that location.

> Please let me know if you need any more context or other debugging
> [...]

We need to find out when the codegen loses track of the fact that %r11 has applyFFTempFun. Could you compile with "-native-commented 6"? That's the most debugging information that we can get from a precompiled MLton binary. It produces a *lot* of debugging information (in the form of comments in the assembly). Rather than posting to the mailing list, I suggest posting the basic block to http://mlton.org/TemporaryUpload.

-Matthew

Re: Crashes with 64-bit native code generator on Windows

Hello Matthew,

I tried using "-native-commented 6" but (due to the size of the code involved) compilation (in the "outputAssembly" stage) seems to take a VERY long time. I also tried "-native-commented 5" with the same result. A setting of "4" worked much faster and I have uploaded a file hansel-20091130-1.s containing the basic block.

I will try the others again, but the "4" setting produced about 1200 .s output files, and with the "6" setting MLton produced the first (.0.s) output file and then I stopped it after about an hour of not producing any more output. I will try again, but unless the process gets much faster after the second file I don't think we'll get to output file #913 in any reasonable amount of time.

Are there ways to restrict the additional output to specific parts of the code? Unfortunately, I can't cut down the code much without the problem going away.

David

Matthew Fluet wrote:
> On Mon, Nov 30, 2009 at 12:45 PM, David Hansel <hansel@...> wrote:
>
>     See the code below. It should match up with the code I posted before.
>     From what I can tell it does look like MLton puts the target address for
>     applyFFTempFun into a register but then later does the indirect call via
>     the memory location.
>
> Yes, it looks like it gets dropped into %r11, but not used from that
> location.
>
>     Please let me know if you need any more context or other debugging
>     information. It does seem like you are on the right track.
>
> We need to find out when the codegen loses track of the fact that %r11
> has applyFFTempFun. Could you compile with "-native-commented 6"?
> That's the most debugging information that we can get from a precompiled
> MLton binary. It produces a *lot* of debugging information (in the form
> of comments in the assembly). Rather than posting to the mailing list,
> I suggest posting the basic block to http://mlton.org/TemporaryUpload.
>
> -Matthew

--
----------------------------------------------------------
David Hansel
Chief Technology Officer -- Reactive Systems, Inc.
http://www.reactive-systems.com/
(919) 324-3507 ext. 102 -- hansel@...
OpenPGP (GnuPG) public key file:
http://www.reactive-systems.com/~hansel/pgp_public_key.txt
----------------------------------------------------------

Re: Crashes with 64-bit native code generator on Windows

On Mon, Nov 30, 2009 at 4:19 PM, David Hansel <hansel@...> wrote:
> I tried using "-native-commented 6" but (due to the size of the code involved)
> [...]

That seems to be enough to provide a hint. I think that the issue is that the function address got placed in %r11, which is a caller-save register. Immediately before the call instruction, the contents of caller-save registers are pushed to memory (for any register whose content is live after the call) and purged from the register allocation. Of course, the function address is still live *at* the call instruction, although it is not live after the call instruction.

Small examples seem to favor %r15 as the register into which the function address is placed, which is not caller-save, and so not susceptible to this issue. It also fits with small changes near the indirect function call eliminating the segfault; such changes alter the liveness and used registers, and presumably the function address gets stored in a non-caller-save register.

If this is indeed the source of the issue, then it is simply a native amd64 codegen bug (and, possibly, a latent x86 codegen bug as well) and is independent of the target OS; that is, it is not mingw specific.
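
The failure mode described above can be illustrated with a toy model in C. This is a minimal sketch under stated assumptions and not MLton's register allocator: every name here (`cached_value`, `purge_before_call`, `call_target`, `simulate`) is hypothetical, and the "fix" flag only shows one way the model stops losing the value; the thread does not describe the actual change. A value is cached in a register while its memory slot still holds 0; purging caller-save registers before the call commits only values that are live after the call, so a call target that is live only *at* the call is dropped and the call reads the stale slot.

```c
#include <assert.h>
#include <stdint.h>

/* One value tracked by the toy register allocator. */
struct cached_value {
    uint64_t memory;        /* backing memory slot (what `call *slot` reads) */
    uint64_t reg;           /* value currently cached in a register */
    int in_register;        /* 1 if reg is valid and memory may be stale */
    int caller_save;        /* 1 if that register is caller-save */
    int live_after_call;    /* 1 if the value is needed after the call */
};

/* Purge caller-save registers right before a call instruction.
 * Values live after the call are committed to memory first; others are
 * simply dropped. treat_as_live_at_call models also committing values
 * that the call instruction itself still needs. */
static void purge_before_call(struct cached_value *v, int treat_as_live_at_call)
{
    if (!v->in_register || !v->caller_save)
        return;                 /* callee-save registers are left alone */
    if (v->live_after_call || treat_as_live_at_call)
        v->memory = v->reg;     /* commit before dropping the register */
    v->in_register = 0;
}

/* Address the indirect call ends up using. */
static uint64_t call_target(const struct cached_value *v)
{
    return v->in_register ? v->reg : v->memory;
}

/* Run the scenario: function address 0x1234 sits in a register, the
 * memory slot is still 0, and the address is dead after the call. */
static uint64_t simulate(int caller_save, int treat_as_live_at_call)
{
    struct cached_value fn = { 0, 0x1234, 1, caller_save, 0 };
    purge_before_call(&fn, treat_as_live_at_call);
    return call_target(&fn);
}
```

In this model `simulate(1, 0)` corresponds to the %r11 case and yields a target of 0, `simulate(0, 0)` corresponds to the %r15 case and keeps the address, and `simulate(1, 1)` keeps the address because the value is committed before the register is purged.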

Re: Crashes with 64-bit native code generator on Windows

Hi,

Just in case someone comes across this thread in the future: Matthew Fluet fixed the problem that was causing these crashes in r7368.

Many thanks,

David

Matthew Fluet wrote:
> On Mon, Nov 30, 2009 at 4:19 PM, David Hansel <hansel@...> wrote:
>
>     I tried using "-native-commented 6" but (due to the size of the code
>     involved) compilation (in the "outputAssembly" stage) seems to take a
>     VERY long time. I also tried "-native-commented 5" with the same
>     result. A setting of "4" worked much faster and I have uploaded a
>     file hansel-20091130-1.s containing the basic block.
>
> That seems to be enough to provide a hint. I think that the issue is
> that the function address got placed in %r11, which is a caller save
> register. The contents of caller save registers are pushed to memory
> immediately before the call instruction, for any register whose content
> is live after the call and purged from the register allocation. Of
> course, the function address is still live *at* the call instruction,
> although it is not live after the call instruction. Small examples seem
> to favor %r15 as the register into which the function address is placed,
> which is not caller save, and so not susceptible to this issue. It also
> fits with small changes near the indirect function call eliminating the
> segfault; such changes alter the liveness and used registers and
> presumably the function address get stored in a non-caller save
> register. If this is indeed the source of the issue, then it is simply
> a native amd64 codegen bug (and, possibly, a latent x86 codegen bug as
> well) and is independent of the target OS; that is, it is not mingw
> specific.