Problem with the setitimer() emulation for MinGW

Problem with the setitimer() emulation for MinGW

by Nicolas Bertolotti-2

Hello,

I am facing strange crashes of the MinGW version of my application when using the TimeLimit structure which relies on the setitimer implementation (MLton compiler based on the current SVN sources).

When I run a function using TimeLimit, the process actually crashes about 10 seconds after the timer is initialized, before the TimeLimit call completes (even if the function I run through TimeLimit requires much less than 10 seconds to complete).

I noticed that the issue disappears if I comment out the SetThreadPriority() call in the fixPriority() function (file runtime/platform/mingw.c) and call:

SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_BELOW_NORMAL);

instead from the main thread.

I can’t find a reason why the system would let us change the priority of the timer thread (the function returns TRUE) but behave strangely after that, though it definitely seems to be the case.

Any ideas? (This is not a blocking issue.)

Thanks

Nicolas

 

_______________________________________________
MLton mailing list
MLton@...
http://mlton.org/mailman/listinfo/mlton

Re: Problem with the setitimer() emulation for MinGW

by Wesley W. Terpstra

On Thu, Aug 27, 2009 at 5:06 PM, Nicolas Bertolotti <Nicolas.Bertolotti@...> wrote:
> I am facing strange crashes of the MinGW version of my application when using the TimeLimit structure

What is the "TimeLimit" structure? MLton exposes access to this functionality only via the MLton.Itimer structure.

> which relies on the setitimer implementation (MLton compiler based on the current SVN sources).

Can you provide a small demonstration program? Which version of Windows?

> When I run a function using TimeLimit, the process actually crashes about 10 seconds after the timer is initialized, before the TimeLimit call completes (even if the function I run through TimeLimit requires much less than 10 seconds to complete).

Is 10 seconds a special value in your program?

> I noticed that the issue disappears if I comment out the SetThreadPriority() call in the fixPriority() function (file runtime/platform/mingw.c) and call:
>
> SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_BELOW_NORMAL);
>
> instead from the main thread.

Well, removing the line completely will just decrease the responsiveness of the timers. Setting the main thread to higher priority will decrease responsiveness even further.

Is it possible that this decreased responsiveness masks some phenomenon in your program which relates to this awfully special sounding 10s? A race condition perhaps?

> I can’t find a reason why the system would let us change the priority of the timer thread (the function returns TRUE) but behave strangely after that, though it definitely seems to be the case.

I'm unconvinced that Windows would have a bug like this.

I get the vague impression that you might be using the MLton implementation of setitimer directly... If that is the case, it probably is also the source of your problems. The "signal handler" run by setitimer does not run in the main thread, greatly restricting what one can safely do. MLton itself only uses the handler to set a flag. I suggest you do the same.



RE: Problem with the setitimer() emulation for MinGW

by Nicolas Bertolotti-2

TimeLimit is included in smlnj-lib/Util. It uses the structure Engine (same location), which itself relies on MLton.Itimer.

10 seconds means nothing for my program. The time limit I set is 30 minutes.

The computation we perform in the application is pure mathematics (no system calls except to read or write some files on disk, no dependency on any particular FFI functions, …). There is no direct use of the setitimer() function, and all the system calls are performed internally by MLton.

Please note that I also could not manage to extract a sample program that reproduces the issue (I experienced it using Windows XP) … it would have been too easy.

 

From: Wesley W. Terpstra [mailto:wesley@...]
Sent: Monday, August 31, 2009 12:39 AM
To: Nicolas Bertolotti
Cc: mlton@...
Subject: Re: [MLton] Problem with the setitimer() emulation for MinGW

 


Re: Problem with the setitimer() emulation for MinGW

by Wesley W. Terpstra

On Mon, Aug 31, 2009 at 10:34 AM, Nicolas Bertolotti <Nicolas.Bertolotti@...> wrote:

> TimeLimit is included in smlnj-lib/Util. It uses the structure Engine (same location), which itself relies on MLton.Itimer.

Ahh, I see. I have never used smlnj-lib myself.

> 10 seconds means nothing for my program. The time limit I set is 30 minutes.

Is the time it takes to fail the same on computers with different processing speeds? I.e., on a computer twice as slow, will it crash in 20 seconds or still in 10 seconds?

> The computation we perform in the application is pure mathematics

This is very odd. Does it reproducibly fail after 10s each time? What is the failure message, anyway?

One way to debug the problem might be to watch which system calls the program makes right before it dies. There are a few tools for Windows that work like 'strace' on Unix. I've found them useful in debugging problems like this in my own programs, but I don't recall the name of the one I used most recently. I think it may have been straceNT [1].

> Please note that I also could not manage to extract a sample program that reproduces the issue (I experienced it using Windows XP) … it would have been too easy.

If you could try gradually removing parts of your program until the problem disappears, we might find what the interaction is.

[1] <http://www.intellectualheaven.com/default.asp?BH=projects&H=strace.htm>
