pthread_join problem


by Stefan Eilemann

Hello,

I am in the situation that a pthread_join does not return, even
though the thread has called pthread_exit.

I read the cleanup notes, but I don't think they apply here.

I am using the C cleanup code. One thread calls pthread_exit,
the other pthread_join. I've verified that the thread calling
pthread_exit does the longjmp back to the thread start code,
which calls _endthreadex.

The main thread, which calls pthread_join, hangs in
WaitForMultipleObjects.

The problem only occurs when I am using some unrelated(?)
external code (the Mellanox SDP Infiniband implementation),
so it could be caused by that, or just be a race appearing
with this code.

There are other pthreads in my application, which terminate
correctly with pthread_exit/pthread_join. Only one thread,
the network receive thread ;), exhibits the problem.

Do you have an idea what could be the cause of this problem?
Anything else I could try to find the problem?


Best Regards,

Stefan.

PS: I've tested the Win64 version, and it works like a charm.
--
http://www.equalizergraphics.com
http://www.linkedin.com/in/eilemann




Re: pthread_join problem

by Ross Johnson-2

Stefan Eilemann wrote:

> Hello,
>
> I am in the situation that a pthread_join does not return, even
> though the thread has called pthread_exit.
>
> I read the cleanup notes, but I think it does not apply here.
>
> I am using the C cleanup code. One thread calls pthread_exit,
> the other pthread_join. I've verified that the thread calling
> pthread_exit does the longjmp to the thread start code, which
> calls _endthreadex.
>
> The main thread calling pthread_join does hang in
> WaitForMultipleObjects.
>
> The problem only occurs when I am using some unrelated(?)
> external code (the Mellanox SDP Infiniband implementation),
> so it could be caused by that, or just be a race appearing
> with this code.
>
> There are other pthreads in my application, which terminate
> correctly with pthread_exit/pthread_join. Only one thread -
> the network receive thread ;)- does exhibit the problem.
>
> Do you have an idea what could be the cause of this problem?
> Anything else I could try to find the problem?

One explanation that seems to fit the evidence assumes that the
external code you mentioned is a DLL with its own DllMain routine
that somehow interferes with pthreads-win32's thread exit cleanup.
That interference would occur after _endthreadex() is called, which
you've verified happens.

I don't know how Win32 determines which DllMain routines are called
and in what order (is it the order in which the DLLs are loaded?), but
pthreads-win32 does rely on this mechanism to do some final cleanup and
status setting for each 'POSIX' thread. If that doesn't get done, I
imagine you could see symptoms like this.

Pthreads-win32's DllMain() calls pthread_win32_thread_detach_np() in
pthread_win32_attach_detach_np.c. To verify that this is happening, you
could set up a thread-specific data key with a destructor routine, have
your problem thread set the key to a non-null value, and then see
whether the destructor routine is called.
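
A minimal sketch of that check in C, using only standard pthreads calls.
The names probe_key, probe_destructor, receive_thread and
run_destructor_check are made up for illustration, and receive_thread
merely stands in for the real network receive thread. The destructor
logs to stderr, so its message (or its absence) is still visible if
pthread_join never returns.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical names, for illustration only. */
static pthread_key_t probe_key;
static int destructor_ran;   /* set by the destructor, read after join */
static char probe_value;     /* its address serves as a non-NULL value */

/* Runs at thread exit if the library's per-thread cleanup is performed. */
static void probe_destructor(void *value)
{
    /* POSIX only invokes destructors for non-NULL values. */
    assert(value != NULL);
    destructor_ran = 1;
    fprintf(stderr, "TSD destructor ran for exiting thread\n");
}

/* Stand-in for the problem thread: set the key, then exit. */
static void *receive_thread(void *arg)
{
    (void)arg;
    pthread_setspecific(probe_key, &probe_value);
    pthread_exit(NULL);
    return NULL; /* not reached */
}

/* Returns 1 if the destructor ran, 0 if it did not, -1 on setup error. */
int run_destructor_check(void)
{
    pthread_t tid;

    destructor_ran = 0;
    if (pthread_key_create(&probe_key, probe_destructor) != 0)
        return -1;
    if (pthread_create(&tid, NULL, receive_thread, NULL) != 0) {
        pthread_key_delete(probe_key);
        return -1;
    }
    /* In the failing case described above, this call is where it hangs. */
    if (pthread_join(tid, NULL) != 0)
        return -1;
    pthread_key_delete(probe_key);
    return destructor_ran;
}
```

In the real application, the pthread_setspecific() call would go at the
start of the existing receive thread rather than into a separate test
thread. If the stderr message never appears before the join hangs, that
would point towards the thread detach cleanup not being reached.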

Regards.
Ross
>
> Best Regards,
>
> Stefan.
>
> PS: I've tested the Win64 version, and it works like a charm.