apr 1.4: testpoll crash on OSX 10.6

"./tests/testall testpoll" segfaults for me consistently on OSX 10.6.1
with the latest code from the 1.4-stable branch (64-bit APR library).
gdb info:

#0 0x000000010000e9b7 in send0_pollset (tc=0x7fff5fbfef80, data=0x0) at testpoll.c:389
389 ABTS_PTR_EQUAL(tc, s[0], descs[0].desc.s);
(gdb) bt
#0 0x000000010000e9b7 in send0_pollset (tc=0x7fff5fbfef80, data=0x0) at testpoll.c:389
#1 0x0000000100001456 in abts_run_test (ts=0x100200190, f=0x10000e925 <send0_pollset>, value=0x0) at abts.c:168
#2 0x000000010000f713 in testpoll (suite=0x100200190) at testpoll.c:685
#3 0x0000000100001e35 in main (argc=2, argv=0x7fff5fbff020) at abts.c:424
(gdb) p descs
$1 = (const apr_pollfd_t *) 0x0
(gdb) p s[0]
$2 = (apr_socket_t *) 0x100804240
(gdb) l
384 rv = apr_pollset_poll(pollset, 0, &num, &descs);
385 ABTS_INT_EQUAL(tc, APR_SUCCESS, rv);
386 ABTS_INT_EQUAL(tc, 1, num);
387 ABTS_PTR_NOTNULL(tc, descs);
388
389 ABTS_PTR_EQUAL(tc, s[0], descs[0].desc.s);
390 ABTS_PTR_EQUAL(tc, s[0], descs[0].client_data);
391 }
392
393 static void recv0_pollset(abts_case *tc, void *data)

It crashes about 50% of the time, both with and without threads
enabled. Can anyone reproduce this?

Neil
|
|
Re: apr 1.4: testpoll crash on OSX 10.6

What about 32 bit?
I will try to reproduce...

On Oct 14, 2009, at 1:02 PM, Neil Conway wrote:
> "./tests/testall testpoll" segfaults for me consistently on OSX 10.6.1
> with the latest code from the 1.4-stable branch (64-bit APR library).
> [...]
> It crashes about 50% of the time, both with and without threads
> enabled. Can anyone reproduce this?
|
|
Re: apr 1.4: testpoll crash on OSX 10.6

On Wed, Oct 14, 2009 at 12:02 PM, Neil Conway <nrc@...> wrote:
> "./tests/testall testpoll" segfaults for me consistently on OSX 10.6.1
> with the latest code from the 1.4-stable branch (64-bit APR library).
> [...]
> It crashes about 50% of the time, both with and without threads
> enabled. Can anyone reproduce this?

I've been able to reproduce the crash on my 64-bit OSX 10.6.1 mini and
also on a FreeBSD 6.4 amd64 machine. The following backtrace is from the
FreeBSD box. The problem seems to be intermittent, confirming Neil's
observation.

#0 0x000000000041240e in send0_pollset (tc=0x7fffffffdfd0, data=0x59b1c0) at testpoll.c:389
389 ABTS_PTR_EQUAL(tc, s[0], descs[0].desc.s);
[New LWP 100273]
(gdb) bt
#0 0x000000000041240e in send0_pollset (tc=0x7fffffffdfd0, data=0x59b1c0) at testpoll.c:389
#1 0x0000000000406771 in abts_run_test (ts=0x589720, f=0x412380 <send0_pollset>, value=0x0) at ./abts.c:168
#2 0x0000000000413006 in testpoll (suite=0x596060) at testpoll.c:685
#3 0x0000000000407095 in main (argc=5371472, argv=0x7fffffffe098) at ./abts.c:424

Regards,
Ryan
|
|
Re: apr 1.4: testpoll crash on OSX 10.6

On 10/17/2009 05:50 AM, Ryan Phillips wrote:
> On Wed, Oct 14, 2009 at 12:02 PM, Neil Conway <nrc@...> wrote:
>> [...]
>> (gdb) p descs
>> $1 = (const apr_pollfd_t *) 0x0
>> (gdb) p s[0]
>> $2 = (apr_socket_t *) 0x100804240

What is the value of num?

>> (gdb) l
>> 384 rv = apr_pollset_poll(pollset, 0, &num, &descs);
>> 385 ABTS_INT_EQUAL(tc, APR_SUCCESS, rv);
>> 386 ABTS_INT_EQUAL(tc, 1, num);
>> 387 ABTS_PTR_NOTNULL(tc, descs);
>> 388
>> 389 ABTS_PTR_EQUAL(tc, s[0], descs[0].desc.s);
>> 390 ABTS_PTR_EQUAL(tc, s[0], descs[0].client_data);
>> 391 }

Regards

Rüdiger
|
|
Re: apr 1.4: testpoll crash on OSX 10.6

On Sat, Oct 17, 2009 at 2:40 AM, Ruediger Pluem <rpluem@...> wrote:
> What is the value of num?

Num on the FreeBSD machine is 0.

(gdb) f 0
#0 0x000000000041240e in send0_pollset (tc=0x7fffffffdfd0, data=0x59b1c0) at testpoll.c:389
389 ABTS_PTR_EQUAL(tc, s[0], descs[0].desc.s);
(gdb) l
384 rv = apr_pollset_poll(pollset, 0, &num, &descs);
385 ABTS_INT_EQUAL(tc, APR_SUCCESS, rv);
386 ABTS_INT_EQUAL(tc, 1, num);
387 ABTS_PTR_NOTNULL(tc, descs);
388
389 ABTS_PTR_EQUAL(tc, s[0], descs[0].desc.s);
390 ABTS_PTR_EQUAL(tc, s[0], descs[0].client_data);
391 }
392
393 static void recv0_pollset(abts_case *tc, void *data)
(gdb) p rv
$10 = 0
(gdb) p num
$11 = 0
(gdb) p descs
$12 = (const apr_pollfd_t *) 0x0

Regards,
Ryan
|
|
Re: apr 1.4: testpoll crash on OSX 10.6

On 10/17/2009 11:58 PM, Ryan Phillips wrote:
> Num on the freebsd machine is 0.

Thanks for that.

I guess we have two problems here:

1. The crash: We simply should not execute lines 389 and 390 if descs is NULL.
   Similar situations occur in various other parts of the test suite: we call
   ABTS_PTR_NOTNULL and then carry on using the pointer that just failed the
   check. So does this need to be fixed everywhere it occurs? I guess a crash
   of the test program just because ABTS_PTR_NOTNULL failed is not acceptable.

2. If descs is NULL, the test has failed, since we have the ABTS_PTR_NOTNULL
   check in line 387. The question is: why does this test fail?

Regards

Rüdiger
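The first point can be illustrated with a minimal sketch. It is self-contained rather than built on the real test framework: `demo_case`, `demo_pollfd`, `demo_ptr_notnull`, `demo_ptr_equal` and `check_first_desc` are hypothetical stand-ins, and the only assumption carried over from the thread is that a failed not-NULL check records a failure but does not stop the test function by itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the test framework; this is not the ABTS API. */
typedef struct {
    int failed;               /* set to 1 once any check in the case fails */
} demo_case;

typedef struct {
    const void *sock;         /* plays the role of descs[i].desc.s */
    const void *client_data;  /* plays the role of descs[i].client_data */
} demo_pollfd;

/* Record a failure when ptr is NULL; return 1 only if ptr is safe to use. */
static int demo_ptr_notnull(demo_case *tc, const void *ptr)
{
    assert(tc != NULL);
    if (ptr == NULL) {
        tc->failed = 1;
        return 0;
    }
    return 1;
}

/* Record a failure when the two pointers differ. */
static void demo_ptr_equal(demo_case *tc, const void *expected,
                           const void *actual)
{
    assert(tc != NULL);
    if (expected != actual) {
        tc->failed = 1;
    }
}

/* Same shape as lines 387-390 of the listing, with an early return added. */
void check_first_desc(demo_case *tc, const void *expected_sock,
                      const demo_pollfd *descs)
{
    if (!demo_ptr_notnull(tc, descs)) {
        return;  /* failure already recorded; never dereference a NULL result */
    }
    demo_ptr_equal(tc, expected_sock, descs[0].sock);
    demo_ptr_equal(tc, expected_sock, descs[0].client_data);
}
```

With a guard like this, a poll that returns no descriptors shows up as one failed test case instead of a segfault that takes down the rest of the run.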
|
|
Re: apr 1.4: testpoll crash on OSX 10.6

Hi,

Ruediger Pluem wrote:
> 1. The crash: We simply should not execute the lines 389 and 390 if descs is NULL.
> Similar situations occur in various other parts of the test suite.
> We use ABTS_PTR_NOTNULL and continue afterwards and continue to use the pointer
> that failed ABTS_PTR_NOTNULL. So does this need to be fixed everywhere where this
> occurs? I guess a crash of the test program just because ABTS_PTR_NOTNULL failed
> is not acceptable.

Yes, this needs fixing. I was already on a similar track, since some more
tests fail on NetWare and I then get crashes. I believe the reason nobody
else has found this yet is simply that all tests usually succeed on Linux.

BTW, I see the same with testpoll on 32-bit NetWare: sometimes it runs, and
on the next round it fails.

Finally, I think we also need to block some more things within the test
suite where we can determine from apr.h / apu.h whether the feature to be
tested is supported by the running platform at all. Those cases should then
be marked not as 'fail' but as 'skipped' or 'unsupported'.

Gün.
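The 'skipped' versus 'fail' distinction might look roughly like the sketch below. Everything in it is hypothetical: `DEMO_HAS_FEATURE` stands in for a 0/1 feature macro of the kind a platform header defines, and `demo_result`, `run_feature_test` and the two sample bodies are illustrative names, not part of the APR test suite.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 0/1 feature switch, standing in for a platform header macro. */
#ifndef DEMO_HAS_FEATURE
#define DEMO_HAS_FEATURE 0
#endif

typedef enum { DEMO_PASS, DEMO_FAIL, DEMO_SKIPPED } demo_result;

/* Run body only when the feature exists; otherwise report a skip, not a fail. */
demo_result run_feature_test(int feature_available, int (*body)(void))
{
    if (!feature_available) {
        return DEMO_SKIPPED;   /* unsupported platform: nothing was tested */
    }
    assert(body != NULL);
    return body() ? DEMO_PASS : DEMO_FAIL;
}

/* Two sample test bodies: nonzero means the checks inside succeeded. */
int demo_body_ok(void)     { return 1; }
int demo_body_broken(void) { return 0; }

/* Human-readable label for a report line. */
const char *demo_result_name(demo_result r)
{
    switch (r) {
    case DEMO_PASS:    return "pass";
    case DEMO_FAIL:    return "fail";
    case DEMO_SKIPPED: return "skipped";
    }
    return "unknown";
}
```

A call such as `run_feature_test(DEMO_HAS_FEATURE, demo_body_ok)` keeps the platform decision at the call site, so a report can count skipped tests separately from real failures.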
|
|
Re: apr 1.4: testpoll crash on OSX 10.6

On a related note, ISTM that many of the tests for the poll / pollset
features are wrong in principle. They apparently assume that if you
send a UDP datagram to localhost and then immediately poll() for it
(with a timeout of zero), the poll() will pick up the UDP datagram you
just sent. That is not a safe assumption, however (e.g. I see
intermittent test failures due to this issue when using
APR_POLLSET_POLL on OSX 10.6).

Similarly, send_middle_pollset() assumes that if you send two
datagrams and then poll(), the poll will return exactly two datagrams,
whereas it might actually return 0, 1, or 2. And that's not even
accounting for the possibility of UDP packet drops, which are possible
even on localhost if the machine is under load.

Neil

On Sun, Oct 18, 2009 at 4:37 AM, Ruediger Pluem <rpluem@...> wrote:
> [...]
> 2. If descs is NULL it means that the test failed as we have the ABTS_PTR_NOTNULL
> test in line 387. The question is: Why does this test fail?
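The timing assumption can be made concrete with plain POSIX poll(), outside of APR. This is a sketch under stated assumptions: `wait_readable` and `send_then_wait` are hypothetical helper names, and the code uses poll(2) and send(2) directly rather than any APR call.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Returns 1 if fd becomes readable within timeout_ms, 0 on timeout and
 * -1 on error. timeout_ms == 0 is a one-shot check that never waits;
 * a negative value blocks until something arrives. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc;

    assert(fd >= 0);
    do {
        /* Retried after a signal; note that the timeout restarts in full. */
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0) {
        return -1;
    }
    if (rc == 0) {
        return 0;                       /* nothing arrived in time */
    }
    return (pfd.revents & POLLIN) ? 1 : -1;
}

/* Send one byte on send_fd, then wait up to timeout_ms for data on recv_fd. */
int send_then_wait(int send_fd, int recv_fd, int timeout_ms)
{
    if (send(send_fd, "x", 1, 0) != 1) {
        return -1;
    }
    return wait_readable(recv_fd, timeout_ms);
}
```

For UDP sockets, `send_then_wait(tx, rx, 0)` is the racy form described above, because the datagram may not yet be queued on the receiver when the one-shot check runs. A positive timeout tolerates that delay, and unlike a fully blocking wait it still returns if the datagram is dropped.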
|
|
Re: apr 1.4: testpoll crash on OSX 10.6

Attached is a patch against trunk that fixes this problem by changing
the test suite. For some of the tests, it is sufficient to change the
poll() timeout from 0 (one-time poll) to -1 (blocking poll). In a few
other places, the semantics of the test needed to be changed -- e.g.
if we do a blocking poll() after sending two messages, we need to
account for seeing either 1 or 2 messages.

With these changes applied, OSX 10.6 passes testpoll reliably (for
~1000 local runs), using both the POLLSET_POLL and POLLSET_KQUEUE
methods.

Neil

On Sat, Oct 24, 2009 at 7:30 PM, Neil Conway <nrc@...> wrote:
> On a related note, ISTM that many of the tests for the poll / pollset
> features are wrong in principle.
> [...]
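One way to cope with "either 1 or 2 messages" is to keep polling until the expected number of datagrams has been read. The sketch below is not the attached patch, which is not reproduced in the thread; it is a hypothetical POSIX helper, `collect_datagrams`, that shows the general idea with a bounded wait per round.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Read up to `expected` datagrams from fd, polling again after each one.
 * per_wait_ms bounds every individual wait, so a dropped datagram ends the
 * loop instead of hanging it. Returns how many datagrams were received. */
int collect_datagrams(int fd, int expected, int per_wait_ms)
{
    char buf[64];
    int received = 0;

    assert(fd >= 0 && expected >= 0);
    while (received < expected) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int rc = poll(&pfd, 1, per_wait_ms);

        if (rc < 0 && errno == EINTR) {
            continue;                   /* interrupted by a signal: retry */
        }
        if (rc <= 0) {
            break;                      /* timeout or error: stop waiting */
        }
        if (!(pfd.revents & POLLIN)) {
            break;                      /* error or hang-up on the socket */
        }
        if (recv(fd, buf, sizeof buf, 0) < 0) {
            break;
        }
        received++;                     /* one datagram consumed per recv() */
    }
    return received;
}
```

A test built this way asserts on the total collected rather than on what a single poll reports, which removes the dependence on how many datagrams happen to be queued at the instant of the first call.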
|
|
Re: apr 1.4: testpoll crash on OSX 10.6

On Sun, Oct 25, 2009 at 1:17 AM, Neil Conway <nrc@...> wrote:
> Attached is a patch against trunk that fixes this problem by changing
> the test suite.
> [...]
> With these changes applied, OSX 10.6 passes testpoll reliably (for
> ~1000 local runs), using both the POLLSET_POLL and POLLSET_KQUEUE
> methods.

I tested this on OSX 10.6.1, Linux 2.6, and FreeBSD 6.2... Works great.

Thanks,
Ryan