bjam bug invoking codeworker

by Fabien Chêne

Hello,

I have hit a bug when invoking codeworker from an 'action': bjam
mysteriously hangs.
It used to work with an older version (M11, Boost.Jam 3.1.14), but it
no longer works, at least on svn trunk.

I have attached a reduced testcase that demonstrates the issue. It is
reproducible at least on x86_64-unknown-linux-gnu with the gcc
toolset.

The 'codeworker' executable is available at
http://codeworker.free.fr (Download section); it has to be compiled
from source.

Thanks in advance for any help.

--
Fab


_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost-build

Attachment: bug-bjam-codeworker.tar.bz2 (558 bytes)

Re: bjam bug invoking codeworker

by Fabien Chêne

2009/10/8 Fabien CHÊNE <fabien.chene@...>:
> Hello,
>
> I have hit a bug when invoking codeworker from an 'action': bjam
> mysteriously hangs.
> It used to work with an older version (M11, Boost.Jam 3.1.14), but it
> no longer works, at least on svn trunk.

I am trying to debug bjam, but I can't build a debug version of it:
build.sh --debug seems to be broken. It asks for a Jambase file that is
no longer present in the build directory:

./build.sh --debug
###
### Using 'gcc' toolset.
###
rm -rf bootstrap
mkdir bootstrap
gcc -o bootstrap/jam0 command.c compile.c debug.c expand.c glob.c
hash.c hdrmacro.c headers.c jam.c jambase.c jamgram.c lists.c make.c
make1.c newstr.c option.c output.c parse.c pathunix.c pathvms.c
regexp.c rules.c scan.c search.c subst.c timestamp.c variable.c
modules.c strings.c filesys.c builtins.c pwd.c class.c native.c md5.c
w32_getreg.c modules/set.c modules/path.c modules/regex.c
modules/property-set.c modules/sequence.c modules/order.c execunix.c
fileunix.c
./bootstrap/jam0 -f build.jam --toolset=gcc --toolset-root= clean
...found 1 target...
...updating 1 target...
[DELETE] clean
...updated 1 target...
./bootstrap/jam0 -f build.jam --toolset=gcc --toolset-root= --debug
don't know how to make Jambase
...found 54 targets...
...updating 2 targets...
...can't find 1 target...
...skipped jambase.c for lack of Jambase...
...skipped bjam for lack of jambase.c...
...skipped jam for lack of bjam...
...skipped 3 targets...

Any help?

--
Fab

Re: bjam bug invoking codeworker

by Vladimir Prus

On Thursday 22 October 2009 Fabien CHÊNE wrote:

> I am trying to debug bjam, but I can't build a debug version of it:
> build.sh --debug seems to be broken. It asks for a Jambase file that is
> no longer present in the build directory:

What system are you on? On Linux, using SVN HEAD, the debug build
works just fine. Could you have an incomplete checkout, or have
passed the source through Windows, which might have broken the
capitalization of the file name and converted Jambase into jambase?

- Volodya



Re: bjam bug invoking codeworker

by Fabien Chêne

> What system are you on? On Linux, using SVN HEAD, the debug build
> works just fine. Could you have an incomplete checkout, or have
> passed the source through Windows, which might have broken the
> capitalization of the file name and converted Jambase into jambase?

Today I was able to build the debug version... I don't know what was wrong.
I then captured a backtrace for the initial problem:

#0  0x00a5c7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x00b3aa0d in ___newselect_nocancel () from /lib/tls/libc.so.6
#2  0x0805ee19 in exec_wait () at execunix.c:470
#3  0x0805e8ff in exec_cmd (
    string=0xa10a0f8 "\n
/continuus/fc01260/test/CW/CodeWorker4_5_1/codeworker dummy.cws
bin/gcc-3.4.6/debug/toto.gen -nologo ;\n",
    func=0x80530a7 <make_closure>, closure=0x9fac850, shell=0x0,
    action=0x9e5ecc0 "Jamfile</continuus/fc01260/test>.add-out-suffix",
    target=0x9d0f2d0 "bin/gcc-3.4.6/debug/toto.gen") at execunix.c:328
#4  0x08052a3e in make1c (pState=0x9c92d50) at make1.c:532
#5  0x08052392 in make1 (t=0x9facb10) at make1.c:229
#6  0x08050ffe in make (n_targets=1, targets=0x9d0e6b8, anyhow=0) at make.c:167
#7  0x0804f577 in main (argc=0, argv=0xbff8f0f8,
arg_environ=0xbff8f0fc) at jam.c:515

bjam is stuck at line 470 of execunix.c:

455 if ( 0 < globs.timeout )
456 {
457     /* Force select() to timeout so we can terminate expired processes.
458      */
459     tv.tv_sec = select_timeout;
460     tv.tv_usec = 0;
461
462     /* select() will wait until: i/o on a descriptor, a signal, or we
463      * time out.
464      */
465     ret = select( fd_max + 1, &fds, 0, 0, &tv );
466 }
467 else
468 {
469     /* select() will wait until i/o on a descriptor or a signal. */
470     ret = select( fd_max + 1, &fds, 0, 0, 0 );
471 }
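
To make the difference between the two branches concrete, here is a minimal standalone sketch, not bjam's code. The helper name wait_readable and its millisecond parameter are hypothetical. With a timeout, select() returns 0 once the time elapses; with a null timeout it blocks until a descriptor becomes readable or a signal arrives, which matches a hang at line 470 when the child never writes and never exits.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait until fd is readable.
 * timeout_ms >= 0: give up after that many milliseconds (returns 0).
 * timeout_ms <  0: block indefinitely, like the else branch at line 470.
 * Returns 1 if fd is readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    fd_set fds;
    struct timeval tv;

    FD_ZERO(&fds);
    FD_SET(fd, &fds);

    if (timeout_ms < 0)
        return select(fd + 1, &fds, NULL, NULL, NULL);

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    return select(fd + 1, &fds, NULL, NULL, &tv);
}
```

Calling wait_readable(fd, -1) on the read end of a pipe whose writer is stopped would never return, which is the situation the backtrace above suggests.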

Any ideas?
Could it be a bug in CodeWorker?

--
Fab

Re: bjam bug invoking codeworker

by Vladimir Prus

On Monday 26 October 2009 Fabien CHÊNE wrote:

> Any ideas?
> Could it be a bug in CodeWorker?

Not clear yet. When the hang happens, is there still a codeworker process?
Can you attach to it with gdb and get a backtrace? Please check whether it has
multiple threads, using 'info threads'; if so, use "thread apply all bt"
to get the backtrace instead of the regular "bt".

I see two possible reasons:

        - somehow, bjam fails to drain the stdout or stderr of the spawned process,
        so the process is stuck writing to a pipe that has no free space
        - bjam misses the process exit
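
For illustration, the following sketch shows what avoiding both failure modes can look like in a parent process. It is not bjam's implementation: run_and_drain and noisy_child are hypothetical names, and the 200 KiB figure is only chosen to exceed a typical pipe buffer. The parent watches both pipes with select(), so the child never blocks on a full pipe, and it reaps the child with waitpid() once both pipes reach end of file.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical noisy child: writes 200 KiB to stdout and to stderr. */
void noisy_child(void)
{
    char chunk[1024] = {0};
    for (int i = 0; i < 200; i++) {
        if (write(STDOUT_FILENO, chunk, sizeof chunk) != (ssize_t)sizeof chunk) _exit(1);
        if (write(STDERR_FILENO, chunk, sizeof chunk) != (ssize_t)sizeof chunk) _exit(1);
    }
}

/* Hypothetical helper: run fn in a child, drain both output pipes, reap it.
 * Returns the child's exit code, or -1 on failure. Byte counts end up in
 * totals[0] (stdout) and totals[1] (stderr). */
int run_and_drain(void (*fn)(void), long totals[2])
{
    int out[2], err[2], status = 0, open_fds = 2;
    char buf[4096];

    if (pipe(out) != 0) return -1;
    if (pipe(err) != 0) { close(out[0]); close(out[1]); return -1; }

    pid_t pid = fork();
    if (pid < 0) {
        close(out[0]); close(out[1]); close(err[0]); close(err[1]);
        return -1;
    }
    if (pid == 0) {                       /* child: redirect, run, exit */
        dup2(out[1], STDOUT_FILENO);
        dup2(err[1], STDERR_FILENO);
        close(out[0]); close(out[1]); close(err[0]); close(err[1]);
        fn();
        _exit(0);
    }

    /* Parent: close the write ends, otherwise EOF would never arrive. */
    close(out[1]); close(err[1]);
    int fds[2] = { out[0], err[0] };
    totals[0] = totals[1] = 0;

    while (open_fds > 0) {
        fd_set rd;
        int fd_max = -1;
        FD_ZERO(&rd);
        for (int i = 0; i < 2; i++)
            if (fds[i] >= 0) {
                FD_SET(fds[i], &rd);
                if (fds[i] > fd_max) fd_max = fds[i];
            }
        if (select(fd_max + 1, &rd, NULL, NULL, NULL) < 0) {
            if (errno == EINTR) continue;
            break;
        }
        for (int i = 0; i < 2; i++) {
            if (fds[i] < 0 || !FD_ISSET(fds[i], &rd)) continue;
            ssize_t n = read(fds[i], buf, sizeof buf);
            if (n > 0) { totals[i] += n; continue; }
            if (n < 0 && errno == EINTR) continue;
            close(fds[i]);                /* EOF or error: stop watching */
            fds[i] = -1;
            open_fds--;
        }
    }
    for (int i = 0; i < 2; i++)
        if (fds[i] >= 0) close(fds[i]);

    while (waitpid(pid, &status, 0) < 0)
        if (errno != EINTR) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A parent that read only one of the two pipes would leave this child blocked in write() on the other one as soon as that pipe filled up.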

- Volodya





Re: bjam bug invoking codeworker

by Fabien Chêne

[...]

>> Any ideas?
>> Could it be a bug in CodeWorker?
>
> Not clear yet. When the hang happens, is there still a codeworker process?

Definitely.

> Can you attach to it with gdb and get a backtrace? Please check whether it has
> multiple threads, using 'info threads'; if so, use "thread apply all bt"
> to get the backtrace instead of the regular "bt".

There is only one thread in codeworker, stuck in a tcsetattr call:

(gdb) bt
#0  0x002ca7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x00bbf400 in tcsetattr () from /lib/tls/libc.so.6
#2  0x0808d4bf in initKeyboard () at CGRuntime.cpp:115
#3  0x082258ca in Workspace (this=0xbff323d0) at Workspace.cpp:58
#4  0x0808e595 in CodeWorker::CGRuntime::entryPoint (iNargs=4,
    tsArgs=0xbff32564, executeFunction=0x0) at CGRuntime.cpp:383
#5  0x08241c98 in main ()
(gdb) f 2
#2  0x0808d4bf in initKeyboard () at CGRuntime.cpp:115
115 tcsetattr(0, TCSANOW, &new_settings);
(gdb) l
110 new_settings.c_lflag &= ~(ICANON | ECHO);
111 new_settings.c_iflag &= ~(ISTRIP | IGNCR | ICRNL | INLCR | IXOFF | IXON);
112 new_settings.c_cc[VMIN] = 0;
113 new_settings.c_cc[VTIME] = 0;
114 #ifndef CODEWORKER_GNU_READLINE
115 tcsetattr(0, TCSANOW, &new_settings);
116 #endif
117 signal(SIGINT, catch_sig);
118 signal(SIGHUP, catch_sig);
119 signal(SIGTERM, catch_sig);

(gdb) set print pretty on
(gdb) p new_settings
$2 = {
  c_iflag = 0,
  c_oflag = 1,
  c_cflag = 191,
  c_lflag = 35377,
  c_line = 0 '\0',
  c_cc = "\003\034\000\000\004\000\000\000\021\023\032\000\022\017\027\026",
'\0' <repeats 15 times>,
  c_ispeed = 15,
  c_ospeed = 15
}

--
Fab

Re: bjam bug invoking codeworker

by Vladimir Prus

On Monday 26 October 2009 Fabien CHÊNE wrote:

> There is only one thread in codeworker, stuck in a tcsetattr call:

If it's really stuck there, i.e. it is still there after "continue" followed
by Ctrl-C in GDB, then it seems that tcsetattr hangs for some reason when
stdin is not a terminal. And when run by bjam, stdin is naturally not a
terminal. I don't immediately know why this should be a problem; I'll check.
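
As a point of reference, POSIX specifies that tcsetattr() fails with ENOTTY when the descriptor is not a terminal, and that calling it from a background process group on the controlling terminal sends SIGTTOU, whose default action stops the process. Which of the two applies when codeworker runs under bjam is not established in this thread. A defensive variant of the keyboard setup, sketched below under those assumptions, skips the call in both cases. The function name init_keyboard_if_tty is hypothetical and this is not CodeWorker's code.

```c
#include <termios.h>
#include <unistd.h>

/* Hypothetical guard around the terminal setup.
 * Returns 0 if the settings were applied, -1 if skipped or failed. */
int init_keyboard_if_tty(int fd)
{
    struct termios settings;

    if (!isatty(fd))
        return -1;   /* pipe, file or /dev/null: nothing to configure */
    if (tcgetpgrp(fd) != getpgrp())
        return -1;   /* not in the foreground group: tcsetattr could raise SIGTTOU */
    if (tcgetattr(fd, &settings) != 0)
        return -1;

    /* Same kind of change as in the quoted initKeyboard() listing. */
    settings.c_lflag &= ~(ICANON | ECHO);
    settings.c_cc[VMIN] = 0;
    settings.c_cc[VTIME] = 0;
    return tcsetattr(fd, TCSANOW, &settings);
}
```

Called as init_keyboard_if_tty(0), this would leave the terminal untouched whenever stdin is redirected or the process is not in the terminal's foreground process group.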

- Volodya

Re: bjam bug invoking codeworker

by Fabien Chêne

2009/10/26 Vladimir Prus <ghost@...>:

> If it's really stuck there, e.g. it remains there after "continue + Ctrl-C" in
> GDB,

Confirmed: it is still waiting in tcsetattr after "continue" and Ctrl-C.
CodeWorker can be compiled with the macro CODEWORKER_GNU_READLINE defined, which
skips the tcsetattr call; with that, everything works fine.

> it seems that tcsetattr for some reason hangs if stdin is not a terminal.
> And when run by bjam, it's naturally not a terminal. I don't immediately
> know why this should be a problem, I'll check.

Thanks!

--
Fab