Docs on multi-threaded debugging using gdbserver?

by Grant Edwards

Is there any documentation on how to set up for remote
multi-threaded debugging using gdbserver?

Single-threaded debugging works fine on all my targets, but I'm
trying to debug a multi-threaded app remotely using gdbserver
and I'm having zero luck.  I've tried on both a glibc/PPC
target and a uclibc/ARM9 target.

On the PPC target, gdb seems completely unaware of threads.  I
haven't spent much time beating my head against that wall...

On the ARM target, gdb seems vaguely aware that there are
multiple threads, but can do nothing usable with/about them.

Here's a typical session:

    GNU gdb 6.8
    Copyright (C) 2008 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "--host=i386-pc-linux-gnu --target=arm-linux-uclibc".
    [New Thread 860]
    0x40000930 in _start () from /home/grante/processors/atmel/RM9200/buildroot-2009.05/project_build_arm/uclibc/root/lib/ld-uClibc.so.0
    (gdb) break main
    Breakpoint 1 at 0x8bf4: file hello.c, line 45.
    (gdb) c
    Continuing.
   
    Breakpoint 1, main () at hello.c:45
    45        pthread_attr_t attr, *attrp=NULL;
    (gdb) next
    48        pthread_mutex_init(&printfMutex,NULL);
    (gdb)
    50        tprintf("main()\n");
    (gdb)
    52        s = pthread_attr_init(&attr);                            abort_if_error(s);
    (gdb)
    53        s = pthread_attr_setschedpolicy(&attr, SCHED_RR);        abort_if_error(s);
    (gdb)
    54        s = pthread_attr_getschedparam(&attr, &sp);              abort_if_error(s);
    (gdb)
    55        sp.sched_priority = 50;
    (gdb)
    56        s = pthread_attr_setschedparam(&attr, &sp);              abort_if_error(s);
    (gdb)
    60        for (i=0; i<NumPorts; ++i)
    (gdb)
    62            s = pthread_create(tid+i, attrp, thread, (void*)i);  abort_if_error(s);
    (gdb)
   
    Program received signal SIG32, Real-time event 32.
    0x400443d4 in ?? ()
    (gdb)
    Cannot find bounds of current function
    (gdb) next
    Cannot find bounds of current function
    (gdb) c
    Continuing.
   
    Program received signal SIG32, Real-time event 32.
    0x400443d4 in ?? ()
    (gdb) c
    Continuing.
    [New Thread 863]
   
    Program received signal ?, Unknown signal.
    [Switching to Thread 863]
    0x40044328 in ?? ()
    (gdb) info threads
    [New Thread 861]
    [New Thread 862]
      4 Thread 862  0x40065500 in ?? ()
      3 Thread 861  0x40043718 in ?? ()
    * 2 Thread 863  0x40044328 in ?? ()
      1 Thread 860  0x400443d4 in ?? ()
    warning: Couldn't restore frame in current thread, at frame 0
    0x40044328 in ?? ()
    (gdb) info thread
      4 Thread 862  0x40065500 in ?? ()
      3 Thread 861  0x40043718 in ?? ()
    * 2 Thread 863  0x40044328 in ?? ()
      1 Thread 860  0x400443d4 in ?? ()
    warning: Couldn't restore frame in current thread, at frame 0
    0x40044328 in ?? ()
    (gdb)
    (gdb) c
    Continuing.
    warning: Remote failure reply: E01
   
Here's .gdbinit:

    set sysroot /home/grante/processors/atmel/RM9200/buildroot-2009.05/project_build_arm/uclibc/root
    set solib-absolute-prefix /home/grante/processors/atmel/RM9200/buildroot-2009.05/project_build_arm/uclibc/root
    file hello
    target remote 10.0.0.98:12345

   
I've spent hours reading through web postings and mailing-list
threads and have found little helpful info.  What I've found is
that a lot of people have problems with this, and there is a
lot of contradictory and vague information.

I've found advice saying you have to point sysroot to the root
of the target filesystem. I've found advice saying to point
solib-absolute-prefix to the target root.  (I've tried both --
together and individually).  I've found advice saying point
sysroot to the location of the dev-host libraries for the
cross-toolchains.

I've found _lots_ of postings saying things like "your
libraries need to be unstripped".  Unfortunately nobody deigns
to specify _which_ libraries they're talking about on _which_
system (dev-host or target).

I've also found plenty of suggestions like "your
LD_LIBRARY_PATH needs to include the location of
libthread_db.so".  Well, _I_ don't have an LD_LIBRARY_PATH.  I
have two different machines that have them, and I have
libthread_db.so files in at least 4 different locations.

Which host needs LD_LIBRARY_PATH to point to which
libthread_db.so when running what executable?

I've searched the gdb docs, and can't find any info on how remote
multi-threaded debugging works when using gdbserver.

I have found a few people who claim it works for them, but they
seem unable to provide enough details to allow others to
duplicate their success.

--
Grant Edwards                   grante             Yow! I wonder if I ought
                                  at               to tell them about my
                               visi.com            PREVIOUS LIFE as a COMPLETE
                                                   STRANGER?




Re: Docs on multi-threaded debugging using gdbserver?

by Grant Edwards

On 2009-06-19, Grant Edwards <grante@...> wrote:

> Is there any documentation on how to set up for remote
> multi-threaded debugging using gdbserver?

[...]

> Here's a typical session:
>
>     GNU gdb 6.8

[target remote ....]

[...]
>     0x40000930 in _start () from /home/grante/processors/atmel/RM9200/buildroot-2009.05/project_build_arm/uclibc/root/lib/ld-uClibc.so.0
>     (gdb) break main
>     Breakpoint 1 at 0x8bf4: file hello.c, line 45.
>     (gdb) c
>     Continuing.
>    
>     Breakpoint 1, main () at hello.c:45
>     45        pthread_attr_t attr, *attrp=NULL;
>     (gdb) next
[...]
>     62            s = pthread_create(tid+i, attrp, thread, (void*)i);  abort_if_error(s);
>     (gdb) next
>    
>     Program received signal SIG32, Real-time event 32.
>     0x400443d4 in ?? ()
>     (gdb)
>     Cannot find bounds of current function

Using strace, I've verified that:

 1) The dev-host gdb is finding the target-arch .so files
    correctly (I think that can also be deduced from the line
    above that says "in _start () from ..../ld-uClibc.so.0")

 2) The target gdbserver is finding libthread_db.so.

I've also confirmed that the target arch libpthread .so files
are not stripped (either on the dev-host or on the target).

I've tried using both a stripped and unstripped application
binary on the target.  [I always use an unstripped ELF
executable that still has debug symbols on the dev-host as the
target for gdb's "file" command.]

I've got strace output for both gdb and gdbserver, and it
always shows gdbserver responding to a $vCont command with $T4d
immediately before gdb prints the message about SIG32.  The
$vCont command in question was the one that was issued as a
result of the "next" command that was stepping through the call
to pthread_create() shown at source file line 62 above:

  ------------------------------gdb strace------------------------------
  send(5, "$vCont;c#a8"..., 11, 0)        = 11
  select(6, [5], NULL, [5], {1, 0})       = 1 (in [5], left {1, 0})
  recv(5, "+"..., 8192, 0)                = 1
  rt_sigaction(SIGINT, {0x8067350, [INT], SA_RESTART}, {0x80d27f0, [INT], SA_RESTART}, 8) = 0
  select(6, [5], NULL, [5], {1, 0})       = 1 (in [5], left {0, 976000})
  recv(5, "$T4d0b:3ccd8dbe;0d:20cb8dbe;0f:d4"..., 8192, 0) = 54
  send(5, "+"..., 1, 0)                   = 1
  rt_sigaction(SIGINT, {0x80d27f0, [INT], SA_RESTART}, {0x8067350, [INT], SA_RESTART}, 8) = 0
  write(1, "\n"..., 1)                    = 1
  write(1, "Program received signal SIG32, Re"..., 51) = 51
  -----------------------------------------------------------------------
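
For anyone reading along, the packets in the trace above can be checked by
hand.  The following is a minimal Python sketch, not anything taken from gdb
itself: it assumes the usual remote-protocol framing ("$" + payload + "#" +
two-hex-digit checksum, where the checksum is the sum of the payload bytes
modulo 256) and a "T" stop reply made of a two-digit hex signal number
followed by "n:r;" pairs.  The function names are hypothetical, and the
little-endian default is an assumption about this ARM target.  Only the
fields that appear complete in the strace line are used; the truncated
"0f:d4..." field is left out.

```python
# Sketch of remote-protocol packet framing and 'T' stop-reply decoding.
# Helper names are illustrative only; they are not part of gdb or gdbserver.

def rsp_checksum(payload: str) -> str:
    """Two-digit hex checksum: sum of the payload bytes modulo 256."""
    return format(sum(payload.encode("ascii")) % 256, "02x")


def frame(payload: str) -> str:
    """Wrap a payload as it appears on the wire: $payload#checksum."""
    return "$" + payload + "#" + rsp_checksum(payload)


def parse_stop_reply(payload: str, little_endian: bool = True):
    """Parse 'T' + 2 hex digits of signal + 'n:r;' pairs.

    Returns (signal_number, fields).  Keys that parse as hex are treated as
    register numbers and their values are decoded from target byte order
    (assumed little-endian here).  Any other key is kept as a string.
    """
    if not payload.startswith("T") or len(payload) < 3:
        raise ValueError("not a T stop reply")
    signal = int(payload[1:3], 16)
    fields = {}
    for item in payload[3:].split(";"):
        if not item:
            continue
        key, _, value = item.partition(":")
        try:
            regnum = int(key, 16)
        except ValueError:
            fields[key] = value  # non-register field, kept verbatim
            continue
        order = "little" if little_endian else "big"
        fields[regnum] = int.from_bytes(bytes.fromhex(value), order)
    return signal, fields


if __name__ == "__main__":
    print(frame("vCont;c"))
    sig, regs = parse_stop_reply("T4d0b:3ccd8dbe;0d:20cb8dbe;")
    print(hex(sig), {n: hex(v) for n, v in regs.items()})
```

Framing "vCont;c" this way reproduces the "#a8" checksum seen in the trace,
and the stop reply decodes to signal 0x4d (77), which is the value this gdb
reports as SIG32.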


Here's the corresponding section of the gdbserver strace:

  ------------------------------gdbserver strace--------------------------
  read(4, "$vCont;c#a8"..., 4096)         = 11
  write(4, "+"..., 1)                     = 1
  rt_sigaction(SIGIO, {0xc608, [IO], SA_RESTART|0x4000000}, {0x1, [IO], SA_RESTART|0x4000000}, 8) = 0
  ptrace(PTRACE_PEEKTEXT, 791, 0x8920, [0xe28fc600]) = 0
  ptrace(PTRACE_GETREGS, 791, 0, 0x23a08) = 0
  ptrace(PTRACE_SETREGS, 791, 0, 0x23a08) = 0
  ptrace(PTRACE_CONT, 791, 0, SIG_0)      = 0
  --- SIGCHLD (Child exited) @ 0 (0) ---
  wait4(-1, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGTRAP} | 0x30000], WNOHANG, NULL) = 791
  ptrace(0x4201 /* PTRACE_??? */, 791, 0, 0xbefdcc10) = 0
  wait4(792, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSTOP}], __WALL, NULL) = 792
  ptrace(0x4200 /* PTRACE_??? */, 792, 0, 0x8) = 0
  ptrace(PTRACE_CONT, 792, 0, SIG_0)      = 0
  ptrace(PTRACE_CONT, 791, 0, SIG_0)      = 0
  wait4(-1, 0xbefdcc14, WNOHANG, NULL)    = 0
  wait4(-1, 0xbefdcc14, WNOHANG|__WCLONE, NULL) = 0
  nanosleep({0, 1000000}, 0)              = ? ERESTART_RESTARTBLOCK (To be restarted)
  --- SIGCHLD (Child exited) @ 0 (0) ---
  restart_syscall(<... resuming interrupted call ...>) = 0
  wait4(-1, 0xbefdcc14, WNOHANG, NULL)    = 0
  wait4(-1, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSTOP}], WNOHANG|__WCLONE, NULL) = 793
  wait4(-1, 0xbefdcc14, WNOHANG, NULL)    = 0
  wait4(-1, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGTRAP} | 0x30000], WNOHANG|__WCLONE, NULL) = 792
  ptrace(0x4201 /* PTRACE_??? */, 792, 0, 0xbefdcc10) = 0
  ptrace(0x4200 /* PTRACE_??? */, 793, 0, 0x8) = 0
  ptrace(PTRACE_CONT, 793, 0, SIG_0)      = 0
  ptrace(PTRACE_CONT, 792, 0, SIG_0)      = 0
  --- SIGCHLD (Child exited) @ 0 (0) ---
  wait4(-1, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGRTMIN}], WNOHANG, NULL) = 791
  syscall(0x9000ee, 0x318, 0x13, 0, 0x23b48) = 0
  --- SIGCHLD (Child exited) @ 0 (0) ---
  syscall(0x9000ee, 0x319, 0x13, 0, 0)    = 0
  --- SIGCHLD (Child exited) @ 0 (0) ---
  wait4(792, 0xbefdcbf4, WNOHANG, NULL)   = -1 ECHILD (No child processes)
  wait4(792, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSTOP}], WNOHANG|__WCLONE, NULL) = 792
  wait4(793, 0xbefdcbf4, WNOHANG, NULL)   = -1 ECHILD (No child processes)
  wait4(793, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSTOP}], WNOHANG|__WCLONE, NULL) = 793
  ptrace(PTRACE_GETREGS, 791, 0, 0x23c68) = 0
  rt_sigaction(SIGIO, {0x1, [IO], SA_RESTART|0x4000000}, {0xc608, [IO], SA_RESTART|0x4000000}, 8) = 0
  write(4, "$T4d0b:4cfdc6be;0d:30fbc6be;0f:d4"..., 54) = 54
  --- SIGIO (I/O possible) @ 0 (0) ---
  ------------------------------------------------------------------------
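
Several entries in the gdbserver trace above are printed as raw numbers
because this strace does not decode them.  The Python sketch below shows one
way to read them.  The ptrace request, option, and event values are the ones
defined in <linux/ptrace.h>.  Reading syscall 0x9000ee as tkill is an
assumption: it holds only if the target uses the old ARM ABI, where syscall
numbers carry a 0x900000 base and tkill is number 238.  The helper names are
hypothetical.

```python
# Sketch for decoding the undecoded numbers in the gdbserver strace.
# Helper names are illustrative only.

# Values from <linux/ptrace.h>.
PTRACE_REQUESTS = {0x4200: "PTRACE_SETOPTIONS", 0x4201: "PTRACE_GETEVENTMSG"}
PTRACE_O_TRACECLONE = 0x8
PTRACE_EVENT_CLONE = 3

SIGTRAP = 5
SIGSTOP = 19

# Assumption: old ARM ABI (OABI) syscall numbering on this target.
ARM_OABI_SYSCALL_BASE = 0x900000
ARM_NR_TKILL = 238


def decode_wait_status(status: int):
    """Split a stopped-child wait status into (stop_signal, ptrace_event).

    A stopped child has 0x7f in the low byte, the stop signal in bits 8-15,
    and, for ptrace event stops, the event code in bits 16 and up.
    """
    if status & 0xFF != 0x7F:
        raise ValueError("status does not describe a stopped child")
    return (status >> 8) & 0xFF, status >> 16


def describe_raw_syscall(number: int, args):
    """Name a raw syscall number, assuming the OABI base noted above."""
    nr = number - ARM_OABI_SYSCALL_BASE
    if nr == ARM_NR_TKILL:
        return "tkill(tid=%d, sig=%d)" % (args[0], args[1])
    return "syscall %d" % nr


if __name__ == "__main__":
    # "{WIFSTOPPED(s) && WSTOPSIG(s) == SIGTRAP} | 0x30000" from the trace.
    status = ((SIGTRAP << 8) | 0x7F) | 0x30000
    print(decode_wait_status(status))
    print(PTRACE_REQUESTS[0x4200], hex(PTRACE_O_TRACECLONE))
    print(describe_raw_syscall(0x9000EE, [0x318, 0x13]))
```

Read this way, the SIGTRAP stops with 0x30000 set are clone events (event
code 3), the 0x4200 request with argument 0x8 sets the trace-clone option on
each new thread, and, under the OABI assumption, the two raw syscalls are
tkill calls sending signal 19 (SIGSTOP) to threads 792 and 793.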

I'm trying to figure out which end is broken (gdb or
gdbserver), but I can't find any documentation on what is
supposed to happen when a new thread is created.  My guess is
that gdbserver isn't working right, but that's just a hunch.

--
Grant Edwards                   grante             Yow! Hold the MAYO & pass
                                  at               the COSMIC AWARENESS ...
                               visi.com