[ clisp-Bugs-1575811 ] 2.40 sigsegv on sparc

by SourceForge.net

Bugs item #1575811, was opened at 2006-10-12 06:01
Message generated for change (Comment added) made by sds
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=101355&aid=1575811&group_id=1355

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: clisp
Group: segfault
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Peter Van Eynde (pvaneynd)
Assigned to: Bruno Haible (haible)
Summary: 2.40 sigsegv on sparc

Initial Comment:
building 2.40 with:

./configure debian/build --prefix=/usr --fsstnd=debian
--with-dynamic-ffi --with-dynamic-modules
--with-module=bindings/glibc --with-module=clx/new-clx
--with-debug

results in a crash in lisp.run. Using gdb I see:

(gdb) run ./lisp.run -B . -N locale -E 1:1 -Efile UTF-8
-Eterminal UTF-8 -norc -m 1800KW -x "(and (load
\"init.lisp\") (sys::%saveinitmem) (ext::exit))
(ext::exit t)"
Starting program:
/home/pvaneynd/clisp/clisp-2.40.orig/debian/build/lisp.run
./lisp.run -B . -N locale -E 1:1 -Efile UTF-8
-Eterminal UTF-8 -norc -m 1800KW -x "(and (load
\"init.lisp\") (sys::%saveinitmem) (ext::exit))
(ext::exit t)"
STACK depth: 98206

Program received signal SIGSEGV, Segmentation fault.
0x0016effc in hash_lookup_builtin (ht={one_o =
3777369}, obj={one_o = 3783721}, allowgc=false,
KVptr_=0xefffe458, Iptr_=0xefffe454) at hashtabl.d:1610
1610      while (!eq(*Nptr,nix)) { /* track "list" :
"list" finished -> not found */
(gdb) print Nptr
$2 = (gcv_object_t *) 0x43f1c16c
(gdb) print /x ht
$4 = {one_o = 0x39a359}
(gdb) frame
#0  0x0016effc in hash_lookup_builtin (ht={one_o =
3777369}, obj={one_o = 3783721}, allowgc=false,
KVptr_=0xefffe458, Iptr_=0xefffe454) at hashtabl.d:1610
1610      while (!eq(*Nptr,nix)) { /* track "list" :
"list" finished -> not found */
(gdb) backtrace
#0  0x0016effc in hash_lookup_builtin (ht={one_o =
3777369}, obj={one_o = 3783721}, allowgc=false,
KVptr_=0xefffe458, Iptr_=0xefffe454) at hashtabl.d:1610
#1  0x00173690 in gethash (obj=Cannot access memory at
address 0x43f1c16c
) at hashtabl.d:2416
#2  0x00284d54 in register_foreign_variable
(address=0x357100, name_asciz=0x32cd00
"ffi_user_pointer", flags=0, size=4) at foreign.d:185
#3  0x002a0edc in init_ffi () at foreign.d:4422
#4  0x0004675c in main (argc=17, argv=0xefffe6c4) at
spvw.d:3345
(gdb) print flags
$5 = 2 '\002'
(gdb) print hashindex
$6 = 1357776787
(gdb) print kvtable
$7 = {one_o = 3777329}
(gdb) print kvt_data
$8 = (gcv_object_t *) 0x39a348


If there are any gdb commands I can execute to help
debug this problem, just ask.
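
For readers unfamiliar with the loop at hashtabl.d:1610: it follows a chain of entries until it reaches the `nix` end marker, so an index that points outside the table makes the first `*Nptr` read fault. The sketch below is not CLISP code. It is a generic chained-table lookup in C with hypothetical names (`chained_table`, `chained_lookup`, `NIX`) that shows where a range check would catch such an index. Whether the large `hashindex` value printed above is the actual cause is not established in this report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical end-of-chain marker, standing in for "nix". */
#define NIX UINT32_MAX

enum { LOOKUP_MISSING = -1, LOOKUP_CORRUPT = -2 };

/* Hypothetical layout: NOT the layout used by hashtabl.d. */
typedef struct {
    size_t    nbuckets;  /* length of index[] */
    size_t    nentries;  /* length of next[] and keys[] */
    uint32_t *index;     /* bucket -> first entry, or NIX */
    uint32_t *next;      /* entry -> next entry in the chain, or NIX */
    int      *keys;      /* entry -> stored key */
} chained_table;

/* Returns the entry number holding key, LOOKUP_MISSING if the chain
   ends without a match, or LOOKUP_CORRUPT if a link points outside
   the table or the chain is longer than the table (a cycle). */
long chained_lookup(const chained_table *t, uint32_t hashcode, int key)
{
    if (t->nbuckets == 0)
        return LOOKUP_MISSING;

    uint32_t entry = t->index[hashcode % t->nbuckets];
    size_t steps = 0;

    while (entry != NIX) {
        /* Without this check, an out-of-range entry would be
           dereferenced below, which is how a segfault can arise. */
        if (entry >= t->nentries || steps++ >= t->nentries)
            return LOOKUP_CORRUPT;
        if (t->keys[entry] == key)
            return (long)entry;
        entry = t->next[entry];
    }
    return LOOKUP_MISSING;
}
```

A debug build could call a check like this before the loop and report the bad index instead of crashing, which would help separate a corrupted table from a miscomputed index.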


----------------------------------------------------------------------

>Comment By: Sam Steingold (sds)
Date: 2009-10-20 12:30

Message:
same crash in 2.32 & 2.34

----------------------------------------------------------------------

Comment By: Sam Steingold (sds)
Date: 2009-10-20 11:24

Message:
note that I observe the crash with "gcc -g -O0"

----------------------------------------------------------------------

Comment By: Sam Steingold (sds)
Date: 2009-10-20 10:16

Message:
same crash:

Program received signal SIGSEGV, Segmentation fault.
0x0016dfb4 in hash_lookup_builtin (ht=..., obj=..., allowgc=false,
    KVptr_=0xffef14d4, Iptr_=0xffef14d0) at ../src/hashtabl.d:1417
1417      while (!eq(*Nptr,nix)) { /* track "list" : "list" finished ->
not found */
(gdb) where
#0  0x0016dfb4 in hash_lookup_builtin (ht=..., obj=..., allowgc=false,
    KVptr_=0xffef14d4, Iptr_=0xffef14d0) at ../src/hashtabl.d:1417
#1  0x0017244c in gethash (obj=..., ht=..., allowgc=false)
    at ../src/hashtabl.d:2219
#2  0x0027dd34 in register_foreign_inttype (name_asciz=0x2ec938 "ssize_t",
    size=4, signed_p=true) at ../src/foreign.d:274
#3  0x0029aab0 in init_ffi () at ../src/foreign.d:4604
#4  0x00046ff4 in main (argc=18, argv=0xffef1784) at ../src/spvw.d:3841

with current cvs head on
 Linux titan 2.6.24.4 #1 SMP Sat Apr 12 20:33:06 UTC 2008 sparc64
GNU/Linux
with
 gcc (Debian 4.3.4-5) 4.3.4


----------------------------------------------------------------------

Comment By: Raymond Toy (rtoy)
Date: 2008-04-02 17:59

Message:
Logged In: YES
user_id=28849
Originator: NO

The crash I reported isn't in the same place, and it's sparc/solaris, not
sparc/linux.

make check is running, and here is the last output before crashing:

(PROGN (DEFGENERIC TESTGF00 (&REST ARGS &KEY) (:METHOD (&REST ARGS)))
(TESTGF00 'A 'B))
[SIMPLE-KEYWORD-ERROR]: #:COMPILED-FORM-180-1: illegal keyword/value pair
A, B in argument list.
The allowed keywords are NIL


Here is part of the backtrace.  I don't think this is the same bug.

#0  top_of_back_trace_frame (bt=0x669400fb) at debug.d:1104
#1  0x00034bc8 in unwind_upto (upto_frame=0xffbe9464) at eval.d:624
#2  0x00043d40 in invoke_handlers (cond=0x1a952901) at eval.d:697
#3  0x000be904 in C_clcs_signal (argcount=0, rest_args_pointer=0x1ba2e4)
at error.d:775
#4  0x00037234 in funcall_subr (fun=0x1a0a7a, args_on_stack=0) at
eval.d:5179
#5  0x000bbb54 in signal_and_debug (condition=0x1a952901) at error.d:204
#6  0x000bbdd4 in end_error (stackptr=0x1ba2c4, start_driver_p=true) at
error.d:317
#7  0x000bbf6c in error (errortype=keyword_error,
    errorstring=0x139a70 "~S: illegal keyword/value pair ~S, ~S in
argument list.\nThe allowed keywords are ~S") at error.d:349
#8  0x000bf380 in error_key_badkw (fun=0x1a951241, key=0x1a9095a9,
val=0x1a9095d9, kwlist=0x1a68d9)
    at error.d:1317
#9  0x00036764 in match_cclosure_key (closure=0x1a952419, argcount=1,
key_args_pointer=0x1ba2bc,
    rest_args_pointer=0x1ba2bc) at eval.d:2803
#10 0x0004361c in apply_closure (closure=0x1a952419, args_on_stack=0,
args=0x1a68d9) at eval.d:4747
#11 0x0003d538 in interpret_bytecode_ (closure_in=0x1a94e491,
codeptr=0x1a950b48,
    byteptr_in=0x1a950b6f "") at eval.d:7737
#12 0x00042e58 in apply_closure (closure=0x1a94e491, args_on_stack=0,
args=0x669429e3)
    at eval.d:4770
#13 0x0003d538 in interpret_bytecode_ (closure_in=0x1a94e491,
codeptr=0x1a94f2c8, byteptr_in=0x0)
    at eval.d:7737
#14 0x0003e764 in eval1 (form=0x1ba2b4) at eval.d:3866
#15 0x0003f0f4 in eval (form=0x66944853) at eval.d:2908


----------------------------------------------------------------------

Comment By: Sam Steingold (sds)
Date: 2008-04-02 09:45

Message:
Logged In: YES
user_id=5735
Originator: NO

the crash, seen with _some_ gcc versions, is in a very basic part of CLISP.
if you can read assembly, it would be nice if you could figure out which
part of hashtabl.d is miscompiled and file a gcc bug report.

----------------------------------------------------------------------

Comment By: Raymond Toy (rtoy)
Date: 2008-04-01 18:44

Message:
Logged In: YES
user_id=28849
Originator: NO

It's not clear if the bug I reported in that link is in clisp or in gcc.
I didn't investigate the cause.

----------------------------------------------------------------------

Comment By: Sam Steingold (sds)
Date: 2008-03-31 09:58

Message:
Logged In: YES
user_id=5735
Originator: NO

apparently this is a bug in some versions of gcc on solaris.
what is your gcc version?
http://permalink.gmane.org/gmane.lisp.clisp.general/12179
In case any one cares, I tried building clisp 2.44.1 using gcc 3.4.3
on sparc.  The sources appear to build and an image is created.
However, when running make check, the check eventually gets a segfault
that crashes clisp.

I didn't use any special libraries and only had libsigsegv available.
More info available if anyone wants to take a look.

Using the same sources, I rebuilt using gcc 3.3.3, and make check
finishes just fine.  This works for me, so the fact that 3.4.3 doesn't
work is not so important to me.

----------------------------------------------------------------------

Comment By: Sam Steingold (sds)
Date: 2006-10-16 09:23

Message:
Logged In: YES
user_id=5735

this abort means nothing: zout calls PRIN1 which conses and
thus invalidates ht.
note that it has the right allocstamp before zout.
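
The stamp check behind this abort can be modelled in a few lines of C. This is a simplified sketch, not the code in lispbibl.d: the names `stamped_ref`, `make_ref`, `note_allocation` and `ref_is_current` are hypothetical. The real `ngci_pointable` quoted in this thread also accepts GC-invariant symbols and subrs, and it calls abort() instead of returning false.

```c
#include <assert.h>
#include <stdbool.h>

/* Global allocation counter, bumped on every allocation (consing). */
unsigned long alloccount = 0;

/* A reference that remembers the counter value at creation time. */
typedef struct {
    void         *ptr;
    unsigned long allocstamp;
} stamped_ref;

/* Create a reference stamped with the current allocation count. */
stamped_ref make_ref(void *p)
{
    stamped_ref r = { p, alloccount };
    return r;
}

/* Stand-in for any operation that allocates, such as printing
   an object through PRIN1. */
void note_allocation(void)
{
    alloccount++;
}

/* A reference may be dereferenced only if nothing has been allocated
   since it was stamped; otherwise a GC might have moved the object. */
bool ref_is_current(stamped_ref r)
{
    return r.allocstamp == alloccount;
}
```

With the numbers from the gdb session, ht carried allocstamp 67855 while alloccount had advanced to 68046 after zout, so the check failed inside the call made from the debugger rather than in the program's own execution.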

----------------------------------------------------------------------

Comment By: Peter Van Eynde (pvaneynd)
Date: 2006-10-16 02:12

Message:
Logged In: YES
user_id=7267

Breakpoint 16, hash_lookup_builtin (ht={one_o = 5853545,
allocstamp = 67855}, obj={one_o = 5859937, allocstamp =
67855}, allowgc=false, KVptr_=0xefffe410,
    Iptr_=0xefffe40c) at hashtabl.d:1571
1571      GCTRIGGER_IF(allowgc, GCTRIGGER2(ht,obj));
(gdb) xout ht
#(CL::HASH-TABLE size=3 maxcount=1 mincount=0 free=
  test=CL::EQUAL
  KV=#(CL::NIL #(#<UNBOUND> #<UNBOUND> #<UNBOUND>) 0 0
#<UNBOUND> #<UNBOUND> #<UNBOUND>)){one_o = 5853545,
allocstamp = 67855}
(gdb) zout ht
#S(HASH-TABLE :TEST EXT::FASTHASH-EQUAL)
{one_o = 5853545, allocstamp = 68046}
(gdb) print TheHashtable_(ht)

Program received signal SIGABRT, Aborted.
0x5026f910 in kill () from /lib/libc.so.6
The program being debugged was signaled while in a function
called from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on"
Evaluation of the expression containing the function
(TheHashtable_) will be abandoned.

it seems it aborted:

(gdb) backtrace
#0  0x5026f910 in kill () from /lib/libc.so.6
#1  0x5027093c in abort () from /lib/libc.so.6
#2  0x000cbedc in ngci_pointable (obj={one_o = 5853545,
allocstamp = 67855}) at lispbibl.d:6927
#3  0x0002c54c in TheHashtable_ (x={one_o = 0, allocstamp =
0}) at spvw_debug.d:454
#4  <function called from gdb>
#5  hash_lookup_builtin (ht={one_o = 5853545, allocstamp =
67855}, obj={one_o = 5859937, allocstamp = 67855},
allowgc=false, KVptr_=0xefffe410, Iptr_=0xefffe40c)
    at hashtabl.d:1571
#6  0x002a95d4 in gethash (obj={one_o = 5859937, allocstamp
= 67855}, ht={one_o = 5853545, allocstamp = 67855},
allowgc=false) at hashtabl.d:2416
#7  0x0045250c in register_foreign_variable
(address=0x55298c, name_asciz=0x50f838 "ffi_user_pointer",
flags=0, size=4) at foreign.d:185
#8  0x004832a8 in init_ffi () at foreign.d:4422
#9  0x000c4418 in main (argc=16, argv=0xefffe6d4) at spvw.d:3345
(gdb) down
#2  0x000cbedc in ngci_pointable (obj={one_o = 5853545,
allocstamp = 67855}) at lispbibl.d:6927
6927          abort();
(gdb) list
6922        return obj.one_o;
6923      }
6924      static inline aint ngci_pointable (object obj) {
6925        if (!(gcinvariant_symbol_p(obj)
6926              || obj.allocstamp == alloccount ||
nonimmsubrp(obj)))
6927          abort();
6928        nonimmprobe(obj.one_o);
6929        return obj.one_o;
6930      }
6931      static inline aint ngci_pointable (gcv_object_t obj) {
(gdb) print obj
$1 = {one_o = 5853545, allocstamp = 67855}
(gdb) print alloccount
$2 = 68046
(gdb) print gcinvariant_symbol_p(obj)
$3 = false

nonimmprobe is a macro and difficult to recreate it seems,
but I can check if the pointer works:

(gdb) print obj.one_o
$14 = 5853545
(gdb) print /x obj.one_o
$15 = 0x595169
(gdb) x /xw obj.one_o
0x595169:       0x59516904

I will retry now with 2.41.

----------------------------------------------------------------------

Comment By: Sam Steingold (sds)
Date: 2006-10-13 15:32

Message:
Logged In: YES
user_id=5735

debug_gcsafety detects gcsafety bugs _before_ the GC that
crashes.  obviously, this is not such a bug. too bad.
how about setting a break in hash_lookup_builtin and doing
-before- the segfault (I assume that this is the first time
you enter hash_lookup_builtin):
(gdb) xout ht
(gdb) zout ht
(gdb) print TheHashtable_(ht)
and examining the slots.


----------------------------------------------------------------------

Comment By: Peter Van Eynde (pvaneynd)
Date: 2006-10-13 14:47

Message:
Logged In: YES
user_id=7267

I did the rebuild with g++-3.3 (on sparc/linux as you
guessed) and after fixing a trivial casting problem I get:

(gdb) run -B . -N locale -E 1:1 -Efile UTF-8 -Eterminal
UTF-8 -norc -m 1800KW -x "(and (load \"init.lisp\")
(sys::%saveinitmem) (ext::exit)) (ext::exit t)"
Starting program:
/home/pvaneynd/clisp/clisp-2.40.orig/debian/build/lisp.run
-B . -N locale -E 1:1 -Efile UTF-8 -Eterminal UTF-8 -norc -m
1800KW -x "(and (load \"init.lisp\") (sys::%saveinitmem)
(ext::exit)) (ext::exit t)"
STACK depth: 230302

Program received signal SIGSEGV, Segmentation fault.
0x002a12bc in hash_lookup_builtin (ht={one_o = 5853545,
allocstamp = 67855}, obj={one_o = 5859937, allocstamp =
67855}, allowgc=false, KVptr_=0xefffe410,
    Iptr_=0xefffe40c) at hashtabl.d:1610
1610      while (!eq(*Nptr,nix)) { /* track "list" : "list"
finished -> not found */
Warning: the current language does not match this frame.


Also as the failure is rather fast I guess there has been no
gc yet.


----------------------------------------------------------------------

Comment By: Sam Steingold (sds)
Date: 2006-10-12 18:45

Message:
Logged In: YES
user_id=5735

you didn't specify it, but, presumably, this is linux/sparc
(not solaris).

----------------------------------------------------------------------

Comment By: Sam Steingold (sds)
Date: 2006-10-12 11:30

Message:
Logged In: YES
user_id=5735

one thing you could do is build with g++
to check for GC safety.
http://clisp.cons.org/impnotes/gc-safety.html
CC=g++ ./configure --with-debug build-g-gxx
thanks.

----------------------------------------------------------------------
