heisenbug disabling BBox

by Bernhard R. Link-2

Somewhere there is an ugly little bug hiding that I fail
to locate properly. The symptom is gv not using the
BoundingBox of a specific eps file. (An eps file
with which it works here is in the original report
found at http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=627471 .)

The faulty behaviour appears and vanishes under
very strange circumstances:

For the original reporter it depended on giving
a relative filename or an absolute one (I could never
reproduce that).

Things I have seen that make it go away:

- setting LC_CTYPE or LC_ALL (even to C) makes it disappear;
  it only shows up here with neither of those variables set.

- running in valgrind makes it disappear

- gdb does not seem to have any effect

- compiling some things without optimisation can make
  it go away, but I think that is mostly due to
  positions within the binary changing.

- adding some asm("nop") at the right locations can make it
  disappear; at some locations it has no effect, at other locations
  one needs multiple nops.

- changing the link order (i.e. doing the link manually and
  switching the order of .o files at the command line) can make
  the difference between bug or no bug, though I guess that is
  also mostly about changes of offsets and/or alignment
  within the code.

- the smallest difference I have come up with so far is the following:
  + compile everything with "-O2 -g": the bug is there.
  + compile ps.c to .s by replacing "-o ps.o" with "-S -o ps.s"
  + add '.string "123457"' after '.string "PageBoundingBox:"'
  + rm ps.o ; as -o ps.o -c ps.s
  + rm gv ; make
  + bug is still there
  + move the '.string "123457"' up to be between
    '.string "PageMedia:"' and the label directly before
    '.string "PageBoundingBox:"'
  + rm ps.o ; as -o ps.o -c ps.s
  + rm gv ; make
  + bug is gone

  objdump -s of the two gv files generated this way shows as the only
  differences: 4 bytes changed in .text, and "123457\0"
  switching its position with "PageBoundingBox\0" in .rodata.

  objdump -d says that those 4 differences in the code are
  the second argument to dsc_strncmp in 4 calls (two for
  "PageBoundingBox:" and two for "BoundingBox:"; gcc
  is storing those two strings as one).

  Looking at dsc_strncmp I see nothing that could explain why
  a difference like that could have effects like that.
  As it does this funny malloc/free every time (no idea why
  it does that, as it could just do strncasecmp(s1, s2, n-1)),
  that might mean that there simply is some havoc going on
  in the memory management code. Optimising that function to
  not do the temporary copy makes the bug disappear, but that
  might simply be a code-moves-around effect....

        Bernhard R. Link


Re: heisenbug disabling BBox

by Bernhard R. Link-2

* Bernhard R. Link <brlink@...> [110803 12:17]:
>   Looking at dsc_strncmp I see nothing that could explain why
>   a difference like that could have effects like that.
>   As it does this funny malloc/free every time (no idea why
>   it does that, as it could just do strncasecmp(s1, s2, n-1)),
>   that might mean that there simply is some havoc going on
>   in the memory management code. Optimising that function to
>   not do the temporary copy makes the bug disappear, but that
>   might simply be a code-moves-around effect....

Actually, the bug still shows up with the malloc/free/strncpy
removed by the following patch:

index 248081c..ccc3ca1 100644
--- a/gv/src/ps.c
+++ b/gv/src/ps.c
@@ -115,17 +119,10 @@ static int dsc_strncmp(s1, s2, n)
 {
- char *tmp;
-
  if (strncasecmp(s1, s2, n) == 0)
  return 0;
  if (s2[n-1] == ':'){
- tmp = (char *) malloc(n*sizeof(char));
- strncpy(tmp, s2, (n-1));
- tmp[n-1]=' ';
- if (strncasecmp(s1, tmp, n) == 0){
- free(tmp);
+ if (strncasecmp(s1, s2, n-1) == 0 && s1[n-1] == ' '){
  return 0;
  }
- free(tmp);
  }
 
  return 1;

In other words: I'm totally at a loss as to how this difference can
cause this effect. I will try to run it in the debugger with some
read watchpoints for the changed parts to see where it can
make a difference, but ....
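
For reference, the two variants of the comparison touched by the patch
above can be put side by side. This is only a sketch: the function names
dsc_strncmp_copy and dsc_strncmp_nocopy are made up for illustration, the
parameter types are assumed (the original uses an old-style definition),
and the NULL check after malloc is an addition not present in ps.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* Variant with the temporary copy, following the lines the patch removes.
 * Returns 0 on a match, 1 otherwise. Requires n >= 1. */
static int dsc_strncmp_copy(const char *s1, const char *s2, size_t n)
{
    assert(n >= 1);
    if (strncasecmp(s1, s2, n) == 0)
        return 0;
    if (s2[n - 1] == ':') {
        /* Build a copy of s2 with the trailing ':' replaced by ' '. */
        char *tmp = malloc(n);
        if (tmp == NULL)            /* assumption: treat failure as mismatch */
            return 1;
        strncpy(tmp, s2, n - 1);    /* no NUL needed: only n bytes are compared */
        tmp[n - 1] = ' ';
        int differs = strncasecmp(s1, tmp, n);
        free(tmp);
        if (differs == 0)
            return 0;
    }
    return 1;
}

/* Variant without the copy, following the line the patch adds. */
static int dsc_strncmp_nocopy(const char *s1, const char *s2, size_t n)
{
    assert(n >= 1);
    if (strncasecmp(s1, s2, n) == 0)
        return 0;
    if (s2[n - 1] == ':' &&
        strncasecmp(s1, s2, n - 1) == 0 && s1[n - 1] == ' ')
        return 0;
    return 1;
}
```

Both variants accept a keyword followed by either ':' or ' ' and compare
case-insensitively, so swapping one for the other should not change which
DSC comments are recognized.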

        Bernhard R. Link


Re: heisenbug disabling BBox

by Bernhard R. Link-2

* Bernhard R. Link <brlink@...> [110803 13:06]:
> In other words: I'm totally at a loss as to how this difference can
> cause this effect. I will try to run it in the debugger with some
> read watchpoints for the changed parts to see where it can
> make a difference, but ....

I've finally found the bug:

ps.c is using sec_sscanf (from secscanf.c) instead of
regular sscanf or instead of doing some proper parsing.

As sec_sscanf differs from regular sscanf in its variadic
arguments, gcc cannot check whether the arguments given match the
format string. In particular, it does not know that sec_sscanf
wants a 'char *' and a 'size_t' for every '%s' or '%256s'
it gets. Thus when ps.c does

sec_sscanf(line+length("%%BoundingBox:"), "%256s", text);

the size of the text buffer is not given, so some random value
is returned by the 'va_arg(ap, size_t)' in secscanf.c.
If that random value is smaller than the length of "(atend)",
then this will be copied incompletely and thus
not be recognized.
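
The mechanism can be illustrated without invoking undefined behaviour by
passing the size explicitly and varying it. The sketch below is not the
code from secscanf.c: scan_word and sees_atend are hypothetical names, the
scanner handles only a single whitespace-delimited word, and the exact
truncation rule (room for size-1 characters plus a terminating NUL) is an
assumption.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one "%s" conversion of a sec_sscanf-like
 * function: it expects TWO variadic arguments, a destination buffer and
 * that buffer's size, and never writes more than `size` bytes. */
static int scan_word(const char *input, ...)
{
    va_list ap;
    va_start(ap, input);
    char *dest = va_arg(ap, char *);
    size_t size = va_arg(ap, size_t);  /* arbitrary if the caller omits it */
    va_end(ap);

    assert(dest != NULL);
    if (size == 0)
        return 0;

    while (*input == ' ' || *input == '\t')
        input++;

    size_t i = 0;
    while (input[i] != '\0' && input[i] != ' ' && input[i] != '\t' &&
           input[i] != '\n' && i + 1 < size) {
        dest[i] = input[i];
        i++;
    }
    dest[i] = '\0';
    return i > 0;
}

/* Returns 1 if the word scanned under the given size limit is "(atend)".
 * size_passed plays the role of whatever value va_arg happens to yield. */
static int sees_atend(const char *line, size_t size_passed)
{
    char text[257];
    assert(size_passed <= sizeof text);
    if (!scan_word(line, text, size_passed))
        return 0;
    return strcmp(text, "(atend)") == 0;
}
```

With the real buffer size, sees_atend(" (atend)", 257) recognizes the
keyword. With a size too small to hold it, such as 4, the copy is cut
short and the comparison fails, which matches the symptom of the
BoundingBox being ignored.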

        Bernhard R. Link


Re: heisenbug disabling BBox

by Markus Steinborn-2

Hi Bernhard,

Bernhard R. Link wrote:
> I've finally found the bug:
>    
Excellent work. Thank you very much.

I've just applied your patches from yesterday and I updated every call
of sec_sscanf having %s in its format descriptor to include the buffer
length.


Greetings

Markus Steinborn
GNU gv maintainer