Willing to debug bug #3542 (23.0.94; File access via UNC path slow again under Windows)

View: New views
9 Messages — Rating Filter:   Alert me  
< Prev | 1 - 2 - 3 | Next >

Re: Willing to debug bug #3542 (23.0.94; File access via UNC path slow again under Windows)

by Haojun Bao-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Eli Zaretskii <...> writes:

>> From: Stefan Monnier <monnier@...>
>> Cc: Andreas Schwab <schwab@...>,  emacs-devel@...
>> Date: Mon, 13 Jul 2009 20:54:14 -0400
>>
>> > Aha.  But it sounds like it's not just me who is confused.  Here's
>> > just two examples:
>>
>> >   From doc.c:
>>
>> >       strp = SDATA (string);
>> >       while (strp < SDATA (string) + SBYTES (string))
>>
>> >   (why not "while *strp"?)
>>
>> As said Andreas, this would stop at the first NUL, which may appear
>> within the string.
>
> But then a gazillion other places are buggy: we _do_ use SDATA(str) as
> a C string, and pass it to functions that will stop examining the
> string on the first null.  A random example:
>
>   d = opendir (SDATA (Fdirectory_file_name (encoded_dir)));
>
> What am I missing?

NUL can not be in a pathname, so it should be safe here to treat it as a
classical C string.



Re: Willing to debug bug #3542 (23.0.94; File access via UNC path slow again under Windows)

by Stefan Monnier :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

>> >   From fileio.c:
>> >       nm = (unsigned char *) alloca (SBYTES (filename) + 1);
>> >       bcopy (SDATA (filename), nm, SBYTES (filename) + 1);
>> >   (why +1? it potentially accesses memory beyond end of `filename's
>> >   contents)
>> The +1 is precisely used to make sure we copy the terminating NUL.
> That's not my reading of allocate_string_data.  Are you sure?

I'm not sure about any piece of code, no.  Bugs are common.
But my reading of allocate_string_data starts by:

   /* Set up Lisp_String S for holding NCHARS characters, NBYTES bytes,
      plus a NUL byte at the end.  Allocate an sdata structure for S, and

> Anyway, if that's true, then again we have bugs in other places.  Like
> this one:

>   directory_nbytes = SBYTES (directory);
>   if (directory_nbytes == 0
>       || !IS_ANY_SEP (SREF (directory, directory_nbytes - 1)))
>     needsep = 1;
>   [...]
>  int nbytes = len + directory_nbytes + needsep;
>  fullname = make_uninit_multibyte_string (nbytes, nbytes);
>  bcopy (SDATA (directory), SDATA (fullname),
> directory_nbytes);

make_uninit_multibyte_string calls allocate_string_data which does

  STRING_DATA (s)[nbytes] = '\0';

so the destination of the `bcopy' already has the terminating NUL.

So, I'm overall pretty sure our strings *should* always have a NUL right
after the last byte.  OTOH I'd be surprised if there isn't a bug
somewhere that makes it possible for some strings to occasionally fail
to have this terminating NUL (most likely, any bug that leads to
a crash could also lead to such a situation).


        Stefan



Re: Willing to debug bug #3542 (23.0.94; File access via UNC path slow again under Windows)

by Eli Zaretskii :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

> From: Miles Bader <miles@...>
> Cc: Stefan Monnier <monnier@...>, schwab@...,
>         emacs-devel@...
> Date: Tue, 14 Jul 2009 13:28:19 +0900
>
> >   d = opendir (SDATA (Fdirectory_file_name (encoded_dir)));
>
> That's a limitation of opendir (and posix filenames in general), and
> there's absolutely nothing we can do it (so there's no point in worrying
> about that case)

That's not what I meant.  What I meant is the problem that could
happen if encoded_dir's value was foo^@bar, and there was also a
directory called "foo".  Won't we then opendir "foo"?

And what if opendir was replaced by unlink?



Re: Willing to debug bug #3542 (23.0.94; File access via UNC path slow again under Windows)

by Davis Herring :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

> That's not what I meant.  What I meant is the problem that could
> happen if encoded_dir's value was foo^@bar, and there was also a
> directory called "foo".  Won't we then opendir "foo"?

Of course.  But is the equivalence between "foo" and "foo^@bar"
fundamentally worse than the (almost: consider symlinks) equivalence
between "foo" and "foo/"?  Maybe it just doesn't matter much.

> And what if opendir was replaced by unlink?

Nothing.  You can't unlink directories.  ;)

Davis

--
This product is sold by volume, not by mass.  If it appears too dense or
too sparse, it is because mass-energy conversion has occurred during
shipping.



Re: Willing to debug bug #3542 (23.0.94; File access via UNC path slow again under Windows)

by Eli Zaretskii :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

> From: Stefan Monnier <monnier@...>
> Cc: schwab@...,  emacs-devel@...
> Date: Tue, 14 Jul 2009 14:18:53 -0400
>
> >   directory_nbytes = SBYTES (directory);
> >   if (directory_nbytes == 0
> >       || !IS_ANY_SEP (SREF (directory, directory_nbytes - 1)))
> >     needsep = 1;
> >   [...]
> >  int nbytes = len + directory_nbytes + needsep;
> >  fullname = make_uninit_multibyte_string (nbytes, nbytes);
> >  bcopy (SDATA (directory), SDATA (fullname),
> > directory_nbytes);
>
> make_uninit_multibyte_string calls allocate_string_data which does
>
>   STRING_DATA (s)[nbytes] = '\0';
>
> so the destination of the `bcopy' already has the terminating NUL.

Perhaps most of the places where we use these paradigms are okay due
to all these subtle corners that together make everything work.  But
IMHO it's inherently unsafe to use character arrays that are not true
C strings as if they were C strings.  For one, it violates the mental
model each C programmer has about strings, and that can easily lead to
misunderstanding, confusion, and bugs.  For example (from dbusbind.c):

  char x[DBUS_MAXIMUM_SIGNATURE_LENGTH];
  [...]
        strcpy (x, SDATA (CAR_SAFE (XD_NEXT_VALUE (elt))));
  [...]
      sprintf (signature, "%c%s", dtype, x);

or

      case DBUS_TYPE_SIGNATURE:
        {
          char *val = SDATA (Fstring_make_unibyte (object));
          XD_DEBUG_MESSAGE ("%c %s", dtype, val);

How can one convince herself that this code is safe without knowing
too much about the Lisp strings whose data gets handled here as C
strings?  Can they have embedded nulls or cannot they?

Same here (editfns.c):

      if (SBYTES (val) > message_length)
        {
          message_length = SBYTES (val);
          message_text = (char *)xrealloc (message_text, message_length);
        }
      bcopy (SDATA (val), message_text, SBYTES (val));
      message2 (message_text, SBYTES (val),
                STRING_MULTIBYTE (val));

message_text[] is not a C string here, because it's not
null-terminated (and doesn't have enough space to be terminated).
Without looking at the implementation of message2, whose 1st arg is a
`char *', how can one know that there's no bug here?

Or here (search.c):

          raw_pattern_size = SCHARS (string);
          raw_pattern_size_byte = SCHARS (string);
          raw_pattern = (unsigned char *) alloca (raw_pattern_size + 1);
          copy_text (SDATA (string), raw_pattern,
                     SBYTES (string), 1, 0);

raw_pattern[] is not null-terminated, and we then use it, directly and
indirectly, in many places.  Without studying each use, there's no way
you can determine that there cannot be a bug here.

Etc., etc. -- I see other places where maybe it works, maybe it
doesn't.  One needs to study the code very carefully and look at many
functions up and down the call stack, just to determine if a few lines
don't constitute a bug.



Re: Willing to debug bug #3542 (23.0.94; File access via UNC path slow again under Windows)

by Eli Zaretskii :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

> Date: Tue, 14 Jul 2009 12:32:18 -0700 (PDT)
> From: "Davis Herring" <herring@...>
> Cc: "Miles Bader" <miles@...>, emacs-devel@...
>
> > That's not what I meant.  What I meant is the problem that could
> > happen if encoded_dir's value was foo^@bar, and there was also a
> > directory called "foo".  Won't we then opendir "foo"?
>
> Of course.  But is the equivalence between "foo" and "foo^@bar"
> fundamentally worse than the (almost: consider symlinks) equivalence
> between "foo" and "foo/"?

Yes, it is (IMO): directory "foo^@bar" does not exist, so you should
have got an error.  Instead, you silently open a directory by another
name, as if "foo^@bar" existed.



Re: Willing to debug bug #3542 (23.0.94; File access via UNC path slow again under Windows)

by Miles Bader-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Eli Zaretskii <eliz@...> writes:
> Yes, it is (IMO): directory "foo^@bar" does not exist, so you should
> have got an error.  Instead, you silently open a directory by another
> name, as if "foo^@bar" existed.

So you're suggesting that each use of a system function with such
limitations be preceded by "if (contains_null(string)) error ();" or something?

I suppose, though it seems such an obscure issue that I guess I don't
really care.

-Miles

--
Next to fried food, the South has suffered most from oratory.
  -- Walter Hines Page



Re: Willing to debug bug #3542 (23.0.94; File access via UNC path slow again under Windows)

by Eli Zaretskii :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

> From: Miles Bader <miles@...>
> Cc: herring@...,  emacs-devel@...
> Date: Wed, 15 Jul 2009 05:27:34 +0900
>
> Eli Zaretskii <eliz@...> writes:
> > Yes, it is (IMO): directory "foo^@bar" does not exist, so you should
> > have got an error.  Instead, you silently open a directory by another
> > name, as if "foo^@bar" existed.
>
> So you're suggesting that each use of a system function with such
> limitations be preceded by "if (contains_null(string)) error ();" or something?

Yes, something like that.  We always do a CHECK_STRING in those cases,
at least for the arguments of the primitives; maybe we should have
CHECK_FILENAME or something.



Re: Willing to debug bug #3542 (23.0.94; File access via UNC path slow again under Windows)

by David Kastrup :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

"Davis Herring" <herring@...> writes:

>> That's not what I meant.  What I meant is the problem that could
>> happen if encoded_dir's value was foo^@bar, and there was also a
>> directory called "foo".  Won't we then opendir "foo"?
>
> Of course.  But is the equivalence between "foo" and "foo^@bar"
> fundamentally worse than the (almost: consider symlinks) equivalence
> between "foo" and "foo/"?  Maybe it just doesn't matter much.
>
>> And what if opendir was replaced by unlink?
>
> Nothing.  You can't unlink directories.  ;)

As root under Solaris, you can.  This has been a nasty cause of surprise
for decades.  I can't vouch that it has not changed in _very_ recent
years, but there certainly will systems be around where this is the
case.

--
David Kastrup



< Prev | 1 - 2 - 3 | Next >