Quick and dirty malloc() support for realpath.

View: New views
20 Messages — Rating Filter:   Alert me  
< Prev | 1 - 2 | Next >

Quick and dirty malloc() support for realpath.

by Rob Landley :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Add cheesy malloc() support to realpath().  Still limited to PATH_MAX, but eh.

diff -ur uClibc/libc/stdlib/realpath.c uClibc.new/libc/stdlib/realpath.c
--- uClibc/libc/stdlib/realpath.c 2008-06-04 09:02:56.000000000 -0500
+++ uClibc.new/libc/stdlib/realpath.c 2009-10-25 13:17:42.000000000 -0500
@@ -55,7 +55,7 @@
  char *max_path;
  char *new_path;
  size_t path_len;
- int readlinks = 0;
+ int readlinks = 0, allocated = 0;
 #ifdef S_IFLNK
  int link_len;
 #endif
@@ -68,6 +68,10 @@
  __set_errno(ENOENT);
  return NULL;
  }
+ if (!got_path) {
+ got_path = alloca(PATH_MAX);
+ allocated++;
+ }
  /* Make a copy of the source path since we may need to modify it. */
  path_len = strlen(path);
  if (path_len >= PATH_MAX - 2) {
@@ -168,5 +172,6 @@
  new_path--;
  /* Make sure it's null terminated. */
  *new_path = '\0';
+ if (allocated) got_path = strdup(got_path);
  return got_path;
 }

--
Latency is more important than throughput. It's that simple. - Linus Torvalds
_______________________________________________
uClibc mailing list
uClibc@...
http://lists.busybox.net/mailman/listinfo/uclibc

Re: Quick and dirty malloc() support for realpath.

by Rob Landley :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Forgot to include the changes to the header file in the first patch.
(Oops.  Without that, the compiler optimizes out the null test because
it has a crazy "this can never be null" optimization hint applied.)

No idea what the _THROW is for and whether or not I should have
removed that or left it?  (It's mentioned in docs/defines.txt but
that's all, no explanation.)

I'm guessing _wur is warn unused return?  That's not mentioned in
docs at all...

Rob
--

Add cheesy malloc() support to realpath().  Still limited to PATH_MAX, but eh.

diff -ur uClibc/libc/stdlib/realpath.c uClibc.new/libc/stdlib/realpath.c
--- uClibc/libc/stdlib/realpath.c 2008-06-04 09:02:56.000000000 -0500
+++ uClibc.new/libc/stdlib/realpath.c 2009-10-25 13:17:42.000000000 -0500
@@ -55,7 +55,7 @@
  char *max_path;
  char *new_path;
  size_t path_len;
- int readlinks = 0;
+ int readlinks = 0, allocated = 0;
 #ifdef S_IFLNK
  int link_len;
 #endif
@@ -68,6 +68,10 @@
  __set_errno(ENOENT);
  return NULL;
  }
+ if (!got_path) {
+ got_path = alloca(PATH_MAX);
+ allocated ++;
+ }
  /* Make a copy of the source path since we may need to modify it. */
  path_len = strlen(path);
  if (path_len >= PATH_MAX - 2) {
@@ -168,5 +172,6 @@
  new_path--;
  /* Make sure it's null terminated. */
  *new_path = '\0';
+ if (allocated) got_path = strdup(got_path);
  return got_path;
 }
diff -ur uClibc/include/stdlib.h uClibc2/include/stdlib.h
--- uClibc/include/stdlib.h 2008-09-11 11:17:43.000000000 -0500
+++ uClibc2/include/stdlib.h 2009-10-26 03:43:36.000000000 -0500
@@ -637,7 +637,7 @@
    name in RESOLVED.  */
 /* we choose to handle __resolved==NULL as crash :) */
 extern char *realpath (__const char *__restrict __name,
-       char *__restrict __resolved) __THROW __wur __nonnull((2));
+       char *__restrict __resolved) __wur;
 #endif
 
 



--
Latency is more important than throughput. It's that simple. - Linus Torvalds
_______________________________________________
uClibc mailing list
uClibc@...
http://lists.busybox.net/mailman/listinfo/uclibc

Re: Quick and dirty malloc() support for realpath.

by Mike Frysinger :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Sunday 25 October 2009 15:19:49 Rob Landley wrote:
> - int readlinks = 0;
> + int readlinks = 0, allocated = 0;
> ...
> + if (!got_path) {
> + got_path = alloca(PATH_MAX);
> + allocated++;
> + }
> ...
> + if (allocated) got_path = strdup(got_path);

it doesnt make any sense to treat "allocated" as an integer that gets
incremented.  you're pointlessly forcing gcc to generate load/update/store
instructions when it only needs a store instruction.  i.e. use stdbool like
evolution intended.
-mike


_______________________________________________
uClibc mailing list
uClibc@...
http://lists.busybox.net/mailman/listinfo/uclibc

signature.asc (853 bytes) Download Attachment

Re: Quick and dirty malloc() support for realpath.

by Rob Landley :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Monday 26 October 2009 07:20:23 Mike Frysinger wrote:

> On Sunday 25 October 2009 15:19:49 Rob Landley wrote:
> > - int readlinks = 0;
> > + int readlinks = 0, allocated = 0;
> > ...
> > + if (!got_path) {
> > + got_path = alloca(PATH_MAX);
> > + allocated++;
> > + }
> > ...
> > + if (allocated) got_path = strdup(got_path);
>
> it doesnt make any sense to treat "allocated" as an integer that gets
> incremented.  you're pointlessly forcing gcc to generate load/update/store
> instructions when it only needs a store instruction.  i.e. use stdbool like
> evolution intended.

I did that because instruction sets that have an increment instruction can
produce smaller code by avoiding the 32 bit constant, and on something like
arm using a variable smaller than integer size can produce significantly
_larger_ code due to the masking and shifting the compiler generates to fake
the smaller sizes it hasn't got instructions for.  Also, in my experience
_Bool is about as real-world useful as the bit field notation with the colons,
and is really there to keep the language pedants and the c++ guys happy
without actually accomplishing much.  I've never seen it actually produce
better code.

But by all means let's test it:

gcc -v 2>&1 | tail -n 1
gcc version 4.3.3 (Ubuntu 4.3.3-5ubuntu4)

cat > hello.c << EOF
#include <stdio.h>

int main(int argc, char *argv[])
{
  int allocated=0;

  if (argc==2) allocated++;

  printf("allocated=%d\n", allocated);
}
EOF

gcc -Os -s hello.c
objdump -d a.out

000000000040052c <main>:
  40052c: 31 d2                 xor    %edx,%edx
  40052e: 83 ff 02             cmp    $0x2,%edi
  400531: be 3c 06 40 00       mov    $0x40063c,%esi
  400536: 0f 94 c2             sete   %dl
  400539: bf 01 00 00 00       mov    $0x1,%edi
  40053e: 31 c0                 xor    %eax,%eax
  400540: e9 db fe ff ff       jmpq   400420 <__printf_chk@plt>

And the optimizer's constant propogation expanded it to use the constant
assignment anyway.  But just to be sure let's switch to allocated=1 and...

000000000040052c <main>:
  40052c: 31 d2                 xor    %edx,%edx
  40052e: 83 ff 02             cmp    $0x2,%edi
  400531: be 3c 06 40 00       mov    $0x40063c,%esi
  400536: 0f 94 c2             sete   %dl
  400539: bf 01 00 00 00       mov    $0x1,%edi
  40053e: 31 c0                 xor    %eax,%eax
  400540: e9 db fe ff ff       jmpq   400420 <__printf_chk@plt>

Yup, the optimizer is actually coercing the two into producing identical code
on x86-64, which isn't particularly surprising.

Let's change the variable type to _Bool and...

000000000040052c <main>:
  40052c: 31 d2                 xor    %edx,%edx
  40052e: 83 ff 02             cmp    $0x2,%edi
  400531: be 3c 06 40 00       mov    $0x40063c,%esi
  400536: 0f 94 c2             sete   %dl
  400539: bf 01 00 00 00       mov    $0x1,%edi
  40053e: 31 c0                 xor    %eax,%eax
  400540: e9 db fe ff ff       jmpq   400420 <__printf_chk@plt>

Again, exactly the same code.

Now let's try arm, I've got a gcc 4.2.1 for armv5l lying around:

With the increment:
000083f8 <main>:
    83f8: e3500002 cmp r0, #2 ; 0x2
    83fc: 13a01000 movne r1, #0 ; 0x0
    8400: 03a01001 moveq r1, #1 ; 0x1
    8404: e59f0000 ldr r0, [pc, #0] ; 840c <.text+0xe0>
    8408: eaffffbe b 8308 <.text-0x24>
    840c: 00008420 andeq r8, r0, r0, lsr #8

With the integer assignment=1:

000083f8 <main>:
    83f8: e3500002 cmp r0, #2 ; 0x2
    83fc: 13a01000 movne r1, #0 ; 0x0
    8400: 03a01001 moveq r1, #1 ; 0x1
    8404: e59f0000 ldr r0, [pc, #0] ; 840c <.text+0xe0>
    8408: eaffffbe b 8308 <.text-0x24>
    840c: 00008420 andeq r8, r0, r0, lsr #8

And with the use of _Bool:

000083f8 <main>:
    83f8: e3500002 cmp r0, #2 ; 0x2
    83fc: 13a01000 movne r1, #0 ; 0x0
    8400: 03a01001 moveq r1, #1 ; 0x1
    8404: e59f0000 ldr r0, [pc, #0] ; 840c <.text+0xe0>
    8408: eaffffbe b 8308 <.text-0x24>
    840c: 00008420 andeq r8, r0, r0, lsr #8

It's really looking like gcc's optimizer is doing a whole lot of "not caring"
about the difference in this instance.  The constant propogation is dropping
the distinction so it's actually _less_ optimized than using the INC
instruction (on x86, anyway), but oh well.

But I can change it if it makes you happy.

Rob
--
Latency is more important than throughput. It's that simple. - Linus Torvalds

_______________________________________________
uClibc mailing list
uClibc@...
http://lists.busybox.net/mailman/listinfo/uclibc

Re: Quick and dirty malloc() support for realpath.

by Ricard Wanderlof :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


On Mon, 26 Oct 2009, Rob Landley wrote:

> ... Also, in my experience _Bool is about as real-world useful as the
> bit field notation with the colons, and is really there to keep the
> language pedants and the c++ guys happy without actually accomplishing
> much.  I've never seen it actually produce better code.

It can produce more readable, less error-prone C code though. We use
hardware register definitions such as

typedef struct {
   unsigned int x : 8;
   unsigned int y : 8;
   unsigned int control_bit : 1;
   unsigned int dummy1  : 15;
} reg_foo;

at great length for the C definitions for the registers in our chips, and
it really does avoid nasty errors that crop up when using shifting and
masking.

Just another opinion.

/Ricard
--
Ricard Wolf Wanderlöf                           ricardw(at)axis.com
Axis Communications AB, Lund, Sweden            www.axis.com
Phone +46 46 272 2016                           Fax +46 46 13 61 30
_______________________________________________
uClibc mailing list
uClibc@...
http://lists.busybox.net/mailman/listinfo/uclibc

Re: Quick and dirty malloc() support for realpath.

by Bernhard Reutner-Fischer :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Tue, Oct 27, 2009 at 08:54:50AM +0100, Ricard Wanderlof wrote:

>
>On Mon, 26 Oct 2009, Rob Landley wrote:
>
>>... Also, in my experience _Bool is about as real-world useful as
>>the bit field notation with the colons, and is really there to keep
>>the language pedants and the c++ guys happy without actually
>>accomplishing much.  I've never seen it actually produce better
>>code.
>
>It can produce more readable, less error-prone C code though. We use
$ size libc/stdlib/realpath.o*
   text   data    bss    dec    hex filename
    555      0      0    555    22b libc/stdlib/realpath.os.oorig
    735      0      0    735    2df libc/stdlib/realpath.os.rob
    586      0      0    586    24a libc/stdlib/realpath.os

perhaps that would do as well and doesn't cost 180b (!) but about 31b..


diff --git a/include/stdlib.h b/include/stdlib.h
index e462c1c..6268995 100644
--- a/include/stdlib.h
+++ b/include/stdlib.h
@@ -659,7 +659,6 @@ extern char *canonicalize_file_name (__const char *__name)
      __THROW __nonnull ((1)) __wur;
 #endif
 
-#if defined __USE_BSD || defined __USE_XOPEN_EXTENDED
 /* Return the canonical absolute name of file NAME.  If RESOLVED is
    null, the result is malloc'd; otherwise, if the canonical name is
    PATH_MAX chars or more, returns null with `errno' set to
@@ -667,9 +666,7 @@ extern char *canonicalize_file_name (__const char *__name)
    returns the name in RESOLVED.  */
 /* we choose to handle __resolved==NULL as crash :) */
 extern char *realpath (__const char *__restrict __name,
-       char *__restrict __resolved) __THROW __wur __nonnull((2));
-#endif
-
+       char *__restrict __resolved) __THROW __wur;
 
 /* Shorthand for type of comparison functions.  */
 #ifndef __COMPAR_FN_T
diff --git a/libc/stdlib/realpath.c b/libc/stdlib/realpath.c
index 1a00c31..668cff6 100644
--- a/libc/stdlib/realpath.c
+++ b/libc/stdlib/realpath.c
@@ -36,13 +36,7 @@
 
 #define MAX_READLINKS 32
 
-#ifdef __STDC__
 char *realpath(const char *path, char got_path[])
-#else
-char *realpath(path, got_path)
-const char *path;
-char got_path[];
-#endif
 {
  char copy_path[PATH_MAX];
  /* use user supplied buffer directly - reduces stack usage */
@@ -63,6 +57,12 @@ char got_path[];
  __set_errno(ENOENT);
  return NULL;
  }
+ /* If got_path is NULL then we need to allocate it (SUSv4 base).
+ * Instead of guesstimating the exact length or realloc()ing
+ * to the later to be found real length we hand back a PATH_MAX
+ * buffer to the lazy.  */
+ if (got_path == NULL)
+ got_path = malloc(PATH_MAX);
  /* Make a copy of the source path since we may need to modify it. */
  path_len = strlen(path);
  if (path_len >= PATH_MAX - 2) {


_______________________________________________
uClibc mailing list
uClibc@...
http://lists.busybox.net/mailman/listinfo/uclibc

Re: Quick and dirty malloc() support for realpath.

by Chris Gray-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Tuesday 27 October 2009 08:54:50 Ricard Wanderlof wrote:

> It can produce more readable, less error-prone C code though. We use
> hardware register definitions such as
>
> typedef struct {
>    unsigned int x : 8;
>    unsigned int y : 8;
>    unsigned int control_bit : 1;
>    unsigned int dummy1  : 15;
> } reg_foo;
>
> at great length for the C definitions for the registers in our chips, and
> it really does avoid nasty errors that crop up when using shifting and
> masking.

I gave up on bitfields the day I ran something like that through two compilers
and one decided that x was "obviously" the high-order byte of the word and the
other decided it was "obviously" the low-order one. Oh yes and they made
opposite decisions about signedness too, but at least you can control that (as
in your example). So now I just write little inlined functions or macros to do
the manipulations.

Just another other opinion. ;->

Chris


--
Chris Gray        /k/ Embedded Java Solutions      BE0809.435.306
Embedded & Mobile Java, OSGi    http://www.k-embedded-java.com/
chris.gray@...                             +32 3 216 0369

_______________________________________________
uClibc mailing list
uClibc@...
http://lists.busybox.net/mailman/listinfo/uclibc

Re: Quick and dirty malloc() support for realpath.

by Rob Landley :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Tuesday 27 October 2009 02:54:50 Ricard Wanderlof wrote:

> On Mon, 26 Oct 2009, Rob Landley wrote:
> > ... Also, in my experience _Bool is about as real-world useful as the
> > bit field notation with the colons, and is really there to keep the
> > language pedants and the c++ guys happy without actually accomplishing
> > much.  I've never seen it actually produce better code.
>
> It can produce more readable, less error-prone C code though. We use
> hardware register definitions such as
>
> typedef struct {
>    unsigned int x : 8;
>    unsigned int y : 8;
>    unsigned int control_bit : 1;
>    unsigned int dummy1  : 15;
> } reg_foo;
>
> at great length for the C definitions for the registers in our chips, and
> it really does avoid nasty errors that crop up when using shifting and
> masking.

And introduces new nasty errors when "unspecified by the standard" details such
as endianness change.  A quick google produced:

http://bytes.com/topic/c/answers/168169-whats-memory-layout-bit-field-struct-
little-endian-big-endian-platform.
http://groups.google.com/group/gnu.gcc.help/browse_thread/thread/747918655affa5c0

Even today, gcc refuses to take a position on that one, just "determined by
ABI".

http://gcc.gnu.org/onlinedocs/gcc-4.4.2/gcc/Structures-unions-enumerations-
and-bit_002dfields-implementation.html#Structures-unions-enumerations-and-
bit_002dfields-implementation

You're fine if all the world's a VAX, but uClibc, busybox, and the Linux kernel
tend to like working on things like powerpc and mips.

Rob
--
Latency is more important than throughput. It's that simple. - Linus Torvalds
_______________________________________________
uClibc mailing list
uClibc@...
http://lists.busybox.net/mailman/listinfo/uclibc

Re: Quick and dirty malloc() support for realpath.

by Rob Landley :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Tuesday 27 October 2009 05:51:05 Bernhard Reutner-Fischer wrote:

> On Tue, Oct 27, 2009 at 08:54:50AM +0100, Ricard Wanderlof wrote:
> >On Mon, 26 Oct 2009, Rob Landley wrote:
> >>... Also, in my experience _Bool is about as real-world useful as
> >>the bit field notation with the colons, and is really there to keep
> >>the language pedants and the c++ guys happy without actually
> >>accomplishing much.  I've never seen it actually produce better
> >>code.
> >
> >It can produce more readable, less error-prone C code though. We use
>
> $ size libc/stdlib/realpath.o*
>    text   data    bss    dec    hex filename
>     555      0      0    555    22b libc/stdlib/realpath.os.oorig
>     735      0      0    735    2df libc/stdlib/realpath.os.rob
>     586      0      0    586    24a libc/stdlib/realpath.os
>
> perhaps that would do as well and doesn't cost 180b (!) but about 31b..

The problem is that leaks memory if realpath ever returns failure, such as the
lines right after that patch:

        path_len = strlen(path);
        if (path_len >= PATH_MAX - 2) {
                __set_errno(ENAMETOOLONG);
                return NULL;
        }

And at least a half-dozen other "return NULL;" error cases later in the
function.

Possibly there's some way to not inline the alloca?  This was, as I said, a
quick and dirty fix.  Someday we should properly make it handle more than
PATH_MAX, but that's a big change and I'm just trying to get software using
Linux libc extensions (and now SUSv4) to work.

Rob
--
Latency is more important than throughput. It's that simple. - Linus Torvalds
_______________________________________________
uClibc mailing list
uClibc@...
http://lists.busybox.net/mailman/listinfo/uclibc

Re: Quick and dirty malloc() support for realpath.

by Ricard Wanderlof :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


On Tue, 27 Oct 2009, Rob Landley wrote:

>> On Mon, 26 Oct 2009, Rob Landley wrote:
>>> ... Also, in my experience _Bool is about as real-world useful as the
>>> bit field notation with the colons, and is really there to keep the
>>> language pedants and the c++ guys happy without actually accomplishing
>>> much.  I've never seen it actually produce better code.
>>
>> It can produce more readable, less error-prone C code though. We use
>> hardware register definitions such as
>>
>> typedef struct {
>>    unsigned int x : 8;
>>    unsigned int y : 8;
>>    unsigned int control_bit : 1;
>>    unsigned int dummy1  : 15;
>> } reg_foo;
>>
>> at great length for the C definitions for the registers in our chips, and
>> it really does avoid nasty errors that crop up when using shifting and
>> masking.
>
> And introduces new nasty errors when "unspecified by the standard" details such
> as endianness change.
> [ ... ]
> You're fine if all the world's a VAX, but uClibc, busybox, and the Linux kernel
> tend to like working on things like powerpc and mips.

You're right, probably not useful in the general case, I was really
commenting on the 'without actually accomplishing much' bit. In the case
of a specific processor or system-on-chip endianess doesn't change and
often the same compiler is used at all times. I'd agree there might be
issues for a general piece of software, such as uClibc.

/Ricard
--
Ricard Wolf Wanderlöf                           ricardw(at)axis.com
Axis Communications AB, Lund, Sweden            www.axis.com
Phone +46 46 272 2016                           Fax +46 46 13 61 30
_______________________________________________
uClibc mailing list
uClibc@...
http://lists.busybox.net/mailman/listinfo/uclibc

Re: Quick and dirty malloc() support for realpath.

by Mike Frysinger :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Tuesday 27 October 2009 13:46:19 Chris Gray wrote:

> On Tuesday 27 October 2009 08:54:50 Ricard Wanderlof wrote:
> > It can produce more readable, less error-prone C code though. We use
> > hardware register definitions such as
> >
> > typedef struct {
> >    unsigned int x : 8;
> >    unsigned int y : 8;
> >    unsigned int control_bit : 1;
> >    unsigned int dummy1  : 15;
> > } reg_foo;
> >
> > at great length for the C definitions for the registers in our chips, and
> > it really does avoid nasty errors that crop up when using shifting and
> > masking.
>
> I gave up on bitfields the day I ran something like that through two
>  compilers and one decided that x was "obviously" the high-order byte of
>  the word and the other decided it was "obviously" the low-order one. Oh
>  yes and they made opposite decisions about signedness too, but at least
>  you can control that (as in your example). So now I just write little
>  inlined functions or macros to do the manipulations.
since i dont know the exact issue with the compilers in question (endianness
maybe ?), i dont think those drawbacks are applicable to Ricard's usage.  
doing bit fields for a specific platform means your code is constrained to
that platform, so portability isnt a real concern.  we do similar things for
hardware registers for the same reasons Ricard's cites.
-mike


_______________________________________________
uClibc mailing list
uClibc@...
http://lists.busybox.net/mailman/listinfo/uclibc

signature.asc (853 bytes) Download Attachment

Re: Quick and dirty malloc() support for realpath.

by Mike Frysinger :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Tuesday 27 October 2009 15:44:42 Rob Landley wrote:

> On Tuesday 27 October 2009 05:51:05 Bernhard Reutner-Fischer wrote:
> > On Tue, Oct 27, 2009 at 08:54:50AM +0100, Ricard Wanderlof wrote:
> > >On Mon, 26 Oct 2009, Rob Landley wrote:
> > >>... Also, in my experience _Bool is about as real-world useful as
> > >>the bit field notation with the colons, and is really there to keep
> > >>the language pedants and the c++ guys happy without actually
> > >>accomplishing much.  I've never seen it actually produce better
> > >>code.
> > >
> > >It can produce more readable, less error-prone C code though. We use
> >
> > $ size libc/stdlib/realpath.o*
> >    text   data    bss    dec    hex filename
> >     555      0      0    555    22b libc/stdlib/realpath.os.oorig
> >     735      0      0    735    2df libc/stdlib/realpath.os.rob
> >     586      0      0    586    24a libc/stdlib/realpath.os
> >
> > perhaps that would do as well and doesn't cost 180b (!) but about 31b..
>
> The problem is that leaks memory if realpath ever returns failure, such as
>  the lines right after that patch:
>
>         path_len = strlen(path);
>         if (path_len >= PATH_MAX - 2) {
>                 __set_errno(ENAMETOOLONG);
>                 return NULL;
>         }
>
> And at least a half-dozen other "return NULL;" error cases later in the
> function.
>
> Possibly there's some way to not inline the alloca?  This was, as I said, a
> quick and dirty fix.  Someday we should properly make it handle more than
> PATH_MAX, but that's a big change and I'm just trying to get software using
> Linux libc extensions (and now SUSv4) to work.
while the memory leakage needs to be addressed, the answer isnt with alloca.  
the spec states that it must be via malloc(), but even ignoring that, it also
states that the caller must call free() on the returned pointer.  obviously
alloca-ed memory is not valid outside of the return of realpath() (which makes
me wonder how your code even worked in the first place), nor can you call
free() on it.
-mike


_______________________________________________
uClibc mailing list
uClibc@...
http://lists.busybox.net/mailman/listinfo/uclibc

signature.asc (853 bytes) Download Attachment

Re: Quick and dirty malloc() support for realpath.

by Mike Frysinger :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wednesday 28 October 2009 03:47:03 Mike Frysinger wrote:
> while the memory leakage needs to be addressed, the answer isnt with
>  alloca. the spec states that it must be via malloc(), but even ignoring
>  that, it also states that the caller must call free() on the returned
>  pointer.  obviously alloca-ed memory is not valid outside of the return of
>  realpath() (which makes me wonder how your code even worked in the first
>  place), nor can you call free() on it.

nm, i'm retarded.  i'd forgotten about the strdup() at the end.
-mike


_______________________________________________
uClibc mailing list
uClibc@...
http://lists.busybox.net/mailman/listinfo/uclibc

signature.asc (853 bytes) Download Attachment

Re: Quick and dirty malloc() support for realpath.

by Mike Frysinger :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Tuesday 27 October 2009 15:44:42 Rob Landley wrote:

> On Tuesday 27 October 2009 05:51:05 Bernhard Reutner-Fischer wrote:
> > On Tue, Oct 27, 2009 at 08:54:50AM +0100, Ricard Wanderlof wrote:
> > >On Mon, 26 Oct 2009, Rob Landley wrote:
> > >>... Also, in my experience _Bool is about as real-world useful as
> > >>the bit field notation with the colons, and is really there to keep
> > >>the language pedants and the c++ guys happy without actually
> > >>accomplishing much.  I've never seen it actually produce better
> > >>code.
> > >
> > >It can produce more readable, less error-prone C code though. We use
> >
> > $ size libc/stdlib/realpath.o*
> >    text   data    bss    dec    hex filename
> >     555      0      0    555    22b libc/stdlib/realpath.os.oorig
> >     735      0      0    735    2df libc/stdlib/realpath.os.rob
> >     586      0      0    586    24a libc/stdlib/realpath.os
> >
> > perhaps that would do as well and doesn't cost 180b (!) but about 31b..
>
> The problem is that leaks memory if realpath ever returns failure, such as
>  the lines right after that patch:
our friend goto solves the leak.  stick it in the middle of the file to
maximize short forward/backward jumps and we only get a small increase:

    578       0       0     578     242 libc/stdlib/realpath.os.orig
    609       0       0     609     261 libc/stdlib/realpath.os

diff --git a/libc/stdlib/realpath.c b/libc/stdlib/realpath.c
index 1a00c31..146d1d8 100644
--- a/libc/stdlib/realpath.c
+++ b/libc/stdlib/realpath.c
@@ -47,8 +47,7 @@ char got_path[];
  char copy_path[PATH_MAX];
  /* use user supplied buffer directly - reduces stack usage */
  /* char got_path[PATH_MAX]; */
- char *max_path;
- char *new_path;
+ char *max_path, *new_path, *allocated_path;
  size_t path_len;
  int readlinks = 0;
 #ifdef S_IFLNK
@@ -72,12 +71,13 @@ char got_path[];
  /* Copy so that path is at the end of copy_path[] */
  strcpy(copy_path + (PATH_MAX-1) - path_len, path);
  path = copy_path + (PATH_MAX-1) - path_len;
+ allocated_path = got_path ? NULL : malloc(PATH_MAX);
  max_path = got_path + PATH_MAX - 2; /* points to last non-NUL char */
  new_path = got_path;
  if (*path != '/') {
  /* If it's a relative pathname use getcwd for starters. */
  if (!getcwd(new_path, PATH_MAX - 1))
- return NULL;
+ goto err;
  new_path += strlen(new_path);
  if (new_path[-1] != '/')
  *new_path++ = '/';
@@ -114,6 +114,8 @@ char got_path[];
  while (*path != '\0' && *path != '/') {
  if (new_path > max_path) {
  __set_errno(ENAMETOOLONG);
+ err:
+ free(allocated_path);
  return NULL;
  }
  *new_path++ = *path++;
@@ -122,7 +124,7 @@ char got_path[];
  /* Protect against infinite loops. */
  if (readlinks++ > MAX_READLINKS) {
  __set_errno(ELOOP);
- return NULL;
+ goto err;
  }
  path_len = strlen(path);
  /* See if last (so far) pathname component is a symlink. */
@@ -133,13 +135,13 @@ char got_path[];
  if (link_len < 0) {
  /* EINVAL means the file exists but isn't a symlink. */
  if (errno != EINVAL) {
- return NULL;
+ goto err;
  }
  } else {
  /* Safe sex check. */
  if (path_len + link_len >= PATH_MAX - 2) {
  __set_errno(ENAMETOOLONG);
- return NULL;
+ goto err;
  }
  /* Note: readlink doesn't add the null byte. */
  /* copy_path[link_len] = '\0'; - we don't need it too */
-mike


_______________________________________________
uClibc mailing list
uClibc@...
http://lists.busybox.net/mailman/listinfo/uclibc

signature.asc (853 bytes) Download Attachment

RE: Quick and dirty malloc() support for realpath.

by Peter Kjellerstedt :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

> -----Original Message-----
> From: uclibc-bounces@... [mailto:uclibc-bounces@...] On
> Behalf Of Mike Frysinger
> Sent: den 28 oktober 2009 09:02
> To: uclibc@...
> Subject: Re: Quick and dirty malloc() support for realpath.

[ cut ]

> @@ -114,6 +114,8 @@ char got_path[];
>   while (*path != '\0' && *path != '/') {
>   if (new_path > max_path) {
>   __set_errno(ENAMETOOLONG);
> + err:
> + free(allocated_path);
>   return NULL;

Fore readability, wouldn't it be better to put the three lines
above at the end of the function, and just put another goto err
here? Or is there some other reason to have the error exit path
in the middle of the function?

//Peter
_______________________________________________
uClibc mailing list
uClibc@...
http://lists.busybox.net/mailman/listinfo/uclibc

Re: Quick and dirty malloc() support for realpath.

by Mike Frysinger :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wednesday 28 October 2009 04:57:01 Peter Kjellerstedt wrote:

> From: Mike Frysinger
>
> [ cut ]
>
> > @@ -114,6 +114,8 @@ char got_path[];
> >   while (*path != '\0' && *path != '/') {
> >   if (new_path > max_path) {
> >   __set_errno(ENAMETOOLONG);
> > + err:
> > + free(allocated_path);
> >   return NULL;
>
> Fore readability, wouldn't it be better to put the three lines
> above at the end of the function, and just put another goto err
> here? Or is there some other reason to have the error exit path
> in the middle of the function?
i stated the reason for doing this in the part you "[ cut ]".  it was all of
two sentences :P.
-mike


_______________________________________________
uClibc mailing list
uClibc@...
http://lists.busybox.net/mailman/listinfo/uclibc

signature.asc (853 bytes) Download Attachment

Re: Quick and dirty malloc() support for realpath.

by Bernhard Reutner-Fischer :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wed, Oct 28, 2009 at 05:12:55AM -0400, Mike Frysinger wrote:
>On Wednesday 28 October 2009 04:57:01 Peter Kjellerstedt wrote:
>> From: Mike Frysinger

better but the prototype part is missing (and a note that this
particular func now adheres to SUSv4 base, eventually)
_______________________________________________
uClibc mailing list
uClibc@...
http://lists.busybox.net/mailman/listinfo/uclibc

RE: Quick and dirty malloc() support for realpath.

by Peter Kjellerstedt :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

> -----Original Message-----
> From: Mike Frysinger [mailto:vapier@...]
> Sent: den 28 oktober 2009 10:13
> To: Peter Kjellerstedt
> Cc: uclibc@...
> Subject: Re: Quick and dirty malloc() support for realpath.
>
> On Wednesday 28 October 2009 04:57:01 Peter Kjellerstedt wrote:
> > From: Mike Frysinger
> >
> > [ cut ]
> >
> > > @@ -114,6 +114,8 @@ char got_path[];
> > >   while (*path != '\0' && *path != '/') {
> > >   if (new_path > max_path) {
> > >   __set_errno(ENAMETOOLONG);
> > > + err:
> > > + free(allocated_path);
> > >   return NULL;
> >
> > Fore readability, wouldn't it be better to put the three lines
> > above at the end of the function, and just put another goto err
> > here? Or is there some other reason to have the error exit path
> > in the middle of the function?
>
> i stated the reason for doing this in the part you "[ cut ]".  it was
> all of two sentences :P.
> -mike

Well, it wasn't in the code which explains why I didn't see it. ;)
 
[ Re-add the cut part: ]
> > > our friend goto solves the leak.  stick it in the middle of the file to
> > > maximize short forward/backward jumps and we only get a small increase:

So, in effect it is an architecture specific optimization.
With that kind of optimization, isn't it a sign that the
function is too long and should be split?

In any case, there should be a comment explaining why the exit path
is in the middle of the function, or someone is bound to rearrange
the code for readability at a later stage.

//Peter

_______________________________________________
uClibc mailing list
uClibc@...
http://lists.busybox.net/mailman/listinfo/uclibc

Re: Quick and dirty malloc() support for realpath.

by Rob Landley :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wednesday 28 October 2009 02:47:03 Mike Frysinger wrote:

> On Tuesday 27 October 2009 15:44:42 Rob Landley wrote:
> > On Tuesday 27 October 2009 05:51:05 Bernhard Reutner-Fischer wrote:
> > > On Tue, Oct 27, 2009 at 08:54:50AM +0100, Ricard Wanderlof wrote:
> > > >On Mon, 26 Oct 2009, Rob Landley wrote:
> > > >>... Also, in my experience _Bool is about as real-world useful as
> > > >>the bit field notation with the colons, and is really there to keep
> > > >>the language pedants and the c++ guys happy without actually
> > > >>accomplishing much.  I've never seen it actually produce better
> > > >>code.
> > > >
> > > >It can produce more readable, less error-prone C code though. We use
> > >
> > > $ size libc/stdlib/realpath.o*
> > >    text   data    bss    dec    hex filename
> > >     555      0      0    555    22b libc/stdlib/realpath.os.oorig
> > >     735      0      0    735    2df libc/stdlib/realpath.os.rob
> > >     586      0      0    586    24a libc/stdlib/realpath.os
> > >
> > > perhaps that would do as well and doesn't cost 180b (!) but about 31b..
> >
> > The problem is that leaks memory if realpath ever returns failure, such
> > as the lines right after that patch:
> >
> >         path_len = strlen(path);
> >         if (path_len >= PATH_MAX - 2) {
> >                 __set_errno(ENAMETOOLONG);
> >                 return NULL;
> >         }
> >
> > And at least a half-dozen other "return NULL;" error cases later in the
> > function.
> >
> > Possibly there's some way to not inline the alloca?  This was, as I said,
> > a quick and dirty fix.  Someday we should properly make it handle more
> > than PATH_MAX, but that's a big change and I'm just trying to get
> > software using Linux libc extensions (and now SUSv4) to work.
>
> while the memory leakage needs to be addressed, the answer isnt with
> alloca. the spec states that it must be via malloc(), but even ignoring
> that, it also states that the caller must call free() on the returned
> pointer.

Please read the patch I submitted.

@@ -168,5 +172,6 @@
                new_path--;
        /* Make sure it's null terminated. */
        *new_path = '\0';
+       if (allocated) got_path = strdup(got_path);
        return got_path;
 }

> obviously alloca-ed memory is not valid outside of the return of
> realpath() (which makes me wonder how your code even worked in the first
> place), nor can you call free() on it.

You believed I wasn't aware of this?

> -mike

Rob
--
Latency is more important than throughput. It's that simple. - Linus Torvalds
_______________________________________________
uClibc mailing list
uClibc@...
http://lists.busybox.net/mailman/listinfo/uclibc

Re: Quick and dirty malloc() support for realpath.

by Mike Frysinger :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wednesday 28 October 2009 07:57:50 Bernhard Reutner-Fischer wrote:
> On Wed, Oct 28, 2009 at 05:12:55AM -0400, Mike Frysinger wrote:
> >On Wednesday 28 October 2009 04:57:01 Peter Kjellerstedt wrote:
> >> From: Mike Frysinger
>
> better but the prototype part is missing (and a note that this
> particular func now adheres to SUSv4 base, eventually)

i wasnt proposing dropping your changes, i just wrote things from scratch.  
the point was to highlight the changes.

did you want me to commit then ?
-mike


_______________________________________________
uClibc mailing list
uClibc@...
http://lists.busybox.net/mailman/listinfo/uclibc

signature.asc (853 bytes) Download Attachment
< Prev | 1 - 2 | Next >