[bug #25294] assertion failure on dangling symlink to //

View: New views
16 Messages — Rating Filter:   Alert me  

[bug #25294] assertion failure on dangling symlink to //

by Mario Castelán Castro :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


URL:
  <http://savannah.gnu.org/bugs/?25294>

                 Summary: assertion failure on dangling symlink to //
                 Project: findutils
            Submitted by: ericb
            Submitted on: Sat 10 Jan 2009 04:44:14 PM MST
                Category: find
                Severity: 3 - Normal
              Item Group: Wrong result
                  Status: None
                 Privacy: Public
             Assigned to: None
         Originator Name: Eric Blake
        Originator Email: ebb9@...
             Open/Closed: Open
         Discussion Lock: Any
                 Release: 4.5.3
           Fixed Release: None

    _______________________________________________________

Details:

On systems where d_type is valid, find is triggering an assertion (I'm not
sure if this assertion also requires that // be distinct from /, in which case
the assertion may be specific to cygwin 1.7, since earlier cygwin lack d_type
and other systems don't have distinct //):

$ mkdir example
$ ln -s /nowhere example/n
$ find -L example
example
example/n

but the following asserts:

$ rm example/n
$ ln -s //nowhere example/n
$ find -L example
example
assertion "state.type != 0" failed: file
"/usr/src/findutils-4.5.3-1/src/findutils-4.5.3/find/ftsfind.c", line 475,
function: consider_visiting
Aborted (core dumped)


Corinna Vinschen provided this analysis:

The assertion is basically

  if (ent->fts_info == FTS_NSOK || ent->fts_info == FTS_NS)
    assert (state.type != 0);

state.type is set in the calling function find() like this:

  while ( (ent=fts_read(p)) != NULL )
    {
      state.have_type = !!ent->fts_statp->st_mode;
      state.type = state.have_type ? ent->fts_statp->st_mode : 0;
    }

which is a bug, AFAICS.  The reason is that per the fts_read man page
the value of ent->fts_statp is undefined if ent->fts_info is FTS_NSOK
or FTS_NS.  So the values of state.have_type and consequentially
state.type are undefined as well and the above assertion makes no sense.




    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?25294>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/





[bug #25294] assertion failure on dangling symlink to //

by Mario Castelán Castro :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Follow-up Comment #1, bug #25294 (project findutils):

More info:

The bug first surfaced in commit acb82fe, Nov 30 2008, at the point when Jim
provided a patch to gnulib fts to do fewer stat's when d_type is reliable.

The problem starts when readdir recognizes that it is too expensive to decide
what the type of a broken symlink is (at least on cygwin), and returns
DT_UNKNOWN (defined as 0).  Next, fts calls fts_stat, which fails because the
symlink is broken, but on cygwin, a symlink to "//nowhere" fails with the
nonstandard errno==ENOSHARE rather than the more typical ENOENT, so no lstat
is attempted to see if the file might be a dangling symlink.

So the bug might be in fts.c, line 1540, for not recognizing all cases of bad
symlinks (even ignoring cygwin's nonstandard ENOSHARE, what happens for a
looping symlink that returns ELOOP?).  Would a readlink() be better than
lstat() to determine that a symlink exists but can't be followed?


    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?25294>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/





Re: [bug #25294] assertion failure on dangling symlink to //

by Eric Blake :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

According to Eric Blake on 1/10/2009 6:58 PM:
> Follow-up Comment #1, bug #25294 (project findutils):
>
> More info:
>
> The bug first surfaced in commit acb82fe, Nov 30 2008, at the point when Jim
> provided a patch to gnulib fts to do fewer stat's when d_type is reliable.

Jim, you may want to take a look at this assertion in find, caused by your
improvements to gnulib fts to avoid excess stat's on new enough cygwin.

>
> The problem starts when readdir recognizes that it is too expensive to decide
> what the type of a broken symlink is (at least on cygwin), and returns
> DT_UNKNOWN (defined as 0).  Next, fts calls fts_stat, which fails because the
> symlink is broken, but on cygwin, a symlink to "//nowhere" fails with the
> nonstandard errno==ENOSHARE rather than the more typical ENOENT, so no lstat
> is attempted to see if the file might be a dangling symlink.
>
> So the bug might be in fts.c, line 1540, for not recognizing all cases of bad
> symlinks (even ignoring cygwin's nonstandard ENOSHARE, what happens for a
> looping symlink that returns ELOOP?).  Would a readlink() be better than
> lstat() to determine that a symlink exists but can't be followed?

Currently, the ELOOP case triggers fts_info of FTS_NS rather than
FTS_SLNONE.  On seeing FTS_NS, find's consider_visiting calls
symlink_loop, which performs a stat, but does not adjust state.have_stat
to record this fact, leading to potentially duplicated stat's.

- --
Don't work too hard, make some time for fun as well!

Eric Blake             ebb9@...
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAklpWZoACgkQ84KuGfSFAYBXqACdHkhzXQlFP9GKMSSDhC5cvIDj
+6IAoKc7JSeDuUDPXwUtd/lP59j8o6Uv
=bczA
-----END PGP SIGNATURE-----



[bug #25294] assertion failure on dangling symlink to //

by Mario Castelán Castro :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Follow-up Comment #2, bug #25294 (project findutils):

The assertion failure can also be reproduced with a loop:

$ mkdir example
$ ln -s loop example/loop
$ find -L example
example
find: `example/loop': Too many levels of symbolic links
assertion "state.type != 0" failed: file "ftsfind.c", line 475, function:
consider_visiting
Aborted (core dumped)

As I see it, this only depends on having a system with d_type support, but on
a file system where readdir returns DT_UNKNOWN on at least dangling symlinks.
Therefore, I think that it should be possible to find a file system under
Linux that can reproduce this failure, rather than relying on the
cygwin-specific behavior in stat'ing a dangling symlink to "//nowhere".


    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?25294>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/





Re: [bug #25294] assertion failure on dangling symlink to //

by Jim Meyering :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Eric Blake <ebb9@...> wrote:

> According to Eric Blake on 1/10/2009 6:58 PM:
>> Follow-up Comment #1, bug #25294 (project findutils):
>>
>> More info:
>>
>> The bug first surfaced in commit acb82fe, Nov 30 2008, at the point when Jim
>> provided a patch to gnulib fts to do fewer stat's when d_type is reliable.
>
> Jim, you may want to take a look at this assertion in find, caused by your
> improvements to gnulib fts to avoid excess stat's on new enough cygwin.
>
>>
>> The problem starts when readdir recognizes that it is too expensive to decide
>> what the type of a broken symlink is (at least on cygwin), and returns
>> DT_UNKNOWN (defined as 0).  Next, fts calls fts_stat, which fails because the
>> symlink is broken, but on cygwin, a symlink to "//nowhere" fails with the
>> nonstandard errno==ENOSHARE rather than the more typical ENOENT, so no lstat
>> is attempted to see if the file might be a dangling symlink.
>>
>> So the bug might be in fts.c, line 1540, for not recognizing all cases of bad
>> symlinks (even ignoring cygwin's nonstandard ENOSHARE, what happens for a
>> looping symlink that returns ELOOP?).  Would a readlink() be better than
>> lstat() to determine that a symlink exists but can't be followed?
>
> Currently, the ELOOP case triggers fts_info of FTS_NS rather than
> FTS_SLNONE.  On seeing FTS_NS, find's consider_visiting calls
> symlink_loop, which performs a stat, but does not adjust state.have_stat
> to record this fact, leading to potentially duplicated stat's.

I tried your "find -L dir-containing-loop" example
on ext3, tmpfs, and xfs, and it appears d_type is always DT_LNK,
since the command didn't trigger a failed assertion:

    $ mkdir e && ln -s loop e/loop && find -L e
    e
    find: `e/loop': Too many levels of symbolic links
    [Exit 1]

so what do you think about changing the test from

                        if (errno == ENOENT

to e.g.,

                        if (FAILED_STAT_ERRNO_MAY_INDICATE_SYMLINK (errno)

Where the macro is defined like this:

#ifdef ENOSHARE
# define FAILED_STAT_ERRNO_MAY_INDICATE_SYMLINK(Errno) \
    ((Errno) == ENOENT || (Errno) == ENOSHARE)
#else
# define FAILED_STAT_ERRNO_MAY_INDICATE_SYMLINK(Errno) ((Errno) == ENOENT)
#endif

I checked a few Linux/GNU systems and found no ENOSHARE definition.
I.e,. how about this patch?


From e56aaaa61c09cb915e2a3f24f9824be23ecd03fc Mon Sep 17 00:00:00 2001
From: Jim Meyering <meyering@...>
Date: Sun, 11 Jan 2009 22:57:37 +0100
Subject: [PATCH] fts: avoid assertion failure on Cygwin with find -L vs symlink loop

* lib/fts.c (FAILED_STAT_ERRNO_MAY_INDICATE_SYMLINK): Define.
(fts_stat): Use it.
Report and diagnosis by Eric Blake.
Details in http://savannah.gnu.org/bugs/?25294.
---
 lib/fts.c |   13 +++++++++++--
 1 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/lib/fts.c b/lib/fts.c
index 836c179..c3dc855 100644
--- a/lib/fts.c
+++ b/lib/fts.c
@@ -1,6 +1,6 @@
 /* Traverse a file hierarchy.

-   Copyright (C) 2004, 2005, 2006, 2007, 2008 Free Software Foundation, Inc.
+   Copyright (C) 2004-2009 Free Software Foundation, Inc.

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
@@ -130,6 +130,15 @@ enum
 # define D_INO(dp) NOT_AN_INODE_NUMBER
 #endif

+/* Applying Cygwin's stat to certain symlinks can evoke failure with a
+   nonstandard errno value of ENOSHARE.  Work around it. */
+#ifdef ENOSHARE
+# define FAILED_STAT_ERRNO_MAY_INDICATE_SYMLINK(Errno) \
+    ((Errno) == ENOENT || (Errno) == ENOSHARE)
+#else
+# define FAILED_STAT_ERRNO_MAY_INDICATE_SYMLINK(Errno) ((Errno) == ENOENT)
+#endif
+
 /* If there are more than this many entries in a directory,
    and the conditions mentioned below are satisfied, then sort
    the entries on inode number before any further processing.  */
@@ -1544,7 +1553,7 @@ fts_stat(FTS *sp, register FTSENT *p, bool follow)
  if (ISSET(FTS_LOGICAL) || follow) {
  if (stat(p->fts_accpath, sbp)) {
  saved_errno = errno;
- if (errno == ENOENT
+ if (FAILED_STAT_ERRNO_MAY_INDICATE_SYMLINK (errno)
     && lstat(p->fts_accpath, sbp) == 0) {
  __set_errno (0);
  return (FTS_SLNONE);
--
1.6.1.155.g1b01da



Re: [bug #25294] assertion failure on dangling symlink to //

by Eric Blake :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

According to Jim Meyering on 1/11/2009 3:06 PM:
> I tried your "find -L dir-containing-loop" example
> on ext3, tmpfs, and xfs, and it appears d_type is always DT_LNK,

Is there any file system on Linux where you can always get DT_UNKNOWN?  If
not, I'll keep playing with it on the cygwin side of things.

> so what do you think about changing the test to e.g.,
>
> if (FAILED_STAT_ERRNO_MAY_INDICATE_SYMLINK (errno)
>
> Where the macro is defined like this:
>
> #ifdef ENOSHARE
> # define FAILED_STAT_ERRNO_MAY_INDICATE_SYMLINK(Errno) \
>     ((Errno) == ENOENT || (Errno) == ENOSHARE)
> #else
> # define FAILED_STAT_ERRNO_MAY_INDICATE_SYMLINK(Errno) ((Errno) == ENOENT)
> #endif

I already did something like that patch that on my copy of findutils on
cygwin; the //noshare bug went away, but the assertion failure on a loop
still remains.  But this particular patch seems like overkill, now that
cygwin no longer returns ENOSHARE (as of this morning, the developers
agreed that ENOSHARE doesn't make sense any more; it predates the time
when cygwin had support for //server syntax):

http://cygwin.com/ml/cygwin/2009-01/msg00284.html
http://cygwin.com/ml/cygwin-cvs/2009-q1/msg00014.html

>
> I checked a few Linux/GNU systems and found no ENOSHARE definition.
> I.e,. how about this patch?

Let's not bother with it.  Cygwin 1.5.x is immune (no d_type support), and
since cygwin 1.7.0 is not yet a stable release, and has fixed the ENOSHARE
issue already, there are no stable cygwin releases that need the workaround.

On the other hand, what did you think of my idea of using readlink()
instead of lstat() to determine if a file is a symlink after stat() fails
(and when d_type is unavailable or DT_UNKNOWN)?  Is that any more
reliable?  Or would the slight semantics changes introduced by treating
things like ELOOP as FTS_SLNONE instead of FTS_NS be an issue?

- --
Don't work too hard, make some time for fun as well!

Eric Blake             ebb9@...
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAklqdPQACgkQ84KuGfSFAYCN/wCeIlmEdlflfgHBpCTbOVLefdeY
24gAn2RP4dpeP/cY96/0FNenJBmlGlwe
=A4BQ
-----END PGP SIGNATURE-----



Re: [bug #25294] assertion failure on dangling symlink to //

by Jim Meyering :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Eric Blake <ebb9@...> wrote:
> According to Jim Meyering on 1/11/2009 3:06 PM:
>> I tried your "find -L dir-containing-loop" example
>> on ext3, tmpfs, and xfs, and it appears d_type is always DT_LNK,
>
> Is there any file system on Linux where you can always get DT_UNKNOWN?  If
> not, I'll keep playing with it on the cygwin side of things.

Sure.  reiserfs (3 at least) has no d_type information.

>> so what do you think about changing the test to e.g.,
>>
>> if (FAILED_STAT_ERRNO_MAY_INDICATE_SYMLINK (errno)
>>
>> Where the macro is defined like this:
>>
>> #ifdef ENOSHARE
>> # define FAILED_STAT_ERRNO_MAY_INDICATE_SYMLINK(Errno) \
>>     ((Errno) == ENOENT || (Errno) == ENOSHARE)
>> #else
>> # define FAILED_STAT_ERRNO_MAY_INDICATE_SYMLINK(Errno) ((Errno) == ENOENT)
>> #endif
>
> I already did something like that patch that on my copy of findutils on
> cygwin; the //noshare bug went away, but the assertion failure on a loop
> still remains.  But this particular patch seems like overkill, now that
> cygwin no longer returns ENOSHARE (as of this morning, the developers

Good!

> agreed that ENOSHARE doesn't make sense any more; it predates the time
> when cygwin had support for //server syntax):
>
> http://cygwin.com/ml/cygwin/2009-01/msg00284.html
> http://cygwin.com/ml/cygwin-cvs/2009-q1/msg00014.html
>
>> I checked a few Linux/GNU systems and found no ENOSHARE definition.
>> I.e,. how about this patch?
>
> Let's not bother with it.  Cygwin 1.5.x is immune (no d_type support), and
> since cygwin 1.7.0 is not yet a stable release, and has fixed the ENOSHARE
> issue already, there are no stable cygwin releases that need the workaround.

All the better.

> On the other hand, what did you think of my idea of using readlink()
> instead of lstat() to determine if a file is a symlink after stat() fails
> (and when d_type is unavailable or DT_UNKNOWN)?  Is that any more

Documented semantics of FTS_SLNONE require that fts_statp point
to valid stat data.

> reliable?  Or would the slight semantics changes introduced by treating
> things like ELOOP as FTS_SLNONE instead of FTS_NS be an issue?

It might be.  That change would make at least tests/du/long-sloop
fail in coreutils' test suite.



Re: [bug #25294] assertion failure on dangling symlink to //

by James Youngman-5 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Sun, Jan 11, 2009 at 10:38 PM, Eric Blake <ebb9@...> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> According to Jim Meyering on 1/11/2009 3:06 PM:
>> I tried your "find -L dir-containing-loop" example
>> on ext3, tmpfs, and xfs, and it appears d_type is always DT_LNK,
>
> Is there any file system on Linux where you can always get DT_UNKNOWN?  If
> not, I'll keep playing with it on the cygwin side of things.

Some types of files in AFS can permanently give that result, I
understand, due to the slightly different nature of permissions in
AFS.

James.



[bug #25294] assertion failure on dangling symlink to //

by Mario Castelán Castro :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Follow-up Comment #3, bug #25294 (project findutils):

I would have expected ent->fts_info == FTS_SLNONE for the loop case.  That
implies that the comment on line 472 could well be misleading.


I modified fts.c to always set d_type to DT_UNKNOWN but was not able to
demonstrate a problem.


    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?25294>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/




[bug #25294] assertion failure on dangling symlink to //

by Mario Castelán Castro :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Follow-up Comment #4, bug #25294 (project findutils):

Check-in b445af98c22cd2d13673e2699a77ab728a7073b0 changed this code a bit, as
did 0b1acd3358466b02f32baf9423665113dc933492.

Is the problem still reproducible from the git master branch (currently at
89da37ea3b25fd7c95d831a6f319068c99c7406b)?

    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?25294>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/




[bug #25294] assertion failure on dangling symlink to //

by Mario Castelán Castro :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Update of bug #25294 (project findutils):

                  Status:                    None => Need Info              
             Assigned to:                    None => ericb                  


    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?25294>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/




[bug #25294] assertion failure on dangling symlink to //

by Mario Castelán Castro :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Update of bug #25294 (project findutils):

                  Status:               Need Info => Fixed                  
             Open/Closed:                    Open => Closed                
           Fixed Release:                    None => 4.5.4                  

    _______________________________________________________

Follow-up Comment #5:

Both examples now avoid assertion failures on cygwin with stock findutils
4.5.4, so I'm closing the report as fixed:

$ mkdir example
$ ln -s /nowhere example/n
$ find -L example
example
example/n
$ rm example/n
$ ln -s //nowhere example/n
$ find -L example
example
example/n
$ rm example/n
$ ln -s loop example/loop
$ find -L example
example
find: `example/loop': Too many levels of symbolic links
$


    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?25294>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/




[bug #25294] assertion failure on dangling symlink to //

by Mario Castelán Castro :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Additional Item Attachment, bug #25294 (project findutils):

File name: firefox.trace.log              Size:61 KB
File name: vacuum_mozillas.sh             Size:0 KB


    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?25294>

_______________________________________________
  Messaggio inviato con/da Savannah
  http://savannah.gnu.org/




[bug #25294] assertion failure on dangling symlink to //

by Mario Castelán Castro :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Follow-up Comment #6, bug #25294 (project findutils):

please check out the attachments
the bash script is the one that fails. the log is the strace log

    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?25294>

_______________________________________________
  Messaggio inviato con/da Savannah
  http://savannah.gnu.org/




[bug #25294] assertion failure on dangling symlink to //

by Mario Castelán Castro :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Follow-up Comment #7, bug #25294 (project findutils):

federico, please indicate what version of find you are using.  Also, can you
proproduce the problem with a script which creates a suitable example?  (I
don't have your ~/.mozilla directory here!)



I'm suspicious of this part of the system call trace:

--- SIGCHLD (Child exited) @ 0 (0) ---
newfstatat(7, "cookies.sqlite-journal", 0x131ab00, AT_SYMLINK_NOFOLLOW) = -1
ENOENT (No such file or directory)
lstat("cookies.sqlite-journal", 0x7fff7880b330) = -1 ENOENT (No such file or
directory)
open("/usr/share/locale/locale.alias", O_RDONLY) = 6


... because I'm not convinced that fd 7 corresponds to the working directory,
so the lstat could fail or return the wrong information.

My guess is that the lstat call is inside symlink_loop() which is called at
line 462 of ftsfind.c (i.e. for the FTS_NS case).


    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?25294>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/




[bug #25294] assertion failure on dangling symlink to //

by Mario Castelán Castro :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Follow-up Comment #8, bug #25294 (project findutils):

the version is 4.5.4

I'm sorry but I can't create a reproducible example but mine

What I've noted while trying was that it /seems/ to have something to do with
disk cache (as it stops failing after a couple of times but it fails again if
you echo 3 > /p/s/vm/drop_caches) AND that executing a similar script on
/usr/lib/, executing ldd on a subset of .so, DID NOT fail, so that it could be
related to the fact that, when my script runs, it modifies the files it
matches (by vacuuming sqlite databases)
just speculations, though

    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?25294>

_______________________________________________
  Messaggio inviato con/da Savannah
  http://savannah.gnu.org/