FreeBSD 7.2 + NFS + nullfs + unlink + fstat = Stale NFS File Handle

by Linda Messerschmidt

We have encountered some odd behavior when NFS and nullfs are
combined and a program creates, unlinks, and then fstats a file in
the resulting directory.

After encountering this problem in the wild, I wrote a quick little C
program.  It creates a file, unlinks the file, and then calls fstat on
the still-open file descriptor.  The results:

UFS: OK
NFS: OK
UFS+NULLFS: OK
NFS+NULLFS: fstat returns ESTALE

The UFS test is just run in /tmp.  All others are run in /mnt (with
umounts in between).

The NFS setup looks like this:

client# mount -o tcp server:/export/example /mnt

The UFS+NULLFS setup looks like this:

client# mount -t nullfs /tmp /mnt

The NFS+NULLFS mount setup looks like this:

client# mount -o tcp server:/export /export
client# mount -t nullfs /export/example /mnt

As far as I understand, this behavior should be supported, and the
file should be "finally" deleted when the descriptor is closed.

Even so, this is obscure enough that the response would be "don't do
that" except that the open-unlink-fstat behavior was encountered with
/usr/bin/vi, which does exactly that on its temporary files.  So, we
either fix it or retool everything not to use nullfs.

Does anyone know what the likely source of this different behavior is,
and whether it is feasible to address?  Or is NFS+NULLFS just pushing
the envelope a little too far?

This is on 7.2-RELEASE-p1 and 7.2-STABLE.

Thanks for any advice!

Test program source below:

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

int main() {
        int fd;
        struct stat st;

        fd = open("testfile", O_RDWR | O_CREAT | O_TRUNC, 0666);
        if (!fd) {
                fprintf(stderr, "open failed: %s\n",strerror(errno));
                return 10;
        }
        if (unlink("testfile")) {
                fprintf(stderr, "unlink failed: %s\n",strerror(errno));
                return 20;
        }
        if (fstat(fd, &st)) {
                fprintf(stderr, "fstat failed: %s\n",strerror(errno));
                return 30;
        }
        close(fd);
        return 0;
}
_______________________________________________
freebsd-hackers@... mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@..."

Re: FreeBSD 7.2 + NFS + nullfs + unlink + fstat = Stale NFS File Handle

by Andrey Simonenko

On Tue, Oct 27, 2009 at 01:13:18PM -0400, Linda Messerschmidt wrote:
>
> Does anyone know what the likely source of this different behavior is,
> and whether it is feasible to address?  Or is NFS+NULLFS just pushing
> the envelope a little too far?

As I understand it, when a file is opened in NULLFS its vnode gets a new
reference on its 'count of users', but this new reference is not propagated
to the lower vnode (the vnode that is under NULLFS).  When a file is removed,
NULLFS passes this op to the lower FS (NFS in this example), and that
FS sees that its vnode has only a single reference on 'count of users'.

In the case of NFS, when there is a request to remove a vnode it checks the
value of 'count of users' for this vnode.  If this count is equal to 1,
then the NFS client code does 'RPC remove'.  If this count is greater than 1
(for example when a file is opened), then the NFS client code renames the
pathname to .nfs-file, but does not send 'RPC remove' to the NFS server.
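
The decision described above can be sketched as a small userland model.
This is only an illustration of the explanation in this message, not
kernel code: struct toy_vnode, toy_open() and toy_remove() are
hypothetical names, and the starting count of 1 is a simplifying
assumption standing in for whatever reference is held while the remove
is processed.

```c
#include <stddef.h>

/* Toy stand-in for a vnode; only the 'count of users' is modelled. */
struct toy_vnode {
        int usecount;             /* references on this vnode */
        struct toy_vnode *lower;  /* vnode under a nullfs-like layer, or NULL */
};

enum toy_action { TOY_RPC_REMOVE, TOY_RENAME_TO_DOT_NFS };

/* Opening a file adds a reference to the vnode that was opened,
 * and (per the explanation above) not to any vnode beneath it. */
static void toy_open(struct toy_vnode *vp)
{
        vp->usecount++;
}

/* The remove is passed down to the bottom filesystem, which decides
 * using only the count on its own vnode. */
static enum toy_action toy_remove(const struct toy_vnode *vp)
{
        while (vp->lower != NULL)
                vp = vp->lower;
        return vp->usecount == 1 ? TOY_RPC_REMOVE : TOY_RENAME_TO_DOT_NFS;
}
```

With a plain NFS vnode starting at usecount 1, toy_open() followed by
toy_remove() yields TOY_RENAME_TO_DOT_NFS.  With an upper vnode stacked
on a lower one, toy_open() on the upper vnode leaves the lower count at
1, so toy_remove() yields TOY_RPC_REMOVE, which matches the ESTALE seen
on the later fstat.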

>
> fd = open("testfile", O_RDWR | O_CREAT | O_TRUNC, 0666);
> if (!fd) {
           ^^^^^ should be (fd < 0)
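
For reference, a version of the test with that check corrected might
look like the sketch below.  It is written as a function rather than a
complete program; the name check_unlinked_fstat() and the path parameter
are not from the original and are only there to make it easy to call
from a small main().  The return codes 10/20/30 follow the original.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create path, unlink it, then fstat the still-open descriptor.
 * Returns 0 on success, or 10/20/30 for open/unlink/fstat failure. */
int check_unlinked_fstat(const char *path)
{
        struct stat st;
        int fd;

        fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0666);
        if (fd < 0) {   /* open() returns -1 on error; 0 is a valid fd */
                fprintf(stderr, "open failed: %s\n", strerror(errno));
                return 10;
        }
        if (unlink(path) != 0) {
                fprintf(stderr, "unlink failed: %s\n", strerror(errno));
                close(fd);
                return 20;
        }
        if (fstat(fd, &st) != 0) {
                fprintf(stderr, "fstat failed: %s\n", strerror(errno));
                close(fd);
                return 30;
        }
        close(fd);
        return 0;
}
```

Calling check_unlinked_fstat("testfile") from main() and returning its
result reproduces the original test.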

Re: FreeBSD 7.2 + NFS + nullfs + unlink + fstat = Stale NFS File Handle

by Linda Messerschmidt

On Wed, Oct 28, 2009 at 8:48 AM, Andrey Simonenko
<simon@...> wrote:

> As I understand when a file is opened in NULLFS its vnode gets new
> reference on 'count of users', but this new reference is not propagated
> to the lower vnode (vnode that is under NULLFS).  When a file is removed
> NULLFS passes this op to the lower FS (NFS in this example) and that
> FS sees that its vnode has only a single reference on 'count of users'.
>
> In case of NFS when there is a request to remove a vnode it checks that
> value of 'count of users' for this vnode.  If this count is equal to 1,
> then NFS client code does 'RPC remove'.  If this count is greater than 1
> (for example when a file is opened), then NFS client code renames pathname
> to .nfs-file, but does not send 'RPC remove' to the NFS server.

That sounds like a pretty reasonable explanation of what's going on.

Unfortunately it does sound like this would be tough to fix.  Since
NFS deletes are a special case, short of making an NFS-aware nullfs,
which seems silly, it sounds like the "solution" would be rewriting
nullfs to propagate the reference count.  I don't know enough about
nullfs to know exactly how hard that would be, but I'm guessing it's
not the work of a lazy afternoon. :-)

>>       if (!fd) {
>           ^^^^^ should be (fd < 0)

Oops, you are right!

Thanks very much!

Re: FreeBSD 7.2 + NFS + nullfs + unlink + fstat = Stale NFS File Handle

by Gleb Kurtsou-3

On (28/10/2009 09:48), Linda Messerschmidt wrote:

> On Wed, Oct 28, 2009 at 8:48 AM, Andrey Simonenko
> <simon@...> wrote:
> > As I understand when a file is opened in NULLFS its vnode gets new
> > reference on 'count of users', but this new reference is not propagated
> > to the lower vnode (vnode that is under NULLFS).  When a file is removed
> > NULLFS passes this op to the lower FS (NFS in this example) and that
> > FS sees that its vnode has only a single reference on 'count of users'.
> >
> > In case of NFS when there is a request to remove a vnode it checks that
> > value of 'count of users' for this vnode.  If this count is equal to 1,
> > then NFS client code does 'RPC remove'.  If this count is greater than 1
> > (for example when a file is opened), then NFS client code renames pathname
> > to .nfs-file, but does not send 'RPC remove' to the NFS server.
>
> That sounds like a pretty reasonable explanation of what's going on.
>
> Unfortunately it does sound like this would be tough to fix.  Since
> NFS deletes are a special case, short of making an NFS-aware nullfs,
> which seems silly, it sounds like the "solution" would be rewriting
> nullfs to propagate the reference count.  I don't know enough about
> nullfs to know exactly how hard that would be, but I'm guessing it's
> not the work of a lazy afternoon. :-)

I don't think this is about nullfs. Nullfs behaves correctly. It grabs
a reference for a vnode and even tries to release it as soon as possible.
The NFS logic is wrong here, imho. vrefcnt(vp) == 1 is supposed to mean
that the vnode is going to be reclaimed on the next reference release and
nothing more. And it doesn't mean that the reference is going to be
released any time soon. Other network filesystems in the tree abuse it
the same way NFS does, though.

Probably what NFS tries to do is to check whether a file descriptor is
open for the vnode. This assumption holds only for a single code path out
of many: a syscall by the user. But there is plenty of code invoking vnode
operations without even allocating a file descriptor.
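
A toy illustration of that point, assuming a simplified model in which
references are tallied by who holds them.  The real vnode keeps only
the total; struct toy_refs and the helper names below are hypothetical
and exist only to show why the total is a poor proxy for "a descriptor
is open".

```c
#include <stdbool.h>

/* Hypothetical breakdown of who holds references on one vnode. */
struct toy_refs {
        int from_open_files;   /* held because a descriptor is open */
        int from_kernel_code;  /* held by layers, lookups, other in-kernel users */
};

/* What a plain reference count reports: only the sum. */
static int toy_vrefcnt(const struct toy_refs *r)
{
        return r->from_open_files + r->from_kernel_code;
}

/* The heuristic under discussion: more than one reference means "open". */
static bool toy_guess_is_open(const struct toy_refs *r)
{
        return toy_vrefcnt(r) > 1;
}

/* The question the heuristic is trying to answer. */
static bool toy_really_open(const struct toy_refs *r)
{
        return r->from_open_files > 0;
}
```

With two in-kernel references and no open descriptor, the guess says
"open" while the real answer is "not open"; the two functions agree only
when every extra reference happens to come from an open file.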

Propagating per-file-descriptor reference counting into the filesystem
itself is a layer violation and should be avoided, in my opinion. There
should be a way to fix NFS by replacing the vrefcnt check.

> >>       if (!fd) {
> >           ^^^^^ should be (fd < 0)
>
> Oops, you are right!
>
> Thanks very much!