NFS


NFS

by Jukka Marin

Dear All,

I have been using a setup where my $HOME is located on a NetBSD NFS server
(for 10+ years).  Every now and then something goes wrong with NFS and
all accesses to /home cause the process to enter disk wait (D in ps listing).
No matter how long I wait, the disk wait never completes.  I can't even
shut down and reboot the client system properly because shutdown fails
to unmount the NFS partition(s).  At the same time, all other NFS mounts
from the same server work just fine.

Now that I started running FAM (hoping it would make gimp work better),
the famd process couldn't be killed, so shutdown never even got to the
point of unmounting disks.

The only way to reboot the system was hitting reset - and now it takes
4 hours or so to recalculate raidframe parity.

All I did was launch firefox3 and bang, /home was dead.

Here's how I mount /home in case I'm using wrong options (I have tried
many combinations with no luck):

server:/home    /home   nfs     rw,-X,-i,-b,-s,-C,-x16  0 0

So, what's up with NFS?  Will it ever be fixed?  It's been the same
for years and for many NetBSD releases.  Am I supposed to be running
samba between NetBSD systems?

  -jm

Re: NFS

by Martin Husemann

On Tue, Jun 02, 2009 at 02:10:06PM +0300, Jukka Marin wrote:
> So, what's up with NFS?  Will it ever be fixed?

Is there a PR for your issue?
Some things immediately spring to mind:

 - are you using IPF on the client? There was a fragment handling bug...
 - are you using amd(8)? There is something wrong in that area...
 - have you tried TCP instead of UDP mounts or vice versa? Some driver
   bugs apparently can be worked around this way
 - have you tried reducing write and read block size (like -r1024 -w1024)?
   This sometimes helps broken drivers.
 - what Ethernet driver are you using?

Martin
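
One way to work through the transport and block-size suggestions above methodically is to write out each combination as a candidate fstab entry and try them one at a time. This is only a minimal sketch: it assumes the `server:/home` export from the original post and that `-T`, `-r` and `-w` behave as described in mount_nfs(8). The helper name `candidate_fstab_lines` and the `rw,-i,-b` base options are illustrative choices, not something recommended in the thread.

```python
from itertools import product


def candidate_fstab_lines(export="server:/home", mountpoint="/home",
                          base=("rw", "-i", "-b"), sizes=(None, 1024)):
    """Return one fstab line per (transport, block size) combination."""
    lines = []
    for use_tcp, size in product((False, True), sizes):
        opts = list(base)
        if use_tcp:
            opts.append("-T")  # TCP mount; the original post's options have no -T
        if size is not None:
            # Reduced read and write sizes, as in the -r1024 -w1024 suggestion
            opts += [f"-r{size}", f"-w{size}"]
        lines.append(f"{export}\t{mountpoint}\tnfs\t{','.join(opts)}\t0 0")
    return lines


for line in candidate_fstab_lines():
    print(line)
```

Changing one variable per trial makes it easier to tell whether the transport or the block size is what makes a difference.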

Re: NFS

by Jukka Marin

On Tue, Jun 02, 2009 at 01:23:59PM +0200, Martin Husemann wrote:
> On Tue, Jun 02, 2009 at 02:10:06PM +0300, Jukka Marin wrote:
> > So, what's up with NFS?  Will it ever be fixed?

Thanks for the reply.

> Is there a PR for your issue?

I haven't filed one, no.

> Some things immediately spring to mind:
>
>  - are you using IPF on the client? There was a fragment handling bug...

No.

>  - are you using amd(8)? There is something wrong in that area...

No.  (I never figured out how to use it ;)

>  - have you tried TCP instead of UDP mounts or vice versa? Some driver
>    bugs apparently can be worked around this way

I can try that.  (I think I have tried that before, though..)

>  - have you tried reducing write and read block size (like -r1024 -w1024)?
>    This sometimes helps broken drivers.

No.  I can try that, too.  (I'm using a local /home atm, though, to
be able to get some work done..)

>  - what Ethernet driver are you using?

I'm using ale now, but I have had similar problems with nfe and rtk
(although not as soon as today with ale).

  -jm

Re: NFS

by m.ramakers

Hi,

> (for 10+ years).  Every now and then something goes wrong with NFS and
> all accesses to /home cause the process to enter disk wait (D in ps listing)

Re: NFS

by m.ramakers

Hi,

> > using TCP-mounts solved it
> > instantly. If you want I could paste my mount-options.
>
> Yes, I'm interested in seeing your mount options.  It's annoying to
> have your desktop system become unusable in the middle of a busy day..
> (In fact, I hate having to reboot systems at all times ;)

both client- and server-machines are NetBSD 4.0_STABLE; my mount
options happen to be   rw,-T,-i,-l,-r=65536,-w=65536

good luck,
Michai
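
For readers comparing the two option strings posted so far, the sketch below splits an fstab option field and pairs each NFS flag with a short description. The descriptions are paraphrased from mount_nfs(8) as I understand it and should be checked against the manual page for your release. The names `FLAG_MEANINGS` and `describe` are made up for this example.

```python
import re

# Paraphrased from mount_nfs(8); verify against the page on your system.
FLAG_MEANINGS = {
    "-T": "use TCP transport",
    "-i": "interruptible: blocked calls fail with EINTR on a termination signal",
    "-l": "use the NFSv3 ReaddirPlus RPC",
    "-b": "keep retrying the mount in the background if the first attempt fails",
    "-s": "soft mount: calls fail after the retransmit limit",
    "-C": "connect(2) the socket for UDP mounts",
    "-X": "32/64-bit directory cookie translation (for emulated binaries)",
    "-x": "retransmit count for soft mounts",
    "-r": "read size in bytes",
    "-w": "write size in bytes",
}


def describe(options):
    """Split an fstab option string into (flag, value, meaning) tuples."""
    result = []
    for opt in options.split(","):
        # Accept both "-x16" and "-r=65536" spellings of a flag with a value.
        m = re.fullmatch(r"(-[A-Za-z])=?(\d*)", opt)
        if m is None:
            result.append((opt, None, "generic mount option"))
            continue
        flag, value = m.group(1), m.group(2) or None
        result.append((flag, value, FLAG_MEANINGS.get(flag, "not in this table")))
    return result


for entry in describe("rw,-T,-i,-l,-r=65536,-w=65536"):
    print(entry)
```

Running it on `rw,-X,-i,-b,-s,-C,-x16` from the first message gives the matching breakdown for the UDP setup.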

Re: NFS

by Jukka Marin

On Tue, Jun 02, 2009 at 07:06:50PM +0200, Michai Ramakers wrote:
> both client- and server-machines are NetBSD 4.0_STABLE; my mount
> options happen to be   rw,-T,-i,-l,-r=65536,-w=65536

I'm using -T now, so far so good (doesn't prove anything yet ;-)

  -jm

Re: NFS

by Greg A. Woods

At Tue, 2 Jun 2009 21:15:29 +0300, Jukka Marin <jmarin@...> wrote:
Subject: Re: NFS
>
> On Tue, Jun 02, 2009 at 07:06:50PM +0200, Michai Ramakers wrote:
> > both client- and server-machines are NetBSD 4.0_STABLE; my mount
> > options happen to be   rw,-T,-i,-l,-r=65536,-w=65536
>
> I'm using -T now, so far so good (doesn't prove anything yet ;-)

I've run NFS for various combinations of NetBSD machines for a long time
now too, and so long as nothing goes wrong with the server (e.g. it
needing a reboot for unrelated reasons) I've generally not had many
problems.  My systems have been either sparc, alpha, or i386 (and sun3 a
very long time ago).  While older releases (including 1.6.x) have
generally had more networking problems overall, 4.x has been quite good.
The only thing I really don't like about NetBSD NFS is the complete lack
of client-side kernel file locking support, even though there's
apparently been an implementation ready to go for several years now
(locking can incite many new issues though, as I've discovered when
trying to use it from a Mac OS X client -- because the Finder.app is way
over-zealous with locking I've had to disable use of NFS locking when
mounting my NetBSD-served home directory onto my Mac).

> server:/home    /home   nfs     rw,-X,-i,-b,-s,-C,-x16  0 0

My home directories are mounted with "-b,-i,rw,nodev,nosuid", though on
my disk-less clients I use "rw,nosuid,nodev" and just "rw" for the root
filesystem.  I.e. I don't use the "-b" and especially not the "-i"
options on critical filesystems, e.g. where executables live, etc.

If I were you I would first get rid of "-X" -- it's not listed as stable
and I've never even tried it.  It may be a nice idea in theory, but....

I have considered using soft mounts ("-s") for some systems as well,
especially the non-critical side of systems with cross-mounted
partitions, but as yet I have not experimented with them, and it may
even be that they don't work right either.  Try without.

You'll note that I don't use "-r" or "-w".  As the manual page says, you
primarily want to set them (to smaller values) for UDP mount points when
"netstat -s" is showing "fragments dropped after timeout" growing on the
client and/or server.
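
The counter mentioned here can be sampled from a script so that growth over time is easy to spot. The sketch below is hedged: it assumes `netstat -s` prints a line of the form `N fragments dropped after timeout`, and it returns only the first such line if the output contains more than one. The function names are hypothetical.

```python
import re
import subprocess


def fragments_dropped_after_timeout(netstat_output):
    """Return the first 'fragments dropped after timeout' count, or None."""
    m = re.search(r"^[ \t]*(\d+) fragments? dropped after timeout",
                  netstat_output, re.MULTILINE)
    return int(m.group(1)) if m else None


def current_count():
    """Run `netstat -s` and extract the counter (requires netstat on PATH)."""
    out = subprocess.run(["netstat", "-s"], capture_output=True,
                         text=True, check=True).stdout
    return fragments_dropped_after_timeout(out)
```

Calling `current_count()` before and after a period of heavy NFS use, on both client and server, shows whether the value is actually increasing.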

I've experimented briefly with TCP mounts, but on my local ethernet I've
never found them to be necessary.  IIRC, I did even have some problems
with TCP mounts behaving more weirdly when systems had to be rebooted
for unrelated reasons.

BTW, I _never_ use AMD any more either -- besides just being generally
buggy and fragile in my experience, it is completely antithetical to the
way I prefer to administer shared filesystems.

--
                                                Greg A. Woods
                                                Planix, Inc.

<woods@...>       +1 416 218-0099        http://www.planix.com/



Re: NFS

by David Holland

On Tue, Jun 02, 2009 at 04:47:08PM +0300, Jukka Marin wrote:
 > >  - are you using amd(8)? There is something wrong in that area...
 >
 > No.  (I never figured out how to use it ;)

rm is the best way :-)

FWIW, several of my machines have NFS-mounted homedirs and I've not
seen a problem in quite a while. One likely difference is that in my
case the server is a NetApp filer... what happens on the server when
things go sour?

--
David A. Holland
dholland@...

Re: NFS

by Hauke Fath

At 16:14 Uhr -0400 02.06.2009, Greg A. Woods wrote:
>If I were you I would first get rid of "-X" -- it's not listed as stable
>and I've never even tried it.  It may be a nice idea in theory, but....

Huh? Are you sure you got that option right? Binaries that run in
Linux emulation will fail when they access an NFS share otherwise, as
I can confirm first-hand.

        hauke

--
      The ASCII Ribbon Campaign                    Hauke Fath
()     No HTML/RTF in email            Institut für Nachrichtentechnik
/\     No Word docs in email                     TU Darmstadt
      Respect for open standards              Ruf +49-6151-16-3281

Re: NFS

by Hauke Fath

At 6:50 Uhr +0000 03.06.2009, David Holland wrote:
>On Tue, Jun 02, 2009 at 04:47:08PM +0300, Jukka Marin wrote:
>  > >  - are you using amd(8)? There is something wrong in that area...
>  >
>  > No.  (I never figured out how to use it ;)
>
>rm is the best way :-)

At 16:14 Uhr -0400 02.06.2009, Greg A. Woods wrote:
>BTW, I _never_ use AMD any more either -- besides just being generally
>buggy and fragile in my experience,

Ahh, nothing like a good bit of FUD in the early morning.

Amd just works here on > 50 machines, and has so for years. YMMV, obviously.

        hauke

--
      The ASCII Ribbon Campaign                    Hauke Fath
()     No HTML/RTF in email            Institut für Nachrichtentechnik
/\     No Word docs in email                     TU Darmstadt
      Respect for open standards              Ruf +49-6151-16-3281

Re: NFS

by Jukka Marin

On Wed, Jun 03, 2009 at 10:11:20AM +0200, Hauke Fath wrote:
> At 16:14 Uhr -0400 02.06.2009, Greg A. Woods wrote:
> >If I were you I would first get rid of "-X" -- it's not listed as stable
> >and I've never even tried it.  It may be a nice idea in theory, but....
>
> Huh? Are you sure you got that option right? Binaries that run in
> Linux emulation will fail when they access an NFS share otherwise, as
> I can confirm first-hand.

Yep, that's why I'm using -X.

  -jm

Re: NFS

by Matthias Scheler

On Wed, Jun 03, 2009 at 12:03:03PM +0300, Jukka Marin wrote:

> On Wed, Jun 03, 2009 at 10:11:20AM +0200, Hauke Fath wrote:
> > At 16:14 Uhr -0400 02.06.2009, Greg A. Woods wrote:
> > >If I were you I would first get rid of "-X" -- it's not listed as stable
> > >and I've never even tried it.  It may be a nice idea in theory, but....
> >
> > Huh? Are you sure you got that option right? Binaries that run in
> > Linux emulation will fail when they access an NFS share otherwise, as
> > I can confirm first-hand.
>
> Yep, that's why I'm using -X.

That option never caused a problem for me.

My NetBSD NFS servers work very well for me, definitely much better
than the Linux NFS server at work.

The only problem I have at the moment is caused by amd(8) under high
load (see PR bin/41259). It does, however, cause random failures
to access files, not hangs.

        Kind regards

--
Matthias Scheler                                  http://zhadum.org.uk/

Re: NFS

by Jukka Marin

On Wed, Jun 03, 2009 at 12:11:10PM +0100, Matthias Scheler wrote:
> > Yep, that's why I'm using -X.
>
> That option never caused a problem for me.
>
> My NetBSD NFS servers work very well for me, definitely much better
> than the Linux NFS server at work.

I don't think there's a problem with the server.  I think it's the client
that is having problems.  In fact, no matter what happens to the server
or the network cable, the switches or whatever, the NFS client should
never become unusable.  (Well, as long as /, /usr, /var and other
essential things are on a local disk.)

Sure, if the network dies, you will no longer be able to access the remote
disks, but you should still be able to kill off the processes trying to
do so.  You should still be able to unmount the remote disk (you may lose
some unwritten data, sure).  And you should still be able to shutdown and
reboot the client system.

At the moment, if NFS dies (for whatever the reason), the client is
pretty much useless, all processes even thinking of touching the NFS
disk are dead, a graceful reboot is impossible, etc.

I was using UDP with NFS because I thought it would get over a server
reboot better than TCP.  I'm not sure about this (my server has been
quite reliable so far).  I have had problems with clients when the server
goes down, but this was years ago.

  -jm

Re: NFS

by Manuel Bouyer

On Wed, Jun 03, 2009 at 05:32:09PM +0300, Jukka Marin wrote:

> On Wed, Jun 03, 2009 at 12:11:10PM +0100, Matthias Scheler wrote:
> > > Yep, that's why I'm using -X.
> >
> > That option never caused a problem for me.
> >
> > My NetBSD NFS servers work very well for me, definitely much better
> > than the Linux NFS server at work.
>
> I don't think there's a problem with the server.  I think it's the client
> that is having problems.  In fact, no matter what happens to the server
> or the network cable, the switches or whatever, the NFS client should
> never become unusable.  (Well, as long as /, /usr, /var and other
> essential things are on a local disk.)
>
> Sure, if the network dies, you will no longer be able to access the remote
> disks, but you should still be able to kill off the processes trying to
> do so.


Did you mount with '-o intr'? If not, it's expected that a process
trying to access an inaccessible NFS mount is unkillable.

Also, if your /home is on NFS it's expected that most user processes
will hang on it. I found that a lot of things want to access $HOME (including
the shells) for good or bad reasons ...

--
Manuel Bouyer, LIP6, Universite Paris VI.           Manuel.Bouyer@...
     NetBSD: 26 years of experience will always make the difference
--

Re: NFS

by Matthias Scheler

On Wed, Jun 03, 2009 at 05:32:09PM +0300, Jukka Marin wrote:
> I was using UDP with NFS because I thought it would get over a server
> reboot better than TCP.

My experience is that TCP is actually better for detecting server reboots,
at least between a NetBSD NFS server and a NetBSD NFS client.

        Kind regards

--
Matthias Scheler                                  http://zhadum.org.uk/

Re: NFS

by Jukka Marin

On Wed, Jun 03, 2009 at 04:52:53PM +0200, Manuel Bouyer wrote:
> Did you mount with '-o intr'? If not, it's expected that a process
> trying to access an inaccessible NFS mount is unkillable.

I'm using -i but it doesn't seem to help (too much).

> Also, if your /home is on NFS it's expected that most user processes
> will hang on it. I found that a lot of things want to access $HOME (including
> the shells) for good or bad reasons ...

I have noticed... :-(

  -jm

Re: NFS

by Matthias Scheler

On Wed, Jun 03, 2009 at 07:02:12PM +0300, Jukka Marin wrote:
> On Wed, Jun 03, 2009 at 04:52:53PM +0200, Manuel Bouyer wrote:
> > Did you mount with '-o intr'? If not, it's expected that a process
> > trying to access an inaccessible NFS mount is unkillable.
>
> I'm using -i but it doesn't seem to help (too much).

I personally don't think that this is a good idea. For example, I don't want an
editor to get an error if the NFS server is unreachable and throw away
an hour's worth of changes as a result.

The "hanging NFS mount" feature was designed so that applications
don't have to deal with such problems. And most of them don't.

        Kind regards

--
Matthias Scheler                                  http://zhadum.org.uk/

Re: NFS

by Manuel Bouyer

On Wed, Jun 03, 2009 at 05:16:18PM +0100, Matthias Scheler wrote:

> On Wed, Jun 03, 2009 at 07:02:12PM +0300, Jukka Marin wrote:
> > On Wed, Jun 03, 2009 at 04:52:53PM +0200, Manuel Bouyer wrote:
> > > Did you mount with '-o intr'? If not, it's expected that a process
> > > trying to access an inaccessible NFS mount is unkillable.
> >
> > I'm using -i but it doesn't seem to help (too much).
>
> I personally don't think that this is a good idea. For example, I don't want an
> editor to get an error if the NFS server is unreachable and throw away
> an hour's worth of changes as a result.

It's not the same as soft mounts. With -i you don't get an error
if the NFS server is unreachable, but the process is still killable
(of course you lose your editor's buffer, just as if you kill -9 it while
working on a local file ...)

--
Manuel Bouyer, LIP6, Universite Paris VI.           Manuel.Bouyer@...
     NetBSD: 26 years of experience will always make the difference
--
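
The distinction drawn above between hard, interruptible and soft mounts can be summarised as a small decision function. This is a toy model of the behaviour mount_nfs(8) documents for `-i` and `-s`, not of what the client kernel actually does in every case (the original poster reports that `-i` did not help much in practice). The function name and return strings are invented for illustration.

```python
def blocked_call_outcome(soft=False, intr=False, termination_signal=False):
    """Toy model: outcome of a file system call while the server is unreachable."""
    if intr and termination_signal:
        # -i: a termination signal makes the delayed call fail with EINTR
        return "fails with EINTR, so the process can be killed"
    if soft:
        # -s: the call gives up after the retransmit limit and returns an error
        return "fails with an error after the retransmit limit"
    # Default hard mount, or -i with no signal: keep retrying, no error returned
    return "keeps retrying until the server answers"


for kwargs in ({}, {"intr": True}, {"intr": True, "termination_signal": True},
               {"soft": True}):
    print(kwargs, "->", blocked_call_outcome(**kwargs))
```

In this model the editor in the earlier example never sees an error on an `-i` mount unless someone deliberately kills it, whereas on a soft mount the error arrives on its own.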

Re: NFS

by Greg A. Woods

At Wed, 3 Jun 2009 10:11:20 +0200, Hauke Fath <hf@...> wrote:
Subject: Re: NFS
>
> At 16:14 Uhr -0400 02.06.2009, Greg A. Woods wrote:
> >If I were you I would first get rid of "-X" -- it's not listed as stable
> >and I've never even tried it.  It may be a nice idea in theory, but....
>
> Huh? Are you sure you got that option right?

Oops, sorry, yes I got something mixed up.  In fact I must have
accidentally read a completely different manual page at the wrong moment
because I can't even find anything related in mount_nfs(8) now.

--
                                                Greg A. Woods
                                                Planix, Inc.

<woods@...>       +1 416 218-0099        http://www.planix.com/



Re: NFS & AMD

by Greg A. Woods

At Wed, 3 Jun 2009 10:18:03 +0200, Hauke Fath <hf@...> wrote:
Subject: Re: NFS

>
> At 6:50 Uhr +0000 03.06.2009, David Holland wrote:
> >On Tue, Jun 02, 2009 at 04:47:08PM +0300, Jukka Marin wrote:
> >  > >  - are you using amd(8)? There is something wrong in that area...
> >  >
> >  > No.  (I never figured out how to use it ;)
> >
> >rm is the best way :-)
>
> At 16:14 Uhr -0400 02.06.2009, Greg A. Woods wrote:
> >BTW, I _never_ use AMD any more either -- besides just being generally
> >buggy and fragile in my experience,
>
> Ahh, nothing like a good bit of FUD in the early morning.
>
> Amd just works here on > 50 machines, and has so for years. YMMV, obviously.

I'll admit to having a relatively poor understanding of where AMD can
actually provide some of the benefits its proponents claim for it.  I
have an even poorer understanding of how it can interact badly with
various situations and cause problems.

However in particular I find it impossible to believe that it will make
NFS appear to be more reliable for even one or a very few mounts on one
client host.  More often than not the problems were caused by AMD, and
going to fixed NFS mounts solved those problems.  Sure it can refuse to
mount filesystems from servers that are not healthy and it can unmount
unused filesystems after some time, but that doesn't really help with
actual reliability.  It can also apparently use replicated servers
dynamically, but that's not part of the common claim about better
reliability for just one mount from one server.  It's a large amount of
quite complex code that I feel is best to avoid using if possible.

I'll also admit though that my past experiences with AMD were not with
NetBSD clients (or servers).  Perhaps some of my bad experiences with
AMD were due more to mis-configuration -- after all it is commonly said
that AMD can be very difficult to configure correctly.  All the times I
have encountered AMD, I was not responsible for its use or operation.
Some problems I remember, such as it unmounting filesystems which
actually were in use, don't sound like they could be caused by
mis-configuration though.

Further I'll admit I've never built any network with a complex enough
mess of file servers to ever face any problems with manually maintaining
client system fstab entries.  Even with very large networks I've been
able to keep the number of NFS servers and mounts to a minimum.  I also
know to avoid using NFS on mobile clients.  :-)  I'd also say too that
there are far simpler ways to deal with automatic mounting of things
like removable media on workstations and the like.

My biggest problem with NFS without AMD is the times when I mount
filesystems from non-production machines and then don't remember to
unmount them when the machine is down.  Even then though the biggest
problems are with operations that _should_ be interruptible (when the
mount was done with "-i"), but which are not, most notably "df".
Hmmm.... doesn't seem like there's a PR open for that problem.

--
                                                Greg A. Woods
                                                Planix, Inc.

<woods@...>       +1 416 218-0099        http://www.planix.com/

