NFS

Dear All,

I have been using a setup where my $HOME is located on a NetBSD NFS server (for 10+ years). Every now and then something goes wrong with NFS and all accesses to /home cause the process to enter disk wait (D in ps listing). No matter how long I wait, the disk wait never completes. I can't even shut down and reboot the client system properly because shutdown fails to unmount the NFS partition(s). At the same time, all other NFS mounts from the same server work just fine.

Now that I started running FAM (hoping it would make gimp work better), the famd process couldn't be killed, so shutdown never even got to the point of unmounting disks. The only way to reboot the system was hitting reset - and now it takes 4 hours or so to recalculate raidframe parity.

All I did was launch firefox3 and bang, /home was dead.

Here's how I mount /home in case I'm using wrong options (I have tried many combinations with no luck):

server:/home /home nfs rw,-X,-i,-b,-s,-C,-x16 0 0

So, what's up with NFS? Will it ever be fixed? It's been the same for years and for many NetBSD releases. Am I supposed to be running samba between NetBSD systems?

-jm
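The symptom described above (processes stuck with "D" in the ps listing) can be spotted with a small script. This is only a sketch: the function names are made up for illustration, and it assumes that the local ps accepts the BSD-style "-axo pid,stat,comm" arguments and prints one header line first.

```python
import subprocess


def find_disk_wait(ps_output):
    """Return (pid, stat, command) for every process whose state starts with 'D'."""
    stuck = []
    for line in ps_output.splitlines()[1:]:  # skip the header line
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue  # ignore blank or truncated lines
        pid, stat, command = fields
        if stat.startswith("D"):
            stuck.append((int(pid), stat, command))
    return stuck


def run_ps():
    # Assumption: this ps understands "-axo pid,stat,comm".
    result = subprocess.run(
        ["ps", "-axo", "pid,stat,comm"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling find_disk_wait(run_ps()) on the client would list the candidates. It only reports processes in disk wait; it cannot tell whether NFS is the cause.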
|
|
Re: NFS

On Tue, Jun 02, 2009 at 02:10:06PM +0300, Jukka Marin wrote:
> So, what's up with NFS? Will it ever be fixed?

Is there a PR for your issue?

Some things immediately spring to mind:

 - are you using IPF on the client? There was a fragment handling bug...
 - are you using amd(8)? There is something wrong in that area...
 - have you tried TCP instead of UDP mounts or vice versa? Some driver
   bugs apparently can be worked around this way
 - have you tried reducing write and read block size (like -r1024 -w1024)?
   This sometimes helps broken drivers.
 - what ethernet driver are you using?

Martin
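Two of the suggestions above (switching to TCP, and reducing the block sizes with -r1024 -w1024) amount to editing the option field of the fstab entry. A minimal sketch of producing such trial variants follows. The helper name with_options is hypothetical, the -T flag is the TCP option mentioned later in this thread, and the script does not check whether the remaining options still make sense together.

```python
def with_options(fstab_line, drop=(), add=()):
    """Return fstab_line with some mount options removed and others appended."""
    fields = fstab_line.split()
    if len(fields) < 4:
        raise ValueError("expected at least four fstab fields")
    # The fourth field holds the comma-separated mount options.
    options = [o for o in fields[3].split(",") if o not in drop]
    options += [o for o in add if o not in options]
    fields[3] = ",".join(options)
    return " ".join(fields)


# The entry from the original post.
original = "server:/home /home nfs rw,-X,-i,-b,-s,-C,-x16 0 0"

# Variant 1: add the TCP flag.
tcp_variant = with_options(original, add=["-T"])

# Variant 2: smaller read and write block sizes.
small_blocks = with_options(original, add=["-r1024", "-w1024"])
```

Changing one thing per trial keeps it clear which option made a difference.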
|
|
Re: NFS

On Tue, Jun 02, 2009 at 01:23:59PM +0200, Martin Husemann wrote:
> On Tue, Jun 02, 2009 at 02:10:06PM +0300, Jukka Marin wrote:
> > So, what's up with NFS? Will it ever be fixed?

Thanks for the reply.

> Is there a PR for your issue?

I haven't filed one, no.

> Some things immediately spring to mind:
>
>  - are you using IPF on the client? There was a fragment handling bug...

No.

>  - are you using amd(8)? There is something wrong in that area...

No. (I never figured out how to use it ;)

>  - have you tried TCP instead of UDP mounts or vice versa? Some driver
>    bugs apparently can be worked around this way

I can try that. (I think I have tried that before, though..)

>  - have you tried reducing write and read block size (like -r1024 -w1024)?
>    This sometimes helps broken drivers.

No. I can try that, too. (I'm using a local /home atm, though, to be able to get some work done..)

>  - what ethernet driver are you using?

I'm using ale now, but I have had similar problems with nfe and rtk (although not as soon as today with ale).

-jm
|
|
Re: NFS

Hi,

> (for 10+ years). Every now and then something goes wrong with NFS and
> all accesses to /home cause the process to enter disk wait (D in ps listing)
|
|
|
|
|
Re: NFS

On Tue, Jun 02, 2009 at 07:06:50PM +0200, Michai Ramakers wrote:
> both client- and server-machines are NetBSD 4.0_STABLE; my mount
> options happen to be rw,-T,-i,-l,-r=65536,-w=65536

I'm using -T now, so far so good (doesn't prove anything yet ;-)

-jm
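The two option strings posted so far use slightly different spellings for flag values ("-x16" in one, "-r=65536" in the other). A small sketch of splitting such a field into plain words and single-letter flags, so two setups can be compared, is shown below. The function names are invented for this example, and the parser only mirrors the two spellings seen in this thread; it is not a full description of what mount_nfs accepts.

```python
def parse_options(option_field):
    """Split an fstab option field into plain words and single-letter flags."""
    words, flags = [], {}
    for opt in option_field.split(","):
        if opt.startswith("-") and len(opt) >= 2:
            # "-x16" and "-r=65536" both become letter + value.
            letter, value = opt[1], opt[2:].lstrip("=")
            flags[letter] = value or None
        else:
            words.append(opt)
    return words, flags


def only_in_first(first, second):
    """Flag letters present in the first option field but not in the second."""
    return sorted(set(parse_options(first)[1]) - set(parse_options(second)[1]))


jukka = "rw,-X,-i,-b,-s,-C,-x16"
michai = "rw,-T,-i,-l,-r=65536,-w=65536"
```

only_in_first(jukka, michai) and only_in_first(michai, jukka) then give the flags worth looking at first when one setup hangs and the other does not.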
|
|
Re: NFS

At Tue, 2 Jun 2009 21:15:29 +0300, Jukka Marin <jmarin@...> wrote:
Subject: Re: NFS
>
> On Tue, Jun 02, 2009 at 07:06:50PM +0200, Michai Ramakers wrote:
> > both client- and server-machines are NetBSD 4.0_STABLE; my mount
> > options happen to be rw,-T,-i,-l,-r=65536,-w=65536
>
> I'm using -T now, so far so good (doesn't prove anything yet ;-)

I've run NFS for various combinations of NetBSD machines for a long time now too, and so long as nothing goes wrong with the server (e.g. it needing a reboot for unrelated reasons) I've generally not had many problems. My systems have been either sparc, alpha, or i386 (and sun3 a very long time ago). While older releases (including 1.6.x) have generally had more networking problems overall, 4.x has been quite good.

The only thing I really don't like about NetBSD NFS is the complete lack of client-side kernel file locking support, even though there's apparently been an implementation ready to go for several years now (locking can incite many new issues though, as I've discovered when trying to use it from a Mac OS X client -- because the Finder.app is way over-zealous with locking I've had to disable use of NFS locking when mounting my NetBSD-served home directory onto my Mac).

> server:/home /home nfs rw,-X,-i,-b,-s,-C,-x16 0 0

My home directories are mounted with "-b,-i,rw,nodev,nosuid", though on my disk-less clients I use "rw,nosuid,nodev" and just "rw" for the root filesystem. I.e. I don't use the "-b" and especially not the "-i" options on critical filesystems, e.g. where executables live, etc.

If I were you I would first get rid of "-X" -- it's not listed as stable and I've never even tried it. It may be a nice idea in theory, but....

I have considered using soft mounts ("-s") for some systems as well, especially the non-critical side of systems with cross-mounted partitions, but as yet I have not experimented with them, and it may even be that they don't work right either. Try without.

You'll note that I don't use "-r" or "-w". As the manual page says you primarily only want to use larger values for UDP mount points when "netstat -s" is showing "fragments dropped after timeout" growing on the client and/or server.

I've experimented briefly with TCP mounts, but on my local ethernet I've never found them to be necessary. IIRC, I did even have some problems with TCP mounts behaving more weirdly when systems had to be rebooted for unrelated reasons.

BTW, I _never_ use AMD any more either -- besides just being generally buggy and fragile in my experience, it is completely antithetical to the way I prefer to administer shared filesystems.

-- 
Greg A. Woods
Planix, Inc.
<woods@...> +1 416 218-0099 http://www.planix.com/
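Whether the "fragments dropped after timeout" counter is growing can be checked by comparing two saved copies of the "netstat -s" output taken some time apart. The sketch below assumes the counter appears as a line of the form "<number> fragments dropped after timeout"; the exact layout may differ between systems, and the function names are made up for illustration.

```python
import re

# Assumed line format: leading whitespace, a count, then the counter name.
PATTERN = re.compile(
    r"^\s*(\d+)\s+fragments? dropped after timeout", re.MULTILINE
)


def fragments_dropped(netstat_output):
    """Return the first matching counter value, or None if it is absent."""
    match = PATTERN.search(netstat_output)
    return int(match.group(1)) if match else None


def growth(before_output, after_output):
    """Difference between two samples, or None if either lacks the counter."""
    before = fragments_dropped(before_output)
    after = fragments_dropped(after_output)
    if before is None or after is None:
        return None
    return after - before
```

Only the first matching line is used, so output that repeats the counter for several protocols would need a narrower pattern.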
|
|
Re: NFS

On Tue, Jun 02, 2009 at 04:47:08PM +0300, Jukka Marin wrote:
> > - are you using amd(8)? There is something wrong in that area...
>
> No. (I never figured out how to use it ;)

rm is the best way :-)

FWIW, several of my machines have NFS-mounted homedirs and I've not seen a problem in quite a while. One likely difference is that in my case the server is a NetApp filer... what happens on the server when things go sour?

-- 
David A. Holland
dholland@...
|
|
Re: NFS

At 16:14 Uhr -0400 02.06.2009, Greg A. Woods wrote:
>If I were you I would first get rid of "-X" -- it's not listed as stable
>and I've never even tried it. It may be a nice idea in theory, but....

Huh? Are you sure you got that option right? Binaries that run in Linux emulation will fail when they access an NFS share otherwise, as I can confirm first-hand.

hauke

-- 
The ASCII Ribbon Campaign            Hauke Fath
()  No HTML/RTF in email             Institut für Nachrichtentechnik
/\  No Word docs in email            TU Darmstadt
    Respect for open standards       Ruf +49-6151-16-3281
|
|
Re: NFS

At 6:50 Uhr +0000 03.06.2009, David Holland wrote:
>On Tue, Jun 02, 2009 at 04:47:08PM +0300, Jukka Marin wrote:
> > > - are you using amd(8)? There is something wrong in that area...
> >
> > No. (I never figured out how to use it ;)
>
>rm is the best way :-)

At 16:14 Uhr -0400 02.06.2009, Greg A. Woods wrote:
>BTW, I _never_ use AMD any more either -- besides just being generally
>buggy and fragile in my experience,

Ahh, nothing like a good bit of FUD in the early morning.

Amd just works here on > 50 machines, and has so for years. YMMV, obviously.

hauke

-- 
The ASCII Ribbon Campaign            Hauke Fath
()  No HTML/RTF in email             Institut für Nachrichtentechnik
/\  No Word docs in email            TU Darmstadt
    Respect for open standards       Ruf +49-6151-16-3281
|
|
Re: NFS

On Wed, Jun 03, 2009 at 10:11:20AM +0200, Hauke Fath wrote:
> At 16:14 Uhr -0400 02.06.2009, Greg A. Woods wrote:
> >If I were you I would first get rid of "-X" -- it's not listed as stable
> >and I've never even tried it. It may be a nice idea in theory, but....
>
> Huh? Are you sure you got that option right? Binaries that run in
> Linux emulation will fail when they access an NFS share otherwise, as
> I can confirm first-hand.

Yep, that's why I'm using -X.

-jm
|
|
Re: NFS

On Wed, Jun 03, 2009 at 12:03:03PM +0300, Jukka Marin wrote:
> On Wed, Jun 03, 2009 at 10:11:20AM +0200, Hauke Fath wrote:
> > At 16:14 Uhr -0400 02.06.2009, Greg A. Woods wrote:
> > >If I were you I would first get rid of "-X" -- it's not listed as stable
> > >and I've never even tried it. It may be a nice idea in theory, but....
> >
> > Huh? Are you sure you got that option right? Binaries that run in
> > Linux emulation will fail when they access an NFS share otherwise, as
> > I can confirm first-hand.
>
> Yep, that's why I'm using -X.

That option never caused a problem for me.

My NetBSD NFS servers work very well for me, definitely much better than the Linux NFS server at work. The only problem I have at the moment is caused by amd(8) under high load (see PR bin/41259). That does however cause random failures to access files and not hangs.

Kind regards

-- 
Matthias Scheler http://zhadum.org.uk/
|
|
Re: NFS

On Wed, Jun 03, 2009 at 12:11:10PM +0100, Matthias Scheler wrote:
> > Yep, that's why I'm using -X.
>
> That option never caused a problem for me.
>
> My NetBSD NFS servers work very well for me, definitely much better
> than the Linux NFS server at work.

I don't think there's a problem with the server. I think it's the client that is having problems. In fact, no matter what happens to the server or the network cable, the switches or whatever, the NFS client should never become unusable. (Well, as long as /, /usr, /var and other essential things are on a local disk.)

Sure, if the network dies, you will no longer be able to access the remote disks, but you should still be able to kill off the processes trying to do so. You should still be able to unmount the remote disk (you may lose some unwritten data, sure). And you should still be able to shut down and reboot the client system.

At the moment, if NFS dies (for whatever reason), the client is pretty much useless, all processes even thinking of touching the NFS disk are dead, a graceful reboot is impossible, etc.

I was using UDP with NFS because I thought it would get over a server reboot better than TCP. I'm not sure about this (my server has been quite reliable so far). I have had problems with clients when the server goes down, but this was years ago.

-jm
|
|
Re: NFS

On Wed, Jun 03, 2009 at 05:32:09PM +0300, Jukka Marin wrote:
> On Wed, Jun 03, 2009 at 12:11:10PM +0100, Matthias Scheler wrote:
> > > Yep, that's why I'm using -X.
> >
> > That option never caused a problem for me.
> >
> > My NetBSD NFS servers work very well for me, definitely much better
> > than the Linux NFS server at work.
>
> I don't think there's a problem with the server. I think it's the client
> that is having problems. In fact, no matter what happens to the server
> or the network cable, the switches or whatever, the NFS client should
> never become unusable. (Well, as long as /, /usr, /var and other
> essential things are on a local disk.)
>
> Sure, if the network dies, you will no longer be able to access the remote
> disks, but you should still be able to kill off the processes trying to
> do so.

Did you mount with '-o intr'? If not, it's expected that a process trying to access an inaccessible NFS mount is unkillable.

Also if your /home is on NFS it's expected that most user processes will hang on it. I found that a lot of things want to access $HOME (including the shells) for good or bad reasons ...

-- 
Manuel Bouyer, LIP6, Universite Paris VI. Manuel.Bouyer@...
NetBSD: 26 years of experience will always make the difference
--
|
|
Re: NFS

On Wed, Jun 03, 2009 at 05:32:09PM +0300, Jukka Marin wrote:
> I was using UDP with NFS because I thought it would get over a server
> reboot better than TCP.

My experience is that TCP is actually better for detecting server reboots, at least between a NetBSD NFS server and a NetBSD NFS client.

Kind regards

-- 
Matthias Scheler http://zhadum.org.uk/
|
|
Re: NFS

On Wed, Jun 03, 2009 at 04:52:53PM +0200, Manuel Bouyer wrote:
> Did you mount with '-o intr'? If not, it's expected that a process
> trying to access an inaccessible NFS mount is unkillable.

I'm using -i but it doesn't seem to help (too much).

> Also if your /home is on NFS it's expected that most user processes
> will hang on it. I found that a lot of things want to access $HOME (including
> the shells) for good or bad reasons ...

I have noticed... :-(

-jm
|
|
Re: NFS

On Wed, Jun 03, 2009 at 07:02:12PM +0300, Jukka Marin wrote:
> On Wed, Jun 03, 2009 at 04:52:53PM +0200, Manuel Bouyer wrote:
> > Did you mount with '-o intr'? If not, it's expected that a process
> > trying to access an inaccessible NFS mount is unkillable.
>
> I'm using -i but it doesn't seem to help (too much).

I personally don't think that this is a good idea. I e.g. don't want an editor to get an error if the NFS server is unreachable and throw away an hour's worth of changes as a result.

The "hanging NFS mount" feature was designed so that applications don't have to deal with such problems. And most of them don't.

Kind regards

-- 
Matthias Scheler http://zhadum.org.uk/
|
|
Re: NFS

On Wed, Jun 03, 2009 at 05:16:18PM +0100, Matthias Scheler wrote:
> On Wed, Jun 03, 2009 at 07:02:12PM +0300, Jukka Marin wrote:
> > On Wed, Jun 03, 2009 at 04:52:53PM +0200, Manuel Bouyer wrote:
> > > Did you mount with '-o intr'? If not, it's expected that a process
> > > trying to access an inaccessible NFS mount is unkillable.
> >
> > I'm using -i but it doesn't seem to help (too much).
>
> I personally don't think that this is a good idea. I e.g. don't want an
> editor to get an error if the NFS server is unreachable and throw away
> an hour's worth of changes as a result.

It's not the same as soft mounts. With -i you don't get an error if the NFS server is unreachable, but the process is still killable (of course you lose your editor's buffer, just as if you kill -9 it while working on a local file ...)

-- 
Manuel Bouyer, LIP6, Universite Paris VI. Manuel.Bouyer@...
NetBSD: 26 years of experience will always make the difference
--
|
|
Re: NFS

At Wed, 3 Jun 2009 10:11:20 +0200, Hauke Fath <hf@...> wrote:
Subject: Re: NFS
>
> At 16:14 Uhr -0400 02.06.2009, Greg A. Woods wrote:
> >If I were you I would first get rid of "-X" -- it's not listed as stable
> >and I've never even tried it. It may be a nice idea in theory, but....
>
> Huh? Are you sure you got that option right?

Oops, sorry, yes I got something mixed up. In fact I must have accidentally read a completely different manual page at the wrong moment because I can't even find anything related in mount_nfs(8) now.

-- 
Greg A. Woods
Planix, Inc.
<woods@...> +1 416 218-0099 http://www.planix.com/
|
|
Re: NFS & AMD

At Wed, 3 Jun 2009 10:18:03 +0200, Hauke Fath <hf@...> wrote:
Subject: Re: NFS
>
> At 6:50 Uhr +0000 03.06.2009, David Holland wrote:
> >On Tue, Jun 02, 2009 at 04:47:08PM +0300, Jukka Marin wrote:
> > > > - are you using amd(8)? There is something wrong in that area...
> > >
> > > No. (I never figured out how to use it ;)
> >
> >rm is the best way :-)
>
> At 16:14 Uhr -0400 02.06.2009, Greg A. Woods wrote:
> >BTW, I _never_ use AMD any more either -- besides just being generally
> >buggy and fragile in my experience,
>
> Ahh, nothing like a good bit of FUD in the early morning.
>
> Amd just works here on > 50 machines, and has so for years. YMMV, obviously.

[...] actually provide some of the benefits its proponents claim for it. I have an even poorer understanding of how it can interact badly with various situations and cause problems.

However in particular I find it impossible to believe that it will make NFS appear to be more reliable for even one or a very few mounts on one client host. More often than not the problems were caused by AMD, and going to fixed NFS mounts solved those problems. Sure it can refuse to mount filesystems from servers that are not healthy and it can unmount unused filesystems after some time, but that doesn't really help with actual reliability. It can also apparently use replicated servers dynamically, but that's not part of the common claim about better reliability for just one mount from one server. It's a large amount of quite complex code that I feel is best to avoid using if possible.

I'll also admit though that my past experiences with AMD were not with NetBSD clients (or servers).

Perhaps some of my bad experiences with AMD were due more to mis-configuration -- after all it is commonly said that AMD can be very difficult to configure correctly. All the times I have encountered AMD, I was not responsible for its use or operation. Some problems I remember, such as it unmounting filesystems which actually were in use, don't sound like they could be caused by mis-configuration though.

Further I'll admit I've never built any network with a complex enough mess of file servers to ever face any problems with manually maintaining client system fstab entries. Even with very large networks I've been able to keep the number of NFS servers and mounts to a minimum. I also know to avoid using NFS on mobile clients. :-)

I'd also say too that there are far simpler ways to deal with automatic mounting of things like removable media on workstations and the like.

My biggest problem with NFS without AMD are the times when I mount filesystems from non-production machines and then don't remember to unmount them when the machine is down. Even then though the biggest problems are with operations that _should_ be interruptible (when the mount was done with "-i"), but which are not, most notably "df". Hmmm.... doesn't seem like there's a PR open for that problem.

-- 
Greg A. Woods
Planix, Inc.
<woods@...> +1 416 218-0099 http://www.planix.com/