Odd behavior from amd


by Lars Friend


Hello all,

        I've got a strange problem that has been plaguing me for a while,
and I have spent a good portion of today trying to get to the bottom of it.

        We have a cluster of several machines (NetBSD 3.1 i386 running
release 3.1 GENERIC kernels), each performing different functions,
and we have our home directories mounted via NFS (exported split
between a couple of the servers) by running amd with the same maps on
each server.  This allows a user to log in anywhere and have their
same home directory no matter which machine they are on.  This is
generally very handy, and it works with remarkable stability, until I
go and move a [not logged in] user's home directory from one server
to another (for disk space management reasons, for instance).

        The problem boils down to this:  Every once in a while when I update
the amd maps, amd will catch the change quickly enough, and amq will
reflect the correct change, but the directory where the symlinks live
(which amd implements as a read-only local NFS file system which we mount
on /home) will still have a symlink pointing to the old mapping location.

        For instance, if I have server_a, server_b, server_c, and
server_d where the home directories are mapped like this (on all four hosts):

bob host!=server_a;rhost:=server_a;rfs:=/data/export/home \
        host==server_a;fs:=/data/export/home;type:=link
joe host!=server_b;rhost:=server_b;rfs:=/data/export/home \
        host==server_b;fs:=/data/export/home;type:=link
dave host!=server_b;rhost:=server_b;rfs:=/data/export/home \
        host==server_b;fs:=/data/export/home;type:=link

and I change that map to the following (after copying joe's home
directory from server_b to server_a):

bob host!=server_a;rhost:=server_a;rfs:=/data/export/home \
        host==server_a;fs:=/data/export/home;type:=link
joe host!=server_a;rhost:=server_a;rfs:=/data/export/home \
        host==server_a;fs:=/data/export/home;type:=link
dave host!=server_b;rhost:=server_b;rfs:=/data/export/home \
        host==server_b;fs:=/data/export/home;type:=link
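
Before chasing stale symlinks, it can help to confirm exactly which keys differ between the old and new versions of a map. The following is a minimal sketch, not an amd tool: `parse_map` and `changed_keys` are hypothetical helper names, and the parser only understands the subset of map syntax shown above (one key per entry, whitespace-separated locations, backslash line continuation, `#` comments).

```python
def parse_map(text):
    """Parse amd-style map text into {key: [location, ...]}.

    Assumption: only the simple syntax used in this thread is handled.
    """
    entries = {}
    logical = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.endswith("\\"):
            # Continuation: drop the backslash and keep accumulating.
            logical += line[:-1] + " "
            continue
        logical += line
        key, *locations = logical.split()
        entries[key] = locations
        logical = ""
    return entries


def changed_keys(old_text, new_text):
    """Return the sorted keys whose location lists differ between two maps."""
    old, new = parse_map(old_text), parse_map(new_text)
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))


OLD_MAP = r"""
bob host!=server_a;rhost:=server_a;rfs:=/data/export/home \
        host==server_a;fs:=/data/export/home;type:=link
joe host!=server_b;rhost:=server_b;rfs:=/data/export/home \
        host==server_b;fs:=/data/export/home;type:=link
dave host!=server_b;rhost:=server_b;rfs:=/data/export/home \
        host==server_b;fs:=/data/export/home;type:=link
"""

NEW_MAP = OLD_MAP.replace(
    "joe host!=server_b;rhost:=server_b", "joe host!=server_a;rhost:=server_a"
).replace(
    "host==server_b;fs:=/data/export/home;type:=link\ndave",
    "host==server_a;fs:=/data/export/home;type:=link\ndave",
)

print(changed_keys(OLD_MAP, NEW_MAP))  # prints ['joe']
```

Only the keys reported here need to be checked in /home on each host after the reload interval has passed.
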

After waiting for the map_reload_interval to expire, all four servers
(a,b,c,d) will (when asked with amq) tell me that the mapping has
been picked up and honored, but if I go and look in /home, sometimes
(but not always), the old symlink will still be there:

lrwxrwxrwx  1 root  wheel  21 Feb  8 15:33 /home/joe ->
/.automount/server_b/data/export/home/joe
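
The stale condition shown above can be detected mechanically by reading each symlink and comparing the server component of its target with the server the map now names. This is a rough sketch under stated assumptions: the `/.automount/<server>/...` layout is taken from the listing above, while `link_server`, `stale_links`, and the `expected_server` dictionary are hypothetical names introduced for illustration. Reading a link under an amd mount point may itself trigger a lookup in amd.

```python
import os


def link_server(target, automount_root="/.automount"):
    """Return <server> from a target like /.automount/<server>/..., else None."""
    prefix = automount_root.rstrip("/") + "/"
    if not target.startswith(prefix):
        return None
    return target[len(prefix):].split("/", 1)[0]


def stale_links(home_dir, expected_server, automount_root="/.automount"):
    """List (user, actual_server) pairs whose symlink names the wrong server.

    expected_server maps a user name to the server the current map points at.
    """
    stale = []
    for user, want in sorted(expected_server.items()):
        path = os.path.join(home_dir, user)
        try:
            target = os.readlink(path)
        except OSError:
            continue  # missing entry or not a symlink: nothing to compare
        have = link_server(target, automount_root)
        if have != want:
            stale.append((user, have))
    return stale
```

For the example in this message, `stale_links("/home", {"joe": "server_a"})` would report `("joe", "server_b")` on an affected host and an empty list on a healthy one.
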

        This seems to occur more on servers that have lots of activity in
the other amd mapped home directories, but not reliably on any given
host.  An amq -f will not clear this condition, and it seems that the
only way to get rid of this is to stop and restart amd (which I
obviously don't like to do with 50+ users logged in).

Does anybody have any suggestions?

        Thanks Much,

                -lars


Re: Odd behavior from amd

by Christos Zoulas

In article <200802082042.m18KgVxR003362@...>,
Lars Friend  <lfriend@...> wrote:

>
>Hello all,
>
> I've got a strange problem that has been plaguing me for a while,
>and I have spent a good portion of today trying to get to the bottom of.
>
> We have a cluster of several machines (NetBSD 3.1 i386 running
>release 3.1 GENERIC kernels), each performing different functions,
>and we have our home directories mounted via nfs (exported split
>between a couple of the servers) by running amd with the same maps on
>each server.  This allows a user to log in anywhere and have their
>same home directory no matter which machine they are on.  This is
>generally very handy, and it works with remarkable stability, until I
>go and move a [not logged in] user's home directory from one server
>to another (for disk space management reasons, for instance).
>
> The problem boils down to this:  Every once in a while when I update
>the amd maps, amd will catch the change quickly enough, and amq will
>reflect the correct change, but the directory where the symlinks live
>(which amd implements as a read-only local NFS system which we mount
>on /home) will still have a symlink pointing to the old mapping location.
>
> For instance, if I have server_a, server_b, and server_c, and
>server_d where the home directories are mapped like this (on all four hosts):
>
>bob host!=server_a;rhost:=server_a;rfs:=/data/export/home \
> host==server_a;fs:=/data/export/home;type:=link
>joe host!=server_b;rhost:=server_b;rfs:=/data/export/home \
> host==server_b;fs:=/data/export/home;type:=link
>dave host!=server_b;rhost:=server_b;rfs:=/data/export/home \
> host==server_b;fs:=/data/export/home;type:=link
>
>and I change that map to: (after copying joe's home directory from
>server b to server a)
>
>bob host!=server_a;rhost:=server_a;rfs:=/data/export/home \
> host==server_a;fs:=/data/export/home;type:=link
>joe host!=server_a;rhost:=server_a;rfs:=/data/export/home \
> host==server_a;fs:=/data/export/home;type:=link
>dave host!=server_b;rhost:=server_b;rfs:=/data/export/home \
> host==server_b;fs:=/data/export/home;type:=link
>
>After waiting for the map_reload_interval to expire, all four servers
>(a,b,c,d) will (when asked with amq) tell me that the mapping has
>been picked up and honored, but if I go and look in /home, sometimes
>(but not always), the old symlink will still be there:
>
>lrwxrwxrwx  1 root  wheel  21 Feb  8 15:33 /home/joe ->
>/.automount/server_b/data/export/home/joe
>
> This seems to occur more on servers that have lots of activity in
>the other amd mapped home directories, but not reliably on any given
>host.  An amq -f will not clear this condition, and it seems that the
>only way to get rid of this is to stop and restart amd (which I
>obviously don't like to do with 50+ users logged in).
>
>Does anybody have any suggestions?

Use 'amq -u' to unmount the offending partition before 'amq -f'.

christos


Re: Odd behavior from amd

by Lars Friend

At 03:53 PM 2/8/2008, Christos Zoulas wrote:

(snip)

> >Does anybody have any suggestions?
>
>Use 'amq -u' to unmount the offending partition before 'amq -f'.

I just tried this, and the symlink is still wrong.  However if I ask
amq, it says all is well.

Thanks though,

         -lars


>christos


Re: Odd behavior from amd

by Christos Zoulas

On Feb 8,  4:03pm, lfriend@... (Lars Friend) wrote:
-- Subject: Re: Odd behavior from amd

| At 03:53 PM 2/8/2008, Christos Zoulas wrote:
|
| (snip)
|
| > >Does anybody have any suggestions?
| >
| >Use 'amq -u' to unmount the offending partition before 'amq -f'.
|
| I just tried this, and the symlink is still wrong.  However if I ask
| amq, it says all is well.
|
| Thanks thought,
|

What kind of maps are you using? Did you verify that after amq -u the
symlink was gone?

christos

Re: Odd behavior from amd

by Lars Friend

The amq -u does not remove the symlinks.

We are using nfs maps with symlinks (like this):

juser           host!=homehost;rhost:=homehost;rfs:=/wd1e/export/home \
                 host==homehost;fs:=/wd1e/export/home;type:=link

As I understand it, amd consists of two different logical parts:

1) A server that hooks file system access, looks up the mountpoint,
and mounts it. (in /amd/...)

2) A userland NFS server that serves out a file system full of
symlinks to the mountpoints managed by the above process (in /home).

It seems that part 1 is working, such that if I move a home
directory, and then go crawling around in
/amd/homehost/wd1e/export/home/..., or ask amq, the correct
information will be picked up as expected, and the new filesystem
will be mounted.  The problem is that part 2 never seems to get the
message.  The galling thing is that it is _so_ close: the automounter
unmounts /amd/oldhost/wd1e/export/home/juser and correctly mounts the
new tree at /amd/homehost/wd1e/export/home/juser, but the symlink in
/home still points (until amd is restarted) to
/amd/oldhost/wd1e/export/home/juser, which of course no longer exists...
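
Because the stale link points at a path that no longer resolves, the mismatch between the two parts can also be spotted as a dangling symlink. The sketch below is only an illustration: `dangling_links` is a hypothetical helper, and it assumes that probing the target is acceptable, since following a link into an automounted tree may cause amd to attempt a mount.

```python
import os


def dangling_links(directory, names):
    """Return (name, target) for symlinks in `directory` whose target is missing."""
    result = []
    for name in names:
        path = os.path.join(directory, name)
        # islink() inspects the link itself; exists() follows it to the target.
        if os.path.islink(path) and not os.path.exists(path):
            result.append((name, os.readlink(path)))
    return result
```

Calling `dangling_links("/home", ["juser"])` on an affected host would be expected to return the old `/amd/oldhost/...` target, while a correctly updated link would not be listed.
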

I hope that clears things up.  That being said, it is only minimally
disruptive to:

sudo /etc/rc.d/amd stop

sudo /etc/rc.d/amd start

but it just doesn't feel right to have to do that.

         -lars

At 05:11 PM 2/8/2008, Christos Zoulas wrote:

>On Feb 8,  4:03pm, lfriend@... (Lars Friend) wrote:
>-- Subject: Re: Odd behavior from amd
>
>| At 03:53 PM 2/8/2008, Christos Zoulas wrote:
>|
>| (snip)
>|
>| > >Does anybody have any suggestions?
>| >
>| >Use 'amq -u' to unmount the offending partition before 'amq -f'.
>|
>| I just tried this, and the symlink is still wrong.  However if I ask
>| amq, it says all is well.
>|
>| Thanks thought,
>|
>
>What kind of maps are you using? Did you verify that after amq -u the
>symlink was gone?
>
>christos