disconnected nss_ldap

by Brian J. Murrell

At the risk of asking a FAQ (but in my defence, I have been googling
this off and on for the last 2-3 weeks) how does one properly handle
computers (i.e. laptops) that should get their NSS information from LDAP
while connected to the corporate network and yet still function while
away from the corporate network?

pam_ccreds handles the authentication (be it ldap or kerberos) caching,
but general NSS lookups failing while the LDAP server is unavailable
makes just about everything just about useless.

I realize that caching is what is needed here and I have looked into
nscd for this, using its persistent storage feature, but it just
doesn't seem to be thought out well enough for the temporarily
disconnected use-case.  It seems that nscd needs two timeouts.  One at
which it will try to refresh a stale entry and a second at which it will
expire a stale entry.  Reasonable times for the two would be something
on the order of 10 minutes and 30 days, respectively.
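
The two-timeout idea can be sketched roughly as below. This is a
hypothetical illustration only, not how nscd is implemented: the class
name, the fetch callback and the use of ConnectionError to signal an
unreachable server are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Optional, Tuple

REFRESH_AFTER = 10 * 60            # try to refresh entries older than 10 minutes
EXPIRE_AFTER = 30 * 24 * 60 * 60   # drop entries not refreshed for 30 days


class TwoTimeoutCache:
    """Hypothetical cache with a soft (refresh) and a hard (expire) timeout."""

    def __init__(self, fetch: Callable[[str], str],
                 clock: Callable[[], float] = time.time) -> None:
        self._fetch = fetch    # assumed to raise ConnectionError when offline
        self._clock = clock
        # key -> (value, time of the last successful fetch)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def lookup(self, key: str) -> Optional[str]:
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[1] < REFRESH_AFTER:
            return cached[0]                  # fresh enough, no network traffic
        try:
            value = self._fetch(key)          # stale or missing: ask the server
        except ConnectionError:
            if cached is not None and now - cached[1] < EXPIRE_AFTER:
                return cached[0]              # offline: keep serving the stale entry
            self._entries.pop(key, None)      # past the hard limit: expire it
            return None
        self._entries[key] = (value, now)
        return value
```

While connected, an entry is never served more than 10 minutes stale;
while disconnected, it survives for up to 30 days after the last good
fetch.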

Surely others have run into this same problem.  How did you solve it?

BTW: I am aware of nss_updatedb, but that seems a little clunky and
heavy handed with its "cache everything" and rigid (i.e. time of day
driven) update schedule.  For such reasons I have read frequently that
it really just doesn't scale.  An nss_updatedb that is updated as a
result of usual lookups seems much more manageable.  That way only
information the user is likely to use is cached and it's done with the
frequency of and as a by-product of existing lookups.

Thots?

b.




Re: disconnected nss_ldap

by Howard Chu

Brian J. Murrell wrote:

> At the risk of asking a FAQ (but in my defence, I have been googling
> this off and on for the last 2-3 weeks) how does one properly handle
> computers (i.e. laptops) that should get their NSS information from LDAP
> while connected to the corporate network and yet still function while
> away from the corporate network?
>
> pam_ccreds handles the authentication (be it ldap or kerberos) caching
> but general NSS lookups while the LDAP server is unavailable makes just
> about everything just about useless.
>
> I realize that caching is what is needed here and I have looked into
> nscd for this, using it's persistent storage feature, but it just
> doesn't seem to be thought out well enough from the temporarily
> disconnected use-case.  It seems that nscd needs two timeouts.  One at
> which it will try to refresh a stale entry and a second at which it will
> expire a stale entry.  Reasonable times for the two would be something
> on the order of 10 minutes and 30 days, respectively.
>
> Surely others have run into this same problem.  How did you solve it?
>
> BTW: I am aware of nss_updatedb, but that seems a little clunky and
> heavy handed with it's "cache everything" and rigid (i.e. time of day
> driven) update schedule.  For such reasons I have read frequently that
> it really just doesn't scale.  An nss_updatedb that is updated as a
> result of usual lookups seems much more manageable.  That way only
> information the user is likely to use is cached and it's done with the
> frequency of and as a by-product of existing lookups.
>
> Thots?

Use OpenLDAP's nssov overlay plus your choice of either proxycache or
syncrepl. Both will work fine; your choice depends on whether the disconnected
machine is a single-user machine (then just use proxycache) or a multi-user
machine (then you might want to use syncrepl instead).

The proxycache features added in OpenLDAP 2.4.18 were designed specifically
for caching pam/nss lookups for disconnected operation.
--
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Re: disconnected nss_ldap

by Ryan B. Lynch

On Fri, Oct 23, 2009 at 22:49, Brian J. Murrell <brian@...> wrote:
> At the risk of asking a FAQ (but in my defence, I have been googling
> this off and on for the last 2-3 weeks) how does one properly handle
> computers (i.e. laptops) that should get their NSS information from LDAP
> while connected to the corporate network and yet still function while
> away from the corporate network?
...

>
> Surely others have run into this same problem.  How did you solve it?
>
> BTW: I am aware of nss_updatedb, but that seems a little clunky and
> heavy handed with it's "cache everything" and rigid (i.e. time of day
> driven) update schedule.  For such reasons I have read frequently that
> it really just doesn't scale.  An nss_updatedb that is updated as a
> result of usual lookups seems much more manageable.  That way only
> information the user is likely to use is cached and it's done with the
> frequency of and as a by-product of existing lookups.

Do you know about NSCD (the Name Service Caching Daemon)? It's built
to handle this kind of thing, and a lot of distros (Fedora/RH/CentOS,
at least) include it by default with the Glibc package. But it usually
isn't running by default.

-Ryan

Re: disconnected nss_ldap

by Ryan B. Lynch

On Sat, Oct 24, 2009 at 00:09, Ryan Lynch <ryan.b.lynch@...> wrote:

> On Fri, Oct 23, 2009 at 22:49, Brian J. Murrell <brian@...> wrote:
>> At the risk of asking a FAQ (but in my defence, I have been googling
>> this off and on for the last 2-3 weeks) how does one properly handle
>> computers (i.e. laptops) that should get their NSS information from LDAP
>> while connected to the corporate network and yet still function while
>> away from the corporate network?
> ...
>>
>> Surely others have run into this same problem.  How did you solve it?
>>
>> BTW: I am aware of nss_updatedb, but that seems a little clunky and
>> heavy handed with it's "cache everything" and rigid (i.e. time of day
>> driven) update schedule.  For such reasons I have read frequently that
>> it really just doesn't scale.  An nss_updatedb that is updated as a
>> result of usual lookups seems much more manageable.  That way only
>> information the user is likely to use is cached and it's done with the
>> frequency of and as a by-product of existing lookups.
>
> Do you know about NSCD (the Name Service Caching Daemon)? It's built
> to handle this kind of thing, and a lot of distros (Fedora/RH/CentOS,
> at least) include it by default with the Glibc package. But it usually
> isn't running by default.


My bad, I just realized that you DID mention nscd--I need to learn to read.

But I think nscd actually has the feature that you want--are you
familiar with the 'reload-count' option? It lets you limit the number
of timeout cycles that the daemon will tolerate before it throws out a
cached entry.

For example, given your desired scenario of a 10-minute cache TTL, and
a 30 day hard timeout, you could set:

  positive-time-to-live passwd 600   # 10 minutes
  reload-count 4320                  # 30 days / 10 minutes

If the cached value is more than 10 minutes old, 'nscd' will try to
refresh it. If it fails to connect, it will re-set the 10-minute TTL
and increment its reload counter by 1. This cycle repeats until the
reload counter reaches 4,320, when it just throws out the cached
entry, entirely.  (I don't actually know whether 'nscd' will
automatically try to refresh the cached entry every 10 minutes, or if
it only tries when the name is requested... That probably deserves an
experiment, because it could have big implications for the actual hard
limit you'd see.)
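
The arithmetic behind those two settings can be sketched as follows.
This only models the behaviour as described above (one failed refresh
per TTL, counter incremented each time); it has not been checked against
the nscd source, and the function names are made up for illustration.

```python
TTL_SECONDS = 600                     # positive-time-to-live from the example
HARD_LIMIT_SECONDS = 30 * 24 * 3600   # the desired 30-day hard timeout


def reload_count_for(ttl: int, hard_limit: int) -> int:
    """Number of whole TTL cycles that fit into the hard limit."""
    return hard_limit // ttl


def seconds_until_dropped(ttl: int, reload_count: int) -> int:
    """Model: each failed refresh re-arms the TTL and bumps the counter;
    the entry is dropped once the counter reaches reload_count."""
    elapsed = 0
    reloads = 0
    while reloads < reload_count:
        elapsed += ttl     # TTL runs out and the refresh fails (server unreachable)
        reloads += 1
    return elapsed


print(reload_count_for(TTL_SECONDS, HARD_LIMIT_SECONDS))    # 4320
print(seconds_until_dropped(TTL_SECONDS, 4320) // 86400)    # 30 (days)
```

If refreshes only happen when a name is requested, the real hard limit
would be longer than this model suggests.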

I actually use 'reload-count unlimited' to cache LDAP (AD, actually)
users and groups. It works fine for laptops with domain accounts. With
pam_ccreds, it pretty much works just like a local account would.

-Ryan

Re: disconnected nss_ldap

by Brian J. Murrell

On Fri, 2009-10-23 at 20:36 -0700, Howard Chu wrote:
>
> Use OpenLDAP's nssov overlay plus your choice of either proxycache or
> syncrepl. Both will work fine; your choice depends on whether the disconnected
> machine is a single-user machine (then just use proxycache) or a multi-user
> machine (then you might want to use syncrepl instead).

So, some googlin' given that this nssov is new to me... it seems that I
run a full fledged LDAP server (slapd) on every client?

Wow.  That seems a mite overkill also.  Workstations are already so
overly bloated, adding an LDAP server just to deal with disconnected use
just seems like over-engineering the problem.

Maybe I am just mis-understanding it all.

b.




Re: disconnected nss_ldap

by Brian J. Murrell

On Sat, 2009-10-24 at 00:09 -0400, Ryan Lynch wrote:
>
> Do you know about NSCD (the Name Service Caching Daemon)?

Wow.  I'm trying really hard not to be rude here, but did you bother to
read my original posting?  I will quote here for you what I said about
NSCD:

> > I realize that caching is what is needed here and I have looked into
> > nscd for this, using it's persistent storage feature, but it just
> > doesn't seem to be thought out well enough from the temporarily
> > disconnected use-case.  It seems that nscd needs two timeouts.  One
> > at which it will try to refresh a stale entry and a second at which
> > it will expire a stale entry.  Reasonable times for the two would be
> > something on the order of 10 minutes and 30 days, respectively.

> It's built
> to handle this kind of thing,

Not really, it seems.  In practise anyway.  I have tried the recommended
"reload-count = unlimited" but as reported by another, it doesn't seem
to entirely solve the problem.  See
http://sourceware.org/bugzilla/show_bug.cgi?id=2132 for details.

The digested summary is that the above option does not appear to prevent
entries from being expired from the cache when the timeout is set
reasonably low (like several minutes).  Setting the timeout to some
god-awful huge value, like 30 days leads to nscd having stale data, even
when connected to the network.  Hence the proposal for two timeouts in
my original posting as well as in the above mentioned bug.

Do you actually use NSCD to solve this?  I'd be interested in your
experience (off-line as this is pretty OT for this list) as popular
experience with a proper configuration seems to be that it doesn't work.

b.





Re: disconnected nss_ldap

by Brian J. Murrell

On Sat, 2009-10-24 at 00:34 -0400, Ryan Lynch wrote:
>
> My bad, I just realized that you DID mention nscd--I need to learn to read.

Yeah.  :-)  Oh well.  Water under the bridge.

> But I think nscd actually has the feature that you want--are you
> familiar with the 'reload-count' option?

Yup.

> It lets you limit the number
> timeout-cycles that the daemon will tolerate before it throws out a
> cached entry.

Right.  But my experience is that even with unlimited, it doesn't take
long before the passwd entries are just gone.

But other than that, my experiments reveal that nss_ldap is called by
binaries, independently of querying nscd.  i.e. I try to log in while
the LDAP server is unavailable and get scads of messages
in /var/log/auth from nss_ldap that the ldap server is unavailable.
Such as:

Oct 24 01:26:09 brian-laptop-old sudo: nss_ldap: could not connect to any LDAP server as (null) - Can't contact LDAP server
Oct 24 01:26:09 brian-laptop-old sudo: nss_ldap: failed to bind to LDAP server ldap://ldap: Can't contact LDAP server
Oct 24 01:26:09 brian-laptop-old sudo: nss_ldap: reconnecting to LDAP server...
Oct 24 01:26:09 brian-laptop-old sudo: nss_ldap: could not connect to any LDAP server as (null) - Can't contact LDAP server
Oct 24 01:26:09 brian-laptop-old sudo: nss_ldap: failed to bind to LDAP server ldap://ldap: Can't contact LDAP server
Oct 24 01:26:09 brian-laptop-old sudo: nss_ldap: reconnecting to LDAP server (sleeping 1 seconds)...
Oct 24 01:26:10 brian-laptop-old sudo: nss_ldap: could not connect to any LDAP server as (null) - Can't contact LDAP server
Oct 24 01:26:10 brian-laptop-old sudo: nss_ldap: failed to bind to LDAP server ldap://ldap: Can't contact LDAP server
Oct 24 01:26:10 brian-laptop-old sudo: nss_ldap: could not search LDAP server - Server is unavailable

In the case of sshd, I get much the same as the above, but the remote is
disconnected without even attempting an authentication:

Oct 24 01:34:09 brian-laptop-old sshd[20430]: nss_ldap: could not connect to any LDAP server as (null) - Can't contact LDAP server
Oct 24 01:34:09 brian-laptop-old sshd[20430]: nss_ldap: failed to bind to LDAP server ldap://ldap: Can't contact LDAP server
Oct 24 01:34:09 brian-laptop-old sshd[20430]: nss_ldap: reconnecting to LDAP server...
Oct 24 01:34:09 brian-laptop-old sshd[20430]: nss_ldap: could not connect to any LDAP server as (null) - Can't contact LDAP server
Oct 24 01:34:09 brian-laptop-old sshd[20430]: nss_ldap: failed to bind to LDAP server ldap://ldap: Can't contact LDAP server
Oct 24 01:34:09 brian-laptop-old sshd[20430]: nss_ldap: reconnecting to LDAP server (sleeping 1 seconds)...
Oct 24 01:34:10 brian-laptop-old sshd[20430]: nss_ldap: could not connect to any LDAP server as (null) - Can't contact LDAP server
Oct 24 01:34:10 brian-laptop-old sshd[20430]: nss_ldap: failed to bind to LDAP server ldap://ldap: Can't contact LDAP server
Oct 24 01:34:10 brian-laptop-old sshd[20430]: nss_ldap: could not search LDAP server - Server is unavailable

But as soon as the LDAP server is available again, ssh to the node works
just fine.

> For example, given your desired scenario of a 10-minute cache TTL, and
> a 30 day hard timeout, you could set:
>
>   positive-time-to-live 600      # 10 minutes
>   reload-count 4320               # 30 days / 10 minutes
>
> If the cached value is more than 10 minutes old, 'nscd' will try to
> refresh it. If it fails to connect, it will re-set the 10-minute TTL
> and increment its reload counter by 1. This cycle repeats until the
> reload counter reaches 4,320, when it just throws out the cached
> entry, entirely.
Indeed.  My experiments were that even with unlimited, the passwd entry
for the current, logged in user disappeared.  I was going to demonstrate
on my Ubuntu Karmic laptop but I can't seem to reproduce this here.
Maybe this was a problem only on the Jaunty laptop that I was trying
previously.

> I actually use 'reload-count unlimited' to cache LDAP (AD, actually)
> users and groups. It works fine for laptops with domain accounts. With
> pam_ccreds, it pretty much works just like a local account would.

That's exactly what I am aiming for as well.

Cheers, and thanks for the update to your last post.

We should probably take this NSCD discussion offline as it's really OT
here.  Although, evidence is that, on Karmic anyway, it's working and
it's nss_ldap that is giving me grief when I am disconnected.

b.




Re: Re: disconnected nss_ldap

by Howard Chu

Brian J. Murrell wrote:

> On Fri, 2009-10-23 at 20:36 -0700, Howard Chu wrote:
>>
>> Use OpenLDAP's nssov overlay plus your choice of either proxycache or
>> syncrepl. Both will work fine; your choice depends on whether the disconnected
>> machine is a single-user machine (then just use proxycache) or a multi-user
>> machine (then you might want to use syncrepl instead).
>
> So, some googlin' given that this nssov is new to me... it seems that I
> run a full fledged LDAP server (slapd) on every client?
>
> Wow.  That seems a might overkill also.  Workstations are already so
> overly bloated, adding an LDAP server just to deal with disconnected use
> just seems like over-engineering the problem.

OpenLDAP is probably the least bloated solution you'll find. I have it running
on my G1 phone, the process size is only 1.5MB. See how big all those other
solutions are when configured as well as possible, that still don't solve the
actual problem. Plus it's remotely configurable, which makes it far more
manageable than any other approach...

--
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Re: disconnected nss_ldap

by Brian J. Murrell

On Sat, 2009-10-24 at 01:38 -0400, Brian J. Murrell wrote:
>
> But as soon as the LDAP server is available again, ssh to the node works
> just fine.

I fixed this.  This is because of pam_unix's account mode.  It wants to
verify the shadow entry when the passwd entry contains a "x" for the
password -- hence my previous thread about fixing this in nss_ldap.
Adding broken_shadow to pam_unix's entry in the account mode works
around it.

> Indeed.  My experiments were that even with unlimited, the passwd entry
> for the current, logged in user disappeared.  I was going to demonstrate
> on my Ubuntu Karmic laptop but I can't seem to reproduce this here.

I spoke too soon/didn't wait long enough.

Witness my laptop, where I am logged in (as brian), have nscd running
with:
        reload-count unlimited
        positive-time-to-live passwd 60

$ id brian
id: brian: No such user

I also have a user "keith" in my LDAP directory mapped into the NSS
passwd map which I was testing with before when I thought it was
working.  All this to say that "keith" should definitely be in nscd's
persistent cache as I was executing "id keith" repeatedly, watching for
it to disappear, and now, like the "brian" entry, it has:

$ id keith
id: keith: No such user

So for whatever reason, NSCD is expiring entries from its persistent
cache despite the "reload-count unlimited".  ~sigh~

b.




Re: Re: disconnected nss_ldap

by Brian J. Murrell

On Fri, 2009-10-23 at 22:40 -0700, Howard Chu wrote:
> OpenLDAP is probably the least bloated solution you'll find.

Probably so.  It is probably just my prejudice that, being a "database"
server, it will have a big footprint.

Maybe just seeing its footprint on my server (where it's serving up a
smallish NSS data set) is skewing my opinion:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
30999 openldap  20   0 98.6m 5796 2676 S    0  0.9  18:26.51 slapd

> I have it running
> on my G1 phone, the process size is only 1.5MB.

I wonder how big it would be on my laptop, with an empty database.

> See how big all those other
> solutions are when configured as well as possible, that still don't solve the
> actual problem. Plus it's remotely configurable, which makes it far more
> manageable than any other approach...

All good points.

b.




Re: Re: Re: disconnected nss_ldap

by Howard Chu

Brian J. Murrell wrote:
> On Fri, 2009-10-23 at 22:40 -0700, Howard Chu wrote:
>> I have it running
>> on my G1 phone, the process size is only 1.5MB.
>
> I wonder how big it would be on my laptop, with an empty database.

It helps to tweak the config, of course. By default slapd uses up to 16
threads; on my phone I have it configured for only 2 threads. In practice,
unless someone else is querying it remotely, it won't ever receive queries
from more than 1 app at a time.

--
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Re: Re: disconnected nss_ldap

by Ryan B. Lynch

On Sat, Oct 24, 2009 at 01:38, Brian J. Murrell <brian@...> wrote:
> On Sat, 2009-10-24 at 00:34 -0400, Ryan Lynch wrote:
>>
>> My bad, I just realized that you DID mention nscd--I need to learn to read.
>
> Yeah.  :-)  Oh well.  Water under the bridge.

Whatever, man. Free tech support is free tech support, try to loosen
up a little, eh?


> But other than that, my experiments reveal that nss_ldap is called by
> binaries, independently of querying nscd.  i.e. I try to log in while
> the LDAP server is unavailable and get scads of messages
> in /var/log/auth from nss_ldap that the ldap server is unavailable.
...
> But as soon as the LDAP server is available again, ssh to the node works
> just fine.

nscd and the name service switch aren't supposed to handle
authenticating users via LDAP binds. Authentication and authorization
are two totally separate chains of events.

You need to set up 'pam_ldap' and 'pam_ccreds', which will run in
parallel with 'nscd' and 'nss_ldap(d)'. nscd caches the group-to-GID
and user-to-UID mappings, and 'pam_ccreds' caches the LDAP creds and
bind results.


>> For example, given your desired scenario of a 10-minute cache TTL, and
>> a 30 day hard timeout, you could set:
>>
>>   positive-time-to-live 600      # 10 minutes
>>   reload-count 4320               # 30 days / 10 minutes
>>
>> If the cached value is more than 10 minutes old, 'nscd' will try to
>> refresh it. If it fails to connect, it will re-set the 10-minute TTL
>> and increment its reload counter by 1. This cycle repeats until the
>> reload counter reaches 4,320, when it just throws out the cached
>> entry, entirely.
>
> Indeed.  My experiments were that even with unlimited, the passwd entry
> for the current, logged in user disappeared.  I was going to demonstrate
> on my Ubuntu Karmic laptop but I can't seem to reproduce this here.
> Maybe this was a problem only on the Jaunty laptop that I was trying
> previously.

I can't speak to Ubuntu-specific issues, I don't have a lot of
experience there, but I've seen a decent number of bugs in the PADL
suite and nscd, in the last few years. Maybe Launchpad has a ticket
from between those two releases that explains the difference?


>> I actually use 'reload-count unlimited' to cache LDAP (AD, actually)
>> users and groups. It works fine for laptops with domain accounts. With
>> pam_ccreds, it pretty much works just like a local account would.
>
> That's exactly what I am aiming for as well.
>
> Cheers, and thanks for the update to your last post.
>
> We should probably take this NSCD discussion offline as it's really OT
> here.  Although, evidence is that, on Karmic anyway, it's working and
> it's nss_ldap that is giving me grief when I am disconnected.

Can I suggest something? If you haven't already gotten in touch with
someone who's using LDAP authen and authn caching (pam_ldap and
pam_ccreds), it might be worthwhile to re-phrase that issue as a
separate question on the list. I can show you how I do authen, but my
bag is all Kerberos, and it sounds like you're probably headed for an
all-LDAP setup.

-Ryan

Re: Re: disconnected nss_ldap

by Ryan B. Lynch

On Sat, Oct 24, 2009 at 02:17, Brian J. Murrell <brian@...> wrote:

> On Sat, 2009-10-24 at 01:38 -0400, Brian J. Murrell wrote:
>>
>> But as soon as the LDAP server is available again, ssh to the node works
>> just fine.
>
> I fixed this.  This is because of pam_unix's account mode.  It wants to
> verify the shadow entry when the passwd entry contains a "x" for the
> password -- hence my previous thread about fixing this in nss_ldap.
> Adding broken_shadow to pam_unix's entry in the account mode works
> around it.

So nscd IS caching shadow info (password hashes), for you? I didn't
think it would handle that, but I guess it makes sense. In that case,
I'm not sure if there's an advantage to using 'pam_ccreds' and
'pam_ldap' over nscd's shadow caching.


>> Indeed.  My experiments were that even with unlimited, the passwd entry
>> for the current, logged in user disappeared.  I was going to demonstrate
>> on my Ubuntu Karmic laptop but I can't seem to reproduce this here.
>
> I spoke too soon/didn't wait long enough.
>
> Witness my laptop, where I am logged in (as brian), have nscd running
> with:
>        reload-count            unlimited
>        positive-time-to-live   passwd          60
>
> $ id brian
> id: brian: No such user
>
> I also have a user "keith" in my LDAP directory mapped into the NSS
> passwd map which I was testing with before when I thought it was
> working.  All this to say that "keith" should definitely be in nscd's
> persistent cache as I was executing "id keith" repeatedly, watching for
> it to disappear, and now, like the "brian" entry, it has:
>
> $ id keith
> id: keith: No such user
>
> So for whatever reason, NSCD is expiring entries from it's persistent
> cache despite the "reload-count unlimited".  ~sigh~

I'm using some pretty high TTLs on disconnected machines, and some
Kerberos house-keeping scripts that generally make sure nscd's cache
gets hit more often than the TTL time. There's a reason for the
enormous TTLs, too, which it looks like you may have already
discovered?

    http://sources.redhat.com/bugzilla/show_bug.cgi?id=2132

nscd drops a cached name if the TTL expires without it being
requested, regardless of the 'reload-count' setting. To effectively
use it for disconnected operations, you need to be reasonably certain
that some local activity will trigger a lookup on each cached name
more often than the TTL time. So basically, you have to set your TTLs
pretty high, or you need to convince Ulrich Drepper to make nscd
smarter.
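
A keep-warm job of the kind hinted at above could look roughly like
this. It is a sketch under assumptions, not the house-keeping scripts
mentioned: the function name and the injectable lookup parameter are
made up for the example, and the user names are simply the ones used in
this thread. It relies on Python's pwd.getpwnam, which goes through the
C library's passwd lookup and therefore through nscd when nscd is
running.

```python
import pwd
from typing import Callable, Dict, Iterable


def touch_names(names: Iterable[str],
                lookup: Callable[[str], object] = pwd.getpwnam) -> Dict[str, bool]:
    """Look up each name once so the caching layer sees a request for it."""
    seen: Dict[str, bool] = {}
    for name in names:
        try:
            lookup(name)        # one passwd lookup per name
            seen[name] = True
        except KeyError:        # pwd.getpwnam raises KeyError for unknown users
            seen[name] = False
    return seen


if __name__ == "__main__":
    # Run from cron (or a timer) more often than the configured TTL.
    print(touch_names(["brian", "keith"]))
```

Whether this is enough to keep entries alive depends on the nscd
behaviour discussed in the bug report above.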

-Ryan

Re: Re: disconnected nss_ldap

by Ryan B. Lynch

On Sat, Oct 24, 2009 at 03:49, Ryan Lynch <ryan.b.lynch@...> wrote:

> On Sat, Oct 24, 2009 at 02:17, Brian J. Murrell <brian@...> wrote:
>> On Sat, 2009-10-24 at 01:38 -0400, Brian J. Murrell wrote:
>>>
>>> But as soon as the LDAP server is available again, ssh to the node works
>>> just fine.
>>
>> I fixed this.  This is because of pam_unix's account mode.  It wants to
>> verify the shadow entry when the passwd entry contains a "x" for the
>> password -- hence my previous thread about fixing this in nss_ldap.
>> Adding broken_shadow to pam_unix's entry in the account mode works
>> around it.
>
> So nscd IS caching shadow info (password hashes), for you? I didn't
> think it would handle that, but I guess it makes sense. In that case,
> I'm not sure if there's an advantage to useing 'pam_ccreds' and
> 'pam_ldap' over nscd's shadow caching.

Wrong again--I just noticed your other thread, where you mentioned
that you're using Kerberos to authenticate. I had no idea, I thought
you were doing pure LDAP.

Re: Re: disconnected nss_ldap

by Brian J. Murrell

On Sat, 2009-10-24 at 03:49 -0400, Ryan Lynch wrote:
> So nscd IS caching shadow info (password hashes), for you?

As you discovered, no.  I am using kerberos for authentication.

> I'm using some pretty high TTLs on disconnected machines,

The problem with high TTLs is that changes you make to your LDAP NSS
data take too long (i.e. the TTL -- which needs to be like 30 days to
avoid dropping entries before you really want to) to reach the
nscd-running machines despite being connected to the LDAP server.

> There's a reason for the
> enormous TTLs, too, which it looks like you may have already
> discovered?
>
>     http://sources.redhat.com/bugzilla/show_bug.cgi?id=2132

Yeah.  I talked about this bug in one of my previous posts here and
also, as you've probably noticed, I commented on that bug, but Mr.
Drepper seems to be simply ignoring the real-world evidence that his
proposed solution, "reload-count unlimited" just doesn't work.

> nscd drops a cached name if the TTL expires without it being
> requested, regardless of the 'reload-count' setting.

Yeah.  So what's the point of "reload-count", then really, yes?

> To effectively
> use it for disconnected operations, you need to be reasonably certain
> that some local activity will trigger a lookup on each cached name
> more often than the TTL time.

Which is just silly.

> So basically, you have to set your TTLs
> pretty high,

And let your cached data get stale, despite having easy access to the
fresh data.

> or you need to convince Ulrich Drepper to make nscd
> smarter.

Well, bug 2132 sure doesn't give anyone any warm fuzzies that he's
actually willing to listen to how nscd works (or doesn't as the case may
be) in the real world vs. how he thinks it's supposed to operate.  He is
simply ignoring the evidence that demonstrates that he's wrong.

I wonder how difficult the fix is to not drop records whose TTL expires
before they are re-requested.  I can't imagine terribly so.  I wonder if
he'd be more (or at all) responsive to a patch than he has been to the
presentation of evidence that his solution simply doesn't work in the
real world.

b.




Re: Re: disconnected nss_ldap

by Brian J. Murrell

On Sat, 2009-10-24 at 03:16 -0400, Ryan Lynch wrote:
>
> nscd and the name service switch arent' supposed to handle
> authenticating users via LDAP binds.

They are not.  It was the pam_unix module's "account" mode that was
refusing the access because when the passwd map returns an "x" in the
password field, a shadow entry must be available for pam_unix to verify
the expiry status of the password.

When you are disconnected from the LDAP server you don't have shadow
entries so pam_unix fails the "account" checks which denies login.
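
As a rough model of that account check (a simplified sketch, not the
pam_unix source; the function name, the dict-based shadow entry and its
"expired" key are assumptions made for the example):

```python
from typing import Optional


def account_check(passwd_field: str,
                  shadow_entry: Optional[dict],
                  broken_shadow: bool = False) -> str:
    """Return "success" or "denied" for a simplified account-phase check."""
    if passwd_field != "x":
        return "success"        # no shadow lookup is needed
    if shadow_entry is None:
        # Shadow map unreachable, e.g. the LDAP server is not available.
        return "success" if broken_shadow else "denied"
    return "denied" if shadow_entry.get("expired", False) else "success"


# Disconnected laptop: passwd entry is cached with "x", shadow entry is not.
print(account_check("x", None))                      # denied
print(account_check("x", None, broken_shadow=True))  # success
```

The second call corresponds to the broken_shadow workaround; a passwd
entry that did not carry "x" would take the first branch instead.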

This is why I started the other thread about having nss_ldap not return
a "x" for the password when the shadow map is not available/desired.
Which just happens to be all of the time when you are using Kerberos.

>  Authentication and authorization
> are two totally separate chains of events.

Understood, very well.

> You need to set up 'pam_ldap' and 'pam_ccreds', which will run in
> parallel with 'nscd' and 'nss_ldap(d)'.

But neither of those deals with the shadow map problem.

> nscd caches the group-to-GID
> and user-to-UID mappings, and 'pam_ccreds' caches the LDAP creds and
> bind results.

Right.  And nothing caches the shadow map for pam_unix's account module,
hence the need for the "broken_shadow" hack or more properly the ability
to disable the "x" in the password field of the passwd map on
configurations that don't really need or want shadow functionality.

> I can't speak to Ubuntu-specific issues, I don't have a lot of
> experience there, but I've seen a decent number of bugs in the PADL
> suite and nscd, in the last few years. Maybe Launchpad has a ticket
> from between those two releases that explains the difference?

No, it's quite easily the bug you mentioned earlier in that something
needs to probe all cache entries at least once per TTL or they get
dropped.

> Can I suggest something? If you haven't already gotten in touch with
> someone who's using LDAP authen and authn caching (pam_ldap and
> pam_ccreds), it might be worthwhile to re-phrase that issue as a
> separate question on the list. I can show you how I do authen, but my
> bag is all Kerberos, and it sounds like you're probably headed for an
> all-LDAP setup.

No.  I authenticate with Kerberos as well.  And everything works just
fine for all of my clients except the disconnected ones (only when
disconnected of course), so I have everything set up as I need it.  It's
just this silliness of nscd dropping cached entries when it shouldn't
that is punching a hole in the solution.

nscd really does seem like it would complete the solution if it didn't
suffer from redhat bug 2132.
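
For illustration, the two-timeout behaviour asked for at the start of this thread (refresh after about 10 minutes, expire after about 30 days) could look roughly like the sketch below. This is not how nscd is implemented; the class, the exception, and the parameter names are all hypothetical.

```python
import time


class BackendUnavailable(Exception):
    """Raised by the lookup callable when the directory cannot be reached."""


class TwoTimeoutCache:
    """Cache with a refresh timeout and a separate, much longer expiry."""

    def __init__(self, lookup, refresh_after=600.0,
                 expire_after=30 * 24 * 3600.0, clock=time.monotonic):
        self._lookup = lookup                # callable: key -> value
        self._refresh_after = refresh_after  # seconds until a refresh is tried
        self._expire_after = expire_after    # seconds until an entry is dropped
        self._clock = clock
        self._entries = {}                   # key -> (value, time fetched)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._refresh_after:
            return entry[0]  # fresh enough, no lookup needed
        try:
            value = self._lookup(key)
        except BackendUnavailable:
            if entry is not None and now - entry[1] < self._expire_after:
                return entry[0]  # stale but not expired: keep serving it
            self._entries.pop(key, None)
            raise
        self._entries[key] = (value, now)
        return value
```

The point of the design is that a failed refresh does not evict anything: an entry is only dropped once it is older than the second, longer timeout.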

Cheers,
b.




RE: Re: Re: disconnected nss_ldap

by Howard Wilkinson

Brian et al,
 
I think the problem with the nscd issue may be a bug in nss_ldap's interaction with the nsswitch interface.
 
.......

               
                nscd really does seem like it would complete the solution if it didn't
                suffer from redhat bug 2132.
               
                Cheers,
                b.

I have looked into the nss_ldap code and it returns NSS_STATUS_UNAVAIL with errno = EPERM in the following cases:

LDAP_SERVER_DOWN, LDAP_TIMEOUT, LDAP_UNAVAILABLE, LDAP_BUSY, LDAP_CONNECT_ERROR, LDAP_LOCAL_ERROR, LDAP_INVALID_CREDENTIALS.

The last two are, I suspect, correct, but the first five are really candidates for 'server has gone away'. I suspect, though I can't quite decide whether I am right, that for those five the code should return either NSS_STATUS_TRYAGAIN with errno != ERANGE, or NSS_STATUS_UNAVAIL with errno = EAGAIN, so that the cache continues to be populated with the entry.
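
The current and proposed mappings can be written down as a small table. The sketch below uses plain strings for the LDAP and NSS names so it runs without libldap or glibc headers; the function names are hypothetical, and the default choice of NSS_STATUS_TRYAGAIN with EAGAIN is only one of the two candidates named above, since which one is correct is exactly the open question.

```python
# Errors that look like "server has gone away" (the first five in the list).
TRANSIENT = frozenset({
    "LDAP_SERVER_DOWN", "LDAP_TIMEOUT", "LDAP_UNAVAILABLE",
    "LDAP_BUSY", "LDAP_CONNECT_ERROR",
})
# Errors for which UNAVAIL/EPERM is suspected to be correct (the last two).
PERMANENT = frozenset({"LDAP_LOCAL_ERROR", "LDAP_INVALID_CREDENTIALS"})


def current_mapping(ldap_error):
    """Behaviour as described above: every listed error gives UNAVAIL/EPERM."""
    if ldap_error in TRANSIENT | PERMANENT:
        return ("NSS_STATUS_UNAVAIL", "EPERM")
    raise ValueError(f"error not covered by this sketch: {ldap_error}")


def proposed_mapping(ldap_error,
                     transient=("NSS_STATUS_TRYAGAIN", "EAGAIN")):
    """Proposed behaviour: transient errors get a different status/errno.

    Pass transient=("NSS_STATUS_UNAVAIL", "EAGAIN") to model the other
    candidate named above.
    """
    if ldap_error in TRANSIENT:
        return transient
    return current_mapping(ldap_error)


for name in sorted(TRANSIENT | PERMANENT):
    print(name, current_mapping(name), "->", proposed_mapping(name))
```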

If anybody who understands the nsswitch internals can confirm which is the correct response I will patch the nss_ldap library (I have half a patch already) and try this out.

Howard.

 


Re: Re: Re: disconnected nss_ldap

by Ryan B. Lynch

Howard,

On Tue, Oct 27, 2009 at 08:24, Howard Wilkinson <howard@...> wrote:
> If anybody who understands the nsswitch internals can confirm which is the correct response I will patch the nss_ldap library (I have half a patch already) and try this out.

I'm in a position to test patches for this, even if they're a bit
rough--I have a couple of throwaway VMs specifically intended for
this. Feel free to send anything you have, I'd love to see this issue
resolved, soon.

Also, will your patch for this issue sit on top of your "mega" patch,
or on the unpatched PADL tree?

-Ryan

RE: Re: Re: disconnected nss_ldap

by Howard Wilkinson

Hi Ryan,
 

                Howard,
               
                On Tue, Oct 27, 2009 at 08:24, Howard Wilkinson <howard@...> wrote:
                > If anybody who understands the nsswitch internals can confirm which is the correct response I will patch the nss_ldap library (I have half a patch already) and try this out.
               
                I'm in a position to test patches for this, even if they're a bit
                rough--I have a couple of throwaway VMs specifically intended for
                this. Feel free to send anything you have, I'd love to see this issue
                resolved, soon.
                 

I am working on this now and hope to have something out today. The internals of nss_ldap are a bit of a mess in this area, but I think I have a handle on it.



                Also, will your patch for this issue sit on top of your "mega" patch,
                or on the unpatched PADL tree?
               
                -Ryan
               

This will have to go on the top of the mega patch as the original code is even worse in this area..... ;-(
 
Howard.
 

Re: Re: Re: disconnected nss_ldap

by Ryan B. Lynch

On Tue, Oct 27, 2009 at 10:35, Howard Wilkinson <howard@...> wrote:
> I am working on this now and hope to have something out today. The internals of nss_ldap are a bit of a mess in this area, but I think I have a handle on it.

Fire when ready.

> This will have to go on the top of the mega patch as the original code is even worse in this area..... ;-(

That's good--I was in the process of rebuilding RPMs with your latest
mega rev when I saw your original message, so I can save a little time
testing both at once.

-Ryan