Re: [SCM] Samba Shared Repository - branch master updated - release-4-0-0alpha8-205-g7119241

View: New views
8 Messages — Rating Filter:   Alert me  
< Prev | 1 - 2 - 3 - 4 | Next >

Re: the sorry saga of the talloc soname 'fix'

by Sam Liddicott :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

* tridge wrote, On 09/07/09 11:02:
> Hi Sam,
>
> My apologies for not having done a proper analysis of your proposal.
>
> I have just tried to apply your patches to a test tree, but they fail
> to apply, and the fixups do not look trivial. Could you send me a
> clean copy of talloc.[ch] and testsuite.c for me to look at?
>  
Yes. I don't know why they failed for you, they apply cleanly to 22 June
master:
e129384d7c1df664e447186673dd107e190e2894

perhaps you need to revert you patches, but I attach the files you ask for.

> I do have some comments based on your description of the patch though,
> although whether those comments are useful depends on my correct
> interpretation of your text description of the changes. I always
> prefer to work from real code, as descriptions are rarely sufficient,
> especially with such a complex change.
>
> ok, so some wild specualation based on your description:
>
>  1) you propose that talloc_free() be allowed with a reference is
>  present, but that it should always free the 'original' or 'real'
>  parent. That is exactly the opposite behaviour of what talloc has
>  done since talloc_reference() was added, so changing it would be
>  rather a surprise to existing code.
yes, I sadly consider that this code is already broken through the
ambiguity that is currently generally recognized, and which has the
consequences of leaving dangling references.
> The existing behaviour, before my
>  recent changes, was to remove the most recent reference. I am always
>  reluctant to completely invert existing behaviour. To prevent
>  ambiguous calls is one thing, especially when a clear warning is
>  given, but to do the exact opposite of what has been documented seems
>  to be asking for trouble. I certainly wouldn't call it a ABI (or
>  API) compatible change.
>  
I think this is not a complete inversion of the old behaviour, as I
think free cannot be called more than once (unless there was an
intevening steal). Therefore free isn't "removing the oldest parent
first" as such (which would be a complete inversion) it is instead
removing ONLY the allocating parent. free is a way for the parent to
abandon the child but let the child live off it's references.

I had never imagined that code would call talloc_free intending to free
a reference more recent that the allocating parent. The thought scares
me; however...
>  2) I think, from your description, that talloc_steal would change to
>  become a function that sometimes changes the effective reference
>  count and sometimes doesn't.
yes, because "no_owner" doesn't count as a reference.
Also, no amount of consecutive calls to free can have a combined effect
of removing more than one reference.

> Let's look at an example:
>
>     int *p1 = talloc(NULL, int);
>     int *p2 = talloc(NULL, int);
>     int *p3 = talloc(NULL, int);
>     int *c1 = talloc(p1, int);
>     int *r1 = talloc_reference(p2, c1);
>
> at this point the effective reference count of c1 is 2. It will take 2
> free calls for it to go away.
>  
it will actually take: (one talloc_unlink(p2, c1)), and (one
talloc_free(c1) or talloc_unlink(p1, c1)).
Keep calling free won't make it go away - otherwise the r1 would become
a dangling reference which was the cause of me noticing all this.
> Now we do this:
>
>     talloc_free(r1);
>
> and if I have understood your patches correctly (a big if, given I
> can't run them!) then we end at this point with p1 having a reference
> to c1, but c1 also having a new type of pseudo-parent.
>  
yes
> At this point c1 has a effective reference count of 1, so a single
> free will get rid of it.
>  
no. A single free will do nothing, the reference will be left until p2
goes away or p2 explicitly unlinks itself; otherwise we get dangling
references and suffer the ambiguity that the "single free" would not
know which reference it was removing.
> Now we do this:
>
>     talloc_steal(p3, r1);
>
> and now, if I have understood your patches correctly, p3 becomes the
> 'real' parent, but the effective reference count of c1 has increased!
> It will now take two free operations for it to go away.

As above, it will take (one talloc_unlink), and (one talloc_free or
talloc_unlink).

> So in total we
> will need 3 frees to get rid of this pointer if you use a steal, and
> two frees if you don't use a steal.
>  
Yes, but the mismatch isn't as it appears, the context must be from the
perspective of writing code and knowing what to write, not from the
runtime perspective object being referenced - as it is too hard for
humans to weave their minds along the code paths followed by an object.

It follows these rules:

If you talloc you must make arrangements for free (directly or implicitly)
If you steal you must make arrangements for free.

It's actually impossible/undesirable to have to detect if the object you
have (which may only be alive because it is referenced) has been
attemptedly freed. If you steal, you are "adopting" without regard to
whether or not the original parent has given up, and so you have the
obligation to free.

In regular malloc/free code transfer of owership of a pointer needs no
special treatment because the pointer itself does not track the owner.
talloc introduces talloc_steal when such ownership is transferred
because the owner is tracked.

> So that means that talloc_steal() now will, sometimes, change the
> reference count of a pointer. That is not a property it has had
> before, and it is not a desirable property I think.
>  
I now see your objection. I think it is one of perception because you
refer to tc->parent as the "real parent" and feel that steal and free
operate on the real parent.

I tried to cover this in the talloc "reframing".

Consider that the pointer in tc-> parent has absolutely no
hierachical/tree significance, and that we enforce talloc and
talloc_steal to have to actually add the parent to tc->refs for it to be
a reference.

tc->parent merely becomes a convenience pointer in order to provide the
implicit argument for talloc_free and talloc_steal. Of course this
argument can be good for one use only! (talloc_steal replenishes it with
the new value).

That is how the ambiguity is removed while preserving the ability to use
talloc_free in all code that has the expectation of knowing who the
tc->parent is on account of the passing of ownership being implicit in
the code. This doesn't include code which says implicitly "I know who
all the references are and in what order" because this is too hard to know.

However, if tc->parent must always also be added to tc->refs we don't
actually have to go to the trouble of doing it.
> Have I interpreted the patches correctly, or should I wait until I can
> actually run the code?
>  

I think there was some confusion over repeated calls to free. I hope I
cleared these up.

Sam

Re: the sorry saga of the talloc soname 'fix'

by Sam Liddicott :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

* Stefan (metze) Metzmacher wrote, On 09/07/09 11:21:
> None of this 4 patches passes make test in the a standalone build,
> so there's nothing that can be judged :-(
>  
Thanks for pointing this out. I can't account for it as I recently did a
make test but I will look into it.
(I've had a samba4 make test fail if I accidentally built samba against
the already installed talloc)
> I've played with it here (cherry picked, reverted, cherry-picked
> combinations of the 4 patches)
> http://gitweb.samba.org/?p=metze/samba/wip.git;a=shortlog;h=refs/heads/master4-talloc-sam
>
> It's based on a version of talloc before tridge change it see:
> http://gitweb.samba.org/?p=metze/samba/wip.git;a=shortlog;h=refs/heads/master4-talloc-old
>  
thanks for trying all those ways, I don't understand why it failed
unless you were accidentally testing the installed pre-patched talloc
with the patched tests, but I'll follow your request below and provide 3
patches which work on e129384d7c1df664e447186673dd107e190e2894
> I would be really interested in a working patch
> (without old and ugly formated comments in it).
> Please provide the minimal patch to talloc.c to implement
> your desired behavior and then maybe a patch to fix existing tests
> (hopefully not needed)
needed
> and a patch that adds new tests.
>
>  
I'll provide these.

> I need something that I can apply, test and review without
> having to dig through unrelated changes, comments.
> And it should actually work at least for the standalone build,
> but also in the source4 and source3 'make test' cases.
>  
I'll also try the source3 make test.

Thank-you for your continued attention and help with this originally.

Sam

Re: the sorry saga of the talloc soname 'fix'

by simo-7 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Thu, 2009-07-09 at 12:31 +0100, Sam Liddicott wrote:
>
> I think there was some confusion over repeated calls to free. I hope I
> cleared these up.

Sam, I am sorry to say that you aren't always very clear :-)
But that doesn't mean you may not be right.

On the current behavior after Tridge's patch:
---------------------------------------------

The side effect of the patch is that now talloc_steal() and
talloc_free() may return errors a lot more easily, and not for
catastrofic reasons (not for ENOMEM or for corrupted hierarchies/failing
destructors).

Returning errors in talloc_free()/talloc_steal(), would make
talloc_reference() almost impossible to use, a pariah, because you are
going to cause problems to *other* people code if you snatch a
reference.

talloc_steal() would be an especially bad case. If you don't check the
return, as it could lead to access to freed memory. Unchecked errors
from talloc_free() would lead to memory leaks.

But worst of all there is no recovery, if I get an error from
talloc_steal() or talloc_free() what should I do ? abort() ?
That would be quite insane IMHO.

talloc_reference() exist to keep an object alive when we *don't* know or
control its hierarchy/lifespan, if we *do* know/control the
hierarchy/lifetime then most probably you don't need talloc_reference().
So, basically, returning errors from talloc_free()/talloc_steal() kills
most of talloc_reference() utility.


On Sam's proposal:
------------------

What I like about this proposal (if I understand it) is that
talloc_free() and talloc_steal() do not return errors just because there
is a reference on an object, which is good IMO.


Let me try to sum up what a talloc with your changes would look like and
see if I got what you mean:

1. plain talloc()
Basically you say that any new allocation is also an implicit reference,
so when I allocate c1 on p1 then c1 has p1 as parent (which is just an
implicit *default* reference).

2. talloc_reference()
Here you say that talloc_reference() is just adding a reference to an
existing memory context. if you have talloc_reference(p2, c1) this means
c1 has just a new reference. the only way to remove this reference is by
calling talloc_unlink(p2, c1)
2.1 Q. does talloc_free(p2) also automatically free c1 if p2 was the
last reference for c1 but not the parent ? Or do you need to set an
explicit destructor on p2 to call talloc_unlink(p2,c1) ?

3. talloc_steal()
For normal cases talloc_steal(p3, c1) just changes the parent and at all
effects c1 changed ownership.
For cases where it would happen that c1 was alive only because of
additional references then talloc_steal ADDs a new parent.
Therefore you still need to talloc_free(c1) (or more frequently just p3)
and to talloc_unlink(p2, c1)
It's unlikely you are going to mix talloc_steal() and talloc_reference()
in the same functions anyway, so the case where a talloc_steal() is
called on a freed object is going to be rare.

4. talloc_free()
A free only removes the parent, it *never* promotes a reference to be
the new parent, if there are no more references the object is actually
really freed. If there are other references the object will not be freed
until each referencing object is actually freed or an explicit
talloc_unlink(ref, c1) is called.
If 2.1 holds true then I see no major problem here, if "all parents" are
freed the object is freed as well, and as long as one referencing parent
exists the object exists
4.1 What will happen if I call talloc_free() on an object that has no
parent ? Do we get an error ?

Problem:
How do you identify the parent as freed? NULL can't be used as it is a
valid parent. We may need to introduce a boolean:
tc->parent_allocated = true/false


Conclusions:
------------

Your patch would change fundamentally just one thing:
if you used to use talloc_free() to directly kill a reference then you
may get errors/memleaks depending on the implementation.
This is not going to be a huge problem in most cases and is going to be
contained in the whereabouts of the original talloc_reference() so that
it should be easy to *know* who is the parent to actually perform a
talloc_unlink()
(It may be nice to have a talloc_unreference() call perhaps to lessen
confusion for the API users).

It still means some code need to be changed, but that is equally true
with current Tridge's patch, the difference is that changes should be
localized around uses of talloc_reference() in most sane cases.

The other *good* difference is that only the code using
talloc_reference() will need to worry about the effects of
talloc_reference(), code not using it can keep being the usual simple
combination of talloc(), talloc_steal(), talloc_free() and no ill
effects will be felt if some foreign code takes a reference on some
memory object. The only effect will be (as expected) that the referenced
memory will be kept alive for some longer time.


Do I get this right ?

Simo.

--
Simo Sorce
Samba Team GPL Compliance Officer <simo@...>
Principal Software Engineer at Red Hat, Inc. <simo@...>


Re: the sorry saga of the talloc soname 'fix'

by Sam Liddicott :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

As a spoiler, Simo worked out what I was trying to say!
[cue festival music, etc]

* simo wrote, On 09/07/09 13:55:
> On Thu, 2009-07-09 at 12:31 +0100, Sam Liddicott wrote:
>  
>> I think there was some confusion over repeated calls to free. I hope I
>> cleared these up.
>>    
>
> Sam, I am sorry to say that you aren't always very clear :-)
>  
I've only just worked out what usage model of talloc is being followed
by the rest of you! (More on my enlightenment below)

> ...
>
> On Sam's proposal:
> ------------------
> ...
>
> 1. plain talloc()
> Basically you say that any new allocation is also an implicit reference,
> so when I allocate c1 on p1 then c1 has p1 as parent (which is just an
> implicit *default* reference).
>  
Yes. it could be an explicit reference, but why go to the trouble...
> 2. talloc_reference()
> Here you say that talloc_reference() is just adding a reference to an
> existing memory context. if you have talloc_reference(p2, c1) this means
> c1 has just a new reference. the only way to remove this reference is by
> calling talloc_unlink(p2, c1)
>  
Or automatically when p2 goes away (explicitly or when it's parents go
away etc)
> 2.1 Q. does talloc_free(p2) also automatically free c1 if p2 was the
> last reference for c1 but not the parent ? Or do you need to set an
> explicit destructor on p2 to call talloc_unlink(p2,c1) ?
>  
if p2 was the last reference for c1 (meaning c1's implicit reference in
tc->parent is also gone) then c1 will be freed automatically.

> 3. talloc_steal()
> For normal cases talloc_steal(p3, c1) just changes the parent and at all
> effects c1 changed ownership.
> For cases where it would happen that c1 was alive only because of
> additional references then talloc_steal ADDs a new parent.
>  
I would say talloc_steal always adds a new parent, it just doesn't
remove the old one because it's already been removed. But yes, there is
one more parent than before. The new parent is an implicit reference as
it is held in tc->parent.
> Therefore you still need to talloc_free(c1) (or more frequently just p3)
> and to talloc_unlink(p2, c1)
>  
yes. But you don't need to think about as long as each stealer knows it
has a free call to match, and each talloc knows it has a stealer or free
call to match.
> It's unlikely you are going to mix talloc_steal() and talloc_reference()
> in the same functions anyway, so the case where a talloc_steal() is
> called on a freed object is going to be rare.
>  
yes.
> 4. talloc_free()
> A free only removes the parent, it *never* promotes a reference to be
> the new parent, if there are no more references the object is actually
> really freed. If there are other references the object will not be freed
> until each referencing object is actually freed or an explicit
> talloc_unlink(ref, c1) is called.
>  
yes
> If 2.1 holds true then I see no major problem here, if "all parents" are
> freed the object is freed as well, and as long as one referencing parent
> exists the object exists
>  
yes
> 4.1 What will happen if I call talloc_free() on an object that has no
> parent ? Do we get an error ?
>  
On reflection, I think we should. If not, it would only be a "lucky
reference" that stopped what would have been a double-free.
> Problem:
> How do you identify the parent as freed? NULL can't be used as it is a
> valid parent. We may need to introduce a boolean:
> tc->parent_allocated = true/false
>  
I tried using a flag, but in the end went for a special singleton
no_owner_context owner, where no_owner_context is an internal talloc
context.
>
> Conclusions:
> ------------
>
> Your patch would change fundamentally just one thing:
> if you used to use talloc_free() to directly kill a reference then you
> may get errors/memleaks depending on the implementation.
>  
yes. And code that wasn't getting these errors or leaks might start to
get them just because other code has started taking references or is now
taking references at a different time. Which is very possible in an
asynchronous server.
> This is not going to be a huge problem in most cases and is going to be
> contained in the whereabouts of the original talloc_reference() so that
> it should be easy to *know* who is the parent to actually perform a
> talloc_unlink()
> (It may be nice to have a talloc_unreference() call perhaps to lessen
> confusion for the API users).
>  
It would be nice.
> It still means some code need to be changed, but that is equally true
> with current Tridge's patch, the difference is that changes should be
> localized around uses of talloc_reference() in most sane cases.
>
> The other *good* difference is that only the code using
> talloc_reference() will need to worry about the effects of
> talloc_reference(),
yes
> code not using it can keep being the usual simple
> combination of talloc(), talloc_steal(), talloc_free() and no ill
> effects will be felt if some foreign code takes a reference on some
> memory object.
which is exactly what I've been doing
>  The only effect will be (as expected) that the referenced
> memory will be kept alive for some longer time.
>  
yes. The other effect might be that the destructor gets called later. If
there were non-object related destruction effects it might be
troublesome if these were called later. I think such would be bad form
anyway.
>
> Do I get this right ?
>  
You do!
I owe you a meal over which you can tell me what I might have done to
explain better...

I've been enlightened into what was going on;

talloc users hoped to do:

ref = talloc_reference(me, thing);

and then later be able to do:

talloc_free(ref);

The pointer ref is seen as having life in it's own right; I had only
seen "ref" as a being the same as "thing" (except segfaultingly NULL if
the reference failed) and therefore needing to be un-referenced rather
than freed.


I confess that I had never imagined this as it seems so unhealthy in
inherently risky.

I guess if ref were some kind of C++ auto-dereferencing pointer it could
be made to work, but now I see the reasons why talloc_free would remove
the latest reference in the hopes that it was the right one.

I see that from the point of view of this usage model it would have been
hard to work out the benefit of what I was suggesting.

I thank all those who made the effort to understand, and Simo for
succeeding, Metze and Volker and many others for letting me confuse them
repeatedly, and Tridge for writing talloc (which is pretty sweet), etc etc

Sam


Re: the sorry saga of the talloc soname 'fix'

by Sam Liddicott :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

* Stefan (metze) Metzmacher wrote, On 09/07/09 11:21:
> None of this 4 patches passes make test in the a standalone build,
> so there's nothing that can be judged :-(
One of the tests was supposed to fail:
    /* The ugly talloc de-child-looping code will delete p2's reference
       leaving p2 having a dangling pointer. p2's reference should remain */

It still requires some better memory loop management code to be written,
the test was failing in anticipation of that.
I'll fix up that test to expect current behaviour until we have the loop
discussion some other week after we all had a rest.

I'm re-factoring the patches as you asked and will post them soon.

Sam

Re: the sorry saga of the talloc soname 'fix'

by Volker Lendecke :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Thu, Jul 09, 2009 at 08:55:34AM -0400, simo wrote:
> talloc_steal() would be an especially bad case. If you don't check the
> return, as it could lead to access to freed memory. Unchecked errors
> from talloc_free() would lead to memory leaks.

Wait a second -- talloc_steal() can fail now?????

How on earth shall we deal with that??

Volker


signature.asc (204 bytes) Download Attachment

Re: the sorry saga of the talloc soname 'fix'

by simo-7 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Thu, 2009-07-09 at 15:40 +0200, Volker Lendecke wrote:
> On Thu, Jul 09, 2009 at 08:55:34AM -0400, simo wrote:
> > talloc_steal() would be an especially bad case. If you don't check the
> > return, as it could lead to access to freed memory. Unchecked errors
> > from talloc_free() would lead to memory leaks.
>
> Wait a second -- talloc_steal() can fail now?????
>
> How on earth shall we deal with that??

Yes when I realized that I was a bit shocked too.

That's what made me look a bit better at Sam's proposal.

I now think that Sam's proposal is saner than Tridge's solution.

Simo.

--
Simo Sorce
Samba Team GPL Compliance Officer <simo@...>
Principal Software Engineer at Red Hat, Inc. <simo@...>


Re: the sorry saga of the talloc soname 'fix'

by Sam Liddicott :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

* Volker Lendecke wrote, On 09/07/09 14:40:
> On Thu, Jul 09, 2009 at 08:55:34AM -0400, simo wrote:
>  
>> talloc_steal() would be an especially bad case. If you don't check the
>> return, as it could lead to access to freed memory. Unchecked errors
>> from talloc_free() would lead to memory leaks.
>>    
>
> Wait a second -- talloc_steal() can fail now?????
>  
It will fail no more than it could before.
The only line of talloc_steal affected in the patch was shameful removal
of white space.

Sam
< Prev | 1 - 2 - 3 - 4 | Next >