Re: [SCM] Samba Shared Repository - branch master updated - release-4-0-0alpha8-205-g7119241

View: New views
20 Messages — Rating Filter:   Alert me  
< Prev | 1 - 2 - 3 - 4 | Next >

Re: the sorry saga of the talloc soname 'fix'

by tridge@samba.org :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi Steve,

If the symbol versioning can be done outside of the core code, then
fine. It could be in a lib/talloc/packaging/ directory for example. If
needed then that directory could even have a patch in it that modifies
talloc.c if that is absolutely required.

The problem I see with symbol versioning is that it is so system
specific and so complex to maintain. Symbol versioning on Solaris
(where it was invented) is different to symbol versioning on
Linux. From reading the treatise on shared library maintenance by
Ulrich Drepper it seems that symbol versioning even has architecture
specific aspects to it within Linux.

So to me it should not go in the core code. If you think it needs to
be in a common place so that deb and RPM based distros coordinate,
then put it in a packaging/ subdirectory.

In all of this please also remember just how much of the Samba
codebase is now in shared libs. The librpc code alone is over 500k
lines of code, including the generated code. That lib is used by
external projects, and may be called both directly and indirectly by
apps, just like talloc is.

Whatever method we establish for handling shared libs now is not just
for talloc. It applies to a very large proportion of the projects code
base. I do not want that entire code base constrained by external
shared lib concerns. I do not want the maintenance of the shared lib
ABI to become the responsibility of every contributor to Samba.

Cheers, Tridge

Re: Why shared libraries have version numbers

by Steve Langasek :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Tue, Jul 07, 2009 at 02:59:16PM -0400, David Collier-Brown wrote:
>    I was being humorous: you're English is perfectly fine!

>    I was suggesting the first thing you do is tell the programmer
> using two versions of the same library to not do that!

There's no reason to think that a programmer is involved at all.  This kind
of thing can happen when half of the reverse-deps of a library have been
rebuilt and the other have not; perhaps because the left hand doesn't know
what the right hand is doing, perhaps because someone didn't realize that
other bits of the system were using the library.  So this is usually the
work of sysadmins and distro packagers, not programmers.

The traditional soname system does a good job, overall, of preventing users
from winding up with broken combinations that give mysterious failures, but
this is one case where the traditional soname isn't enough to do the job.

--
Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                   to set it on, and I can move the world.
Ubuntu Developer                                    http://www.debian.org/
slangasek@...                                     vorlon@...

Re: the sorry saga of the talloc soname 'fix'

by Sam Liddicott :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Although the soname debate is important (and how we handle soname
bumping and ABI changes), I want to draw some attention back to the
talloc fix.

I believe these comments will also provide some relief for the immediate
soname problem.

* Andrew Bartlett wrote, On 07/07/09 08:10:

>
> I think Volker is right, that given we can easily (for some definition
> of easily) avoid changing the ABI, then it's an easier way out of this
> mess.  Had you proposed a more radical solution, there would perhaps
> have been less resistance :-)
>
> But I wonder if we really have preserved the ABI.  Given that
> talloc_free() and talloc_steal() now does a different thing, won't it
> now allow a use-after-free?  This would be an ABI change (even if not a
> signature change), and justify the change.
>
> This is more than just a cleanup of possible memory leaks, isn't it?
>  

Andrew; you make a good point.

This change only means that talloc_free merely has a different conflict
with talloc_reference.

Before this change talloc_reference/talloc_unreference had to be avoided
for fear of runtime talloc errors.

After this change, talloc_free must be avoided for the same fear.

I don't know about others views, but I don't think this is any kind of fix.

talloc_free is not safe to use, so we may as well get rid of it
altogether and thus fully justify the .so version bump.


However I have a suggestion, which if followed will avoid this problem,
and the immediate issues of this "sorry soname saga".

I really really think it is the solution to the current drawn out debate.

Really really.

As no-one has /shown/ any evidence of understanding what I've said
before or of how it helps or even if it makes a difference, I'm offering
a solution here. (The only objection I ever had to it was from tridge
who felt that it was no different from what we had before).

Please make the effort to understand what I suggest here, and then
please make a judgement on whether or not this true talloc tip will
solve the whole problem including the sorry soname saga in this instance.

[From my own point of view I'm going to let this go now, but it concerns
me to see this needless long debate, and a new talloc which I cannot
ship with it's current talloc_free restrictions]

*The true talloc tip...*

Before I explain, some points must be acknowledged:

   1. There exists much code that uses talloc_steal/talloc_free
      ambiguously.
   2. This code needs fixing anyway.
   3. It is being fixed with the aid of tridge's patch

Before I explain, these points must be understood:

   1. talloc_reference, talloc_reparent, talloc_unlink are pure and holy
      and have zero ambiguity
   2. talloc_steal and talloc_free are said to be ambiguous (herein lies
      the entire problem).

These are the features of the solution:

   1. To make talloc_steal and talloc_free unambiguous
   2. To preserve the utility of talloc
   3. To preserve the API signatures and ABI
   4. To avoid a soname bump

To understand the problem these points must be understood:

   1. talloc_steal and talloc_free are like talloc_reparent and
      talloc_unlink except that they have an implicit "parent" argument
   2. the current ambiguity comes from knowing which parent to use as
      the implicit argument
   3. the solution must remove ambiguity of the implicit argument

The solution is to:

    *make the implicit argument of talloc_steal and talloc_free be the
    original allocating parent*

if we do this, then there is absolutely no ambiguity [see point 3 below]

Implications of this solution

   1. The solution can remove ambiguity while preserving the API
      signatures and ABI
      Really really
   2. It does NOT make buggy use of talloc_free suddenly become bug free
      Faulty use of talloc_free and talloc_steal is will still be faulty
   3. Just because talloc_steal and talloc_free are not ambigious
      doesn't mean your program is not ambiguous.
      Use of talloc/talloc_free is no more error-prone than use
      malloc/free - which doesn't say a lot, but it is a benchmark.
      If your code is too complicated for malloc/free then you should
      not be using talloc/talloc_free but talloc/talloc_unlink.
   4. If you don't want the original allocating parent to have any
      special treatment then don't use talloc_free/talloc_steal
      you can't use them now anyway unless you can swear that no-one
      else is taking a reference to your object
   5. Arguably we didn't preserve the ABI as we slightly changed the
      meaning of talloc_steal/talloc_unfree
      We did remove an ambiguity though, one which caused many uses of
      talloc_steal/talloc_free to be broken

And, (here I really do start to repeat myself) we get all of this merely by:

   1. preventing the internal promotion of reference to parent
   2. using a "no_owner" place holder as parent when talloc_free is
      called while references are still held. This no_owner does not
      have the power to prevent destruction when the references all go.
      This no_owner can be stolen from.

The patches for this, including patches to the test suite can be found in:
   http://lists.samba.org/archive/samba-technical/2009-July/065483.html

The specific "no_owner" patch is the third attachment. Much old code in
that patch is commented out rather than removed because I was
anticipating some discussion on the way I implemented it. [I shall
remove the commented out code in my next rebase anyway as it "works for me"]

I don't intend to make these points again, but as said before, the only
person who showed any evidence of having read the solution didn't seem
to think it made any difference.

Sam

Re: the sorry saga of the talloc soname 'fix'

by ronnie sahlberg :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

> Before I explain, some points must be acknowledged:
>
>   1. There exists much code that uses talloc_steal/talloc_free
>      ambiguously.
>   2. This code needs fixing anyway.
...
> These are the features of the solution:
>
>   1. To make talloc_steal and talloc_free unambiguous

There is nothing inherently ambigous with talloc_steal and talloc_free.
I use them a lot in CTDB and there is aboslutely no need to fix
anything regarding these functions or their use there.

These functions are NOT ambiguous by themself. They are only ambiguous
when they are used together with _reference/unreference. Never else.


ronnie sahlberg



On Wed, Jul 8, 2009 at 6:15 PM, Sam Liddicott<sam@...> wrote:

> Although the soname debate is important (and how we handle soname
> bumping and ABI changes), I want to draw some attention back to the
> talloc fix.
>
> I believe these comments will also provide some relief for the immediate
> soname problem.
>
> * Andrew Bartlett wrote, On 07/07/09 08:10:
>>
>> I think Volker is right, that given we can easily (for some definition
>> of easily) avoid changing the ABI, then it's an easier way out of this
>> mess.  Had you proposed a more radical solution, there would perhaps
>> have been less resistance :-)
>>
>> But I wonder if we really have preserved the ABI.  Given that
>> talloc_free() and talloc_steal() now does a different thing, won't it
>> now allow a use-after-free?  This would be an ABI change (even if not a
>> signature change), and justify the change.
>>
>> This is more than just a cleanup of possible memory leaks, isn't it?
>>
>
> Andrew; you make a good point.
>
> This change only means that talloc_free merely has a different conflict
> with talloc_reference.
>
> Before this change talloc_reference/talloc_unreference had to be avoided
> for fear of runtime talloc errors.
>
> After this change, talloc_free must be avoided for the same fear.
>
> I don't know about others views, but I don't think this is any kind of fix.
>
> talloc_free is not safe to use, so we may as well get rid of it
> altogether and thus fully justify the .so version bump.
>
>
> However I have a suggestion, which if followed will avoid this problem,
> and the immediate issues of this "sorry soname saga".
>
> I really really think it is the solution to the current drawn out debate.
>
> Really really.
>
> As no-one has /shown/ any evidence of understanding what I've said
> before or of how it helps or even if it makes a difference, I'm offering
> a solution here. (The only objection I ever had to it was from tridge
> who felt that it was no different from what we had before).
>
> Please make the effort to understand what I suggest here, and then
> please make a judgement on whether or not this true talloc tip will
> solve the whole problem including the sorry soname saga in this instance.
>
> [From my own point of view I'm going to let this go now, but it concerns
> me to see this needless long debate, and a new talloc which I cannot
> ship with it's current talloc_free restrictions]
>
> *The true talloc tip...*
>
> Before I explain, some points must be acknowledged:
>
>   1. There exists much code that uses talloc_steal/talloc_free
>      ambiguously.
>   2. This code needs fixing anyway.
>   3. It is being fixed with the aid of tridge's patch
>
> Before I explain, these points must be understood:
>
>   1. talloc_reference, talloc_reparent, talloc_unlink are pure and holy
>      and have zero ambiguity
>   2. talloc_steal and talloc_free are said to be ambiguous (herein lies
>      the entire problem).
>
> These are the features of the solution:
>
>   1. To make talloc_steal and talloc_free unambiguous
>   2. To preserve the utility of talloc
>   3. To preserve the API signatures and ABI
>   4. To avoid a soname bump
>
> To understand the problem these points must be understood:
>
>   1. talloc_steal and talloc_free are like talloc_reparent and
>      talloc_unlink except that they have an implicit "parent" argument
>   2. the current ambiguity comes from knowing which parent to use as
>      the implicit argument
>   3. the solution must remove ambiguity of the implicit argument
>
> The solution is to:
>
>    *make the implicit argument of talloc_steal and talloc_free be the
>    original allocating parent*
>
> if we do this, then there is absolutely no ambiguity [see point 3 below]
>
> Implications of this solution
>
>   1. The solution can remove ambiguity while preserving the API
>      signatures and ABI
>      Really really
>   2. It does NOT make buggy use of talloc_free suddenly become bug free
>      Faulty use of talloc_free and talloc_steal is will still be faulty
>   3. Just because talloc_steal and talloc_free are not ambigious
>      doesn't mean your program is not ambiguous.
>      Use of talloc/talloc_free is no more error-prone than use
>      malloc/free - which doesn't say a lot, but it is a benchmark.
>      If your code is too complicated for malloc/free then you should
>      not be using talloc/talloc_free but talloc/talloc_unlink.
>   4. If you don't want the original allocating parent to have any
>      special treatment then don't use talloc_free/talloc_steal
>      you can't use them now anyway unless you can swear that no-one
>      else is taking a reference to your object
>   5. Arguably we didn't preserve the ABI as we slightly changed the
>      meaning of talloc_steal/talloc_unfree
>      We did remove an ambiguity though, one which caused many uses of
>      talloc_steal/talloc_free to be broken
>
> And, (here I really do start to repeat myself) we get all of this merely by:
>
>   1. preventing the internal promotion of reference to parent
>   2. using a "no_owner" place holder as parent when talloc_free is
>      called while references are still held. This no_owner does not
>      have the power to prevent destruction when the references all go.
>      This no_owner can be stolen from.
>
> The patches for this, including patches to the test suite can be found in:
>   http://lists.samba.org/archive/samba-technical/2009-July/065483.html
>
> The specific "no_owner" patch is the third attachment. Much old code in
> that patch is commented out rather than removed because I was
> anticipating some discussion on the way I implemented it. [I shall
> remove the commented out code in my next rebase anyway as it "works for me"]
>
> I don't intend to make these points again, but as said before, the only
> person who showed any evidence of having read the solution didn't seem
> to think it made any difference.
>
> Sam
>

Re: the sorry saga of the talloc soname 'fix'

by Volker Lendecke :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wed, Jul 08, 2009 at 09:15:52AM +0100, Sam Liddicott wrote:
> talloc_free is not safe to use, so we may as well get rid of it
> altogether and thus fully justify the .so version bump.

Wow. That's strong.

I'd rather not have to remember everywhere I use talloc
which the parent is. I'd much more prefer to go back to
talloc being the "hierarchical" memory allocator with a
strict tree. I don't see how without talloc_reference
talloc_free becomes ambiguous.

Volker


signature.asc (204 bytes) Download Attachment

Re: the sorry saga of the talloc soname 'fix'

by Michael Adam-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

ronnie sahlberg wrote:

> > Before I explain, some points must be acknowledged:
> >
> >   1. There exists much code that uses talloc_steal/talloc_free
> >      ambiguously.
> >   2. This code needs fixing anyway.
> ...
> > These are the features of the solution:
> >
> >   1. To make talloc_steal and talloc_free unambiguous
>
> There is nothing inherently ambigous with talloc_steal and talloc_free.
> I use them a lot in CTDB and there is aboslutely no need to fix
> anything regarding these functions or their use there.
>
> These functions are NOT ambiguous by themself. They are only ambiguous
> when they are used together with _reference/unreference. Never else.
Precisely!

The remedy for all our problems is to just get rid of
talloc_reference/talloc_unreference, the source of all
evil and grief.

Frankly, I personally can't imagine why one would use them at all...

Cheers - Michael


> ronnie sahlberg
>
>
>
> On Wed, Jul 8, 2009 at 6:15 PM, Sam Liddicott<sam@...> wrote:
> > Although the soname debate is important (and how we handle soname
> > bumping and ABI changes), I want to draw some attention back to the
> > talloc fix.
> >
> > I believe these comments will also provide some relief for the immediate
> > soname problem.
> >
> > * Andrew Bartlett wrote, On 07/07/09 08:10:
> >>
> >> I think Volker is right, that given we can easily (for some definition
> >> of easily) avoid changing the ABI, then it's an easier way out of this
> >> mess.  Had you proposed a more radical solution, there would perhaps
> >> have been less resistance :-)
> >>
> >> But I wonder if we really have preserved the ABI.  Given that
> >> talloc_free() and talloc_steal() now does a different thing, won't it
> >> now allow a use-after-free?  This would be an ABI change (even if not a
> >> signature change), and justify the change.
> >>
> >> This is more than just a cleanup of possible memory leaks, isn't it?
> >>
> >
> > Andrew; you make a good point.
> >
> > This change only means that talloc_free merely has a different conflict
> > with talloc_reference.
> >
> > Before this change talloc_reference/talloc_unreference had to be avoided
> > for fear of runtime talloc errors.
> >
> > After this change, talloc_free must be avoided for the same fear.
> >
> > I don't know about others views, but I don't think this is any kind of fix.
> >
> > talloc_free is not safe to use, so we may as well get rid of it
> > altogether and thus fully justify the .so version bump.
> >
> >
> > However I have a suggestion, which if followed will avoid this problem,
> > and the immediate issues of this "sorry soname saga".
> >
> > I really really think it is the solution to the current drawn out debate.
> >
> > Really really.
> >
> > As no-one has /shown/ any evidence of understanding what I've said
> > before or of how it helps or even if it makes a difference, I'm offering
> > a solution here. (The only objection I ever had to it was from tridge
> > who felt that it was no different from what we had before).
> >
> > Please make the effort to understand what I suggest here, and then
> > please make a judgement on whether or not this true talloc tip will
> > solve the whole problem including the sorry soname saga in this instance.
> >
> > [From my own point of view I'm going to let this go now, but it concerns
> > me to see this needless long debate, and a new talloc which I cannot
> > ship with it's current talloc_free restrictions]
> >
> > *The true talloc tip...*
> >
> > Before I explain, some points must be acknowledged:
> >
> >   1. There exists much code that uses talloc_steal/talloc_free
> >      ambiguously.
> >   2. This code needs fixing anyway.
> >   3. It is being fixed with the aid of tridge's patch
> >
> > Before I explain, these points must be understood:
> >
> >   1. talloc_reference, talloc_reparent, talloc_unlink are pure and holy
> >      and have zero ambiguity
> >   2. talloc_steal and talloc_free are said to be ambiguous (herein lies
> >      the entire problem).
> >
> > These are the features of the solution:
> >
> >   1. To make talloc_steal and talloc_free unambiguous
> >   2. To preserve the utility of talloc
> >   3. To preserve the API signatures and ABI
> >   4. To avoid a soname bump
> >
> > To understand the problem these points must be understood:
> >
> >   1. talloc_steal and talloc_free are like talloc_reparent and
> >      talloc_unlink except that they have an implicit "parent" argument
> >   2. the current ambiguity comes from knowing which parent to use as
> >      the implicit argument
> >   3. the solution must remove ambiguity of the implicit argument
> >
> > The solution is to:
> >
> >    *make the implicit argument of talloc_steal and talloc_free be the
> >    original allocating parent*
> >
> > if we do this, then there is absolutely no ambiguity [see point 3 below]
> >
> > Implications of this solution
> >
> >   1. The solution can remove ambiguity while preserving the API
> >      signatures and ABI
> >      Really really
> >   2. It does NOT make buggy use of talloc_free suddenly become bug free
> >      Faulty use of talloc_free and talloc_steal is will still be faulty
> >   3. Just because talloc_steal and talloc_free are not ambigious
> >      doesn't mean your program is not ambiguous.
> >      Use of talloc/talloc_free is no more error-prone than use
> >      malloc/free - which doesn't say a lot, but it is a benchmark.
> >      If your code is too complicated for malloc/free then you should
> >      not be using talloc/talloc_free but talloc/talloc_unlink.
> >   4. If you don't want the original allocating parent to have any
> >      special treatment then don't use talloc_free/talloc_steal
> >      you can't use them now anyway unless you can swear that no-one
> >      else is taking a reference to your object
> >   5. Arguably we didn't preserve the ABI as we slightly changed the
> >      meaning of talloc_steal/talloc_unfree
> >      We did remove an ambiguity though, one which caused many uses of
> >      talloc_steal/talloc_free to be broken
> >
> > And, (here I really do start to repeat myself) we get all of this merely by:
> >
> >   1. preventing the internal promotion of reference to parent
> >   2. using a "no_owner" place holder as parent when talloc_free is
> >      called while references are still held. This no_owner does not
> >      have the power to prevent destruction when the references all go.
> >      This no_owner can be stolen from.
> >
> > The patches for this, including patches to the test suite can be found in:
> >   http://lists.samba.org/archive/samba-technical/2009-July/065483.html
> >
> > The specific "no_owner" patch is the third attachment. Much old code in
> > that patch is commented out rather than removed because I was
> > anticipating some discussion on the way I implemented it. [I shall
> > remove the commented out code in my next rebase anyway as it "works for me"]
> >
> > I don't intend to make these points again, but as said before, the only
> > person who showed any evidence of having read the solution didn't seem
> > to think it made any difference.
> >
> > Sam


attachment0 (214 bytes) Download Attachment

Re: the sorry saga of the talloc soname 'fix'

by Sam Liddicott :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

* ronnie sahlberg wrote, On 08/07/09 09:27:

>> Before I explain, some points must be acknowledged:
>>
>>   1. There exists much code that uses talloc_steal/talloc_free
>>      ambiguously.
>>   2. This code needs fixing anyway.
>>    
> ...
>  
>> These are the features of the solution:
>>
>>   1. To make talloc_steal and talloc_free unambiguous
>>    
>
> There is nothing inherently ambigous with talloc_steal and talloc_free.
> I use them a lot in CTDB and there is aboslutely no need to fix
> anything regarding these functions or their use there.
>  
That's OK, cos I wasn't fixing anything regarding their use there, and
the fix I suggested won't interfere in the slightest with the use there.
> These functions are NOT ambiguous by themself. They are only ambiguous
> when they are used together with _reference/unreference. Never else.
>  
The bit when you said "they are only ambiguous when...." - that's what
we're talking about. I'm glad that you recognize where the ambiguity exists.

I spent a good amount of time trying to put together a post where people
would consider the suggestion and not get side-tracked.

I find it funny when you effectively say "ah, but I can think of a use
when they are not ambiguous" on a thread where we are talking about the
use when they are ambiguous.

Now that we understood that when I call them ambiguous I'm am indeed
talking about the case where you recognize them as being ambiguous, I
would take it as a personal favour if you would read the suggestion I
made and post your judgement in response to that same post (not this one).

Sam


Re: the sorry saga of the talloc soname 'fix'

by Sam Liddicott :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

* Michael Adam wrote, On 08/07/09 09:42:

> ronnie sahlberg wrote:
>  
>>> Before I explain, some points must be acknowledged:
>>>
>>>   1. There exists much code that uses talloc_steal/talloc_free
>>>      ambiguously.
>>>   2. This code needs fixing anyway.
>>>      
>> ...
>>    
>>> These are the features of the solution:
>>>
>>>   1. To make talloc_steal and talloc_free unambiguous
>>>      
>> There is nothing inherently ambigous with talloc_steal and talloc_free.
>> I use them a lot in CTDB and there is aboslutely no need to fix
>> anything regarding these functions or their use there.
>>
>> These functions are NOT ambiguous by themself. They are only ambiguous
>> when they are used together with _reference/unreference. Never else.
>>    
>
> Precisely!
>
> The remedy for all our problems is to just get rid of
> talloc_reference/talloc_unreference, the source of all
> evil and grief.
>
> Frankly, I personally can't imagine why one would use them at all...
>  
Excellent a joke!

(for the humour impaired, talloc_reference was added for a reason. I
make heavy use of it, hence I'm the one providing the fix to unite
talloc_reference with talloc_free).

Sam

Re: the sorry saga of the talloc soname 'fix'

by Sam Liddicott :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

* Volker Lendecke wrote, On 08/07/09 09:40:

> On Wed, Jul 08, 2009 at 09:15:52AM +0100, Sam Liddicott wrote:
>  
>> talloc_free is not safe to use, so we may as well get rid of it
>> altogether and thus fully justify the .so version bump.
>>    
>
> Wow. That's strong.
>
> I'd rather not have to remember everywhere I use talloc
> which the parent is.
You only need to remember who the parent is if you want to explicitly free.
> I'd much more prefer to go back to
> talloc being the "hierarchical" memory allocator with a
> strict tree.
Good. I posted a suggestion on how to do it, but no-one will read it.
> I don't see how without talloc_reference
> talloc_free becomes ambiguous.
>  
It doesn't.
It becomes ambiguous through lack of specification on how they should
co-exist.

Sam


Re: the sorry saga of the talloc soname 'fix'

by Michael Adam-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Sam Liddicott wrote:

> * Michael Adam wrote, On 08/07/09 09:42:
> >
> > The remedy for all our problems is to just get rid of
> > talloc_reference/talloc_unreference, the source of all
> > evil and grief.
> >
> > Frankly, I personally can't imagine why one would use them at all...
> >  
> Excellent a joke!
>
> (for the humour impaired, talloc_reference was added for a reason. I
> make heavy use of it, hence I'm the one providing the fix to unite
> talloc_reference with talloc_free).
In fact, I am not joking. :-)

talloc_(un)reference turns the neat, simple tree of memory
allocations into a graph with cycles and whatnot. This is
the origin of the problems we are currently discussing.

And I really think that this additional complexity is avoidable.
Maybe this is due to the fact that I have up to now mainly worked
on the samba3 codebase where there is virtually no use of
talloc_reference, as opposed to samba4 which makes rather heavy
use of talloc_reference. I myself have never used it and I have
never felt the need to do so.

And given the problems that this creates, I really cannot see
why one would not try to avoid using talloc_reference by all
means (only because of sheer laziness.. ;-).

Sam, please excuse my ignorance, can you give me an example of a
situation that you could not have solved without talloc_reference?

Cheers - Michael



attachment0 (214 bytes) Download Attachment

Re: the sorry saga of the talloc soname 'fix'

by Sam Liddicott :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

* Michael Adam wrote, On 08/07/09 10:17:

> Sam Liddicott wrote:
>  
>> * Michael Adam wrote, On 08/07/09 09:42:
>>    
>>> The remedy for all our problems is to just get rid of
>>> talloc_reference/talloc_unreference, the source of all
>>> evil and grief.
>>>
>>> Frankly, I personally can't imagine why one would use them at all...
>>>  
>>>      
>> Excellent a joke!
>>
>> (for the humour impaired, talloc_reference was added for a reason. I
>> make heavy use of it, hence I'm the one providing the fix to unite
>> talloc_reference with talloc_free).
>>    
>
> In fact, I am not joking. :-)
>
> talloc_(un)reference turns the neat, simple tree of memory
> allocations into a graph with cycles and whatnot. This is
> the origin of the problems we are currently discussing.
>
> And I really think that this additional complexity is avoidable.
>  
The additional complexity can be managed quite simply (won't someone
please ready my solution instead of just arguing about whether or not
there is a real problem). We can even solve the cycles problem, I have
talked about it, but if no-one will read this solution there's no point
in talking about the solution to the cycles.

> Maybe this is due to the fact that I have up to now mainly worked
> on the samba3 codebase where there is virtually no use of
> talloc_reference, as opposed to samba4 which makes rather heavy
> use of talloc_reference. I myself have never used it and I have
> never felt the need to do so.
>
> And given the problems that this creates, I really cannot see
> why one would not try to avoid using talloc_reference by all
> means (only because of sheer laziness.. ;-).
>  

Of course you can't see why, because as you say you don't use it.

> Sam, please excuse my ignorance, can you give me an example of a
> situation that you could not have solved without talloc_reference?
>  
Isn't the fact that I've debugged the problem, the fact that I've
written the patches AND the test cases AND tried for around 8 months to
have someone pay attention to this enough for you to take me seriously?

Now, rather than have the fix reviewed, I'm to have a critical review of
my code by people who don't use or need to use talloc_reference to see
if I could possibly have worked around this feature that the talloc API
offers?

Why not just look at the fix? I'm using it, I have the benefitd, I'm
trying one more time to give it to Samba because could ALSO solves the
immediate soname bumping issue.

[I predict that someone will now talk about this rather than about the
actual value of the suggested fix]

Sam

Re: the sorry saga of the talloc soname 'fix'

by Stefan (metze) Metzmacher :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi Tridge,

> So to me it should not go in the core code. If you think it needs to
> be in a common place so that deb and RPM based distros coordinate,
> then put it in a packaging/ subdirectory.
>
> In all of this please also remember just how much of the Samba
> codebase is now in shared libs. The librpc code alone is over 500k
> lines of code, including the generated code. That lib is used by
> external projects, and may be called both directly and indirectly by
> apps, just like talloc is.

But librpc doesn't have a stable marked API yet.

> Whatever method we establish for handling shared libs now is not just
> for talloc. It applies to a very large proportion of the projects code
> base. I do not want that entire code base constrained by external
> shared lib concerns. I do not want the maintenance of the shared lib
> ABI to become the responsibility of every contributor to Samba.

I think talloc, tdb (and later tevent and ldb) are special
and we should treat them more as standalone libraries with a fixed API.
And there standalone build systems are independent from samba4.

The samba4 rpc and smb client libraries are really not stable yet,
but that's know to the callers openchange always needs to be build
against a fixed snapshot.

BTW: the problem with static and shared linking together is a completely
different thing and should be ignored in this discussion.

As I said before I've done some testing with compat libraries.
I created some very simple examples, which show how two versions of a
(liba.so.1 and liba.so.2) can be coexists within one process.

The trick is that the liba.so.1 compat library is build on top
of the public API of liba.so.2 and also links to it. That means the
compat library only provides symbols which are not in the new library
and it also do not know about private data of liba.so.2.

I tried some combinations which can happen:

libb.so.1 was build against liba.so.1 and is happy with the old
liba.so.1, but also with the compat version of liba.so.1 at runtime.

appa1 was build against liba.so.1 and is happy with the old liba.so.1,
but also with the compat version of liba.so.1 at runtime.

appa2 was build against liba.so.2 and is only happy with liba.so.2.

appa2b1 was build against liba.so.2 and libb.so.1 and is happy when at
runtime the compat version of liba.so.1 is implicit loaded via libb.so.1.

I think such a compat library could be an alternative to symbol
versioning and could be located in lib/talloc/compat/.

metze

The output of the runme.sh follows:

# make -C liba1 clean
make: Gehe in Verzeichnis '/tmp/junkcode/shared-libs/liba1'
rm -f *.o *.so*
make: Verlasse Verzeichnis '/tmp/junkcode/shared-libs/liba1'
# make -C liba1
make: Gehe in Verzeichnis '/tmp/junkcode/shared-libs/liba1'
ccache gcc -fPIC   -c -o liba.o liba.c
gcc -shared -o liba.so.1 -Wl,-soname=liba.so.1 liba.o
rm -f liba.so
ln -s liba.so.1 liba.so
make: Verlasse Verzeichnis '/tmp/junkcode/shared-libs/liba1'
# make -C liba2 clean
make: Gehe in Verzeichnis '/tmp/junkcode/shared-libs/liba2'
rm -f *.o *.so*
make: Verlasse Verzeichnis '/tmp/junkcode/shared-libs/liba2'
# make -C liba2
make: Gehe in Verzeichnis '/tmp/junkcode/shared-libs/liba2'
ccache gcc -fPIC   -c -o liba.o liba.c
gcc -shared -o liba.so.2 -Wl,-soname=liba.so.2 liba.o
rm -f liba.so
ln -s liba.so.2 liba.so
make: Verlasse Verzeichnis '/tmp/junkcode/shared-libs/liba2'
# make -C liba2-compat1 clean
make: Gehe in Verzeichnis '/tmp/junkcode/shared-libs/liba2-compat1'
rm -f *.o *.so*
make: Verlasse Verzeichnis '/tmp/junkcode/shared-libs/liba2-compat1'
# make -C liba2-compat1
make: Gehe in Verzeichnis '/tmp/junkcode/shared-libs/liba2-compat1'
ccache gcc -fPIC   -c -o liba2-compat1.o liba2-compat1.c
gcc -shared -o liba2-compat1.so -Wl,-soname=liba.so.1 liba2-compat1.o
../liba2/liba.so
rm -f liba.so.1
ln -s liba2-compat1.so liba.so.1
make: Verlasse Verzeichnis '/tmp/junkcode/shared-libs/liba2-compat1'
# make -C libb1 clean
make: Gehe in Verzeichnis '/tmp/junkcode/shared-libs/libb1'
rm -f *.o *.so*
make: Verlasse Verzeichnis '/tmp/junkcode/shared-libs/libb1'
# make -C libb1
make: Gehe in Verzeichnis '/tmp/junkcode/shared-libs/libb1'
ccache gcc -fPIC   -c -o libb.o libb.c
gcc -shared -o libb.so.1 -Wl,-soname=libb.so.1 libb.o ../liba1/liba.so
rm -f libb.so
ln -s libb.so.1 libb.so
make: Verlasse Verzeichnis '/tmp/junkcode/shared-libs/libb1'
# make -C appa1 clean
make: Gehe in Verzeichnis '/tmp/junkcode/shared-libs/appa1'
rm -f *.o *.so* appa1
make: Verlasse Verzeichnis '/tmp/junkcode/shared-libs/appa1'
# make -C appa1
make: Gehe in Verzeichnis '/tmp/junkcode/shared-libs/appa1'
ccache gcc    -c -o app.o app.c
gcc -o appa1 app.o ../liba1/liba.so
make: Verlasse Verzeichnis '/tmp/junkcode/shared-libs/appa1'
# make -C appa2 clean
make: Gehe in Verzeichnis '/tmp/junkcode/shared-libs/appa2'
rm -f *.o *.so* appa2
make: Verlasse Verzeichnis '/tmp/junkcode/shared-libs/appa2'
# make -C appa2
make: Gehe in Verzeichnis '/tmp/junkcode/shared-libs/appa2'
ccache gcc    -c -o app.o app.c
gcc -o appa2 app.o ../liba2/liba.so
make: Verlasse Verzeichnis '/tmp/junkcode/shared-libs/appa2'
# make -C appa2b1 clean
make: Gehe in Verzeichnis '/tmp/junkcode/shared-libs/appa2b1'
rm -f *.o *.so* appa2b1
make: Verlasse Verzeichnis '/tmp/junkcode/shared-libs/appa2b1'
# make -C appa2b1
make: Gehe in Verzeichnis '/tmp/junkcode/shared-libs/appa2b1'
ccache gcc    -c -o app.o app.c
gcc -o appa2b1 app.o ../liba2/liba.so ../libb1/libb.so
-Wl,-rpath-link,../liba1/
/usr/bin/ld: warning: liba.so.1, needed by ../libb1/libb.so, may
conflict with liba.so.2
make: Verlasse Verzeichnis '/tmp/junkcode/shared-libs/appa2b1'

# LD_LIBRARY_PATH=liba1 appa1/appa1
appa1: ...
liba1: libafunc1(1) = 0
appa1: ... 0

# LD_LIBRARY_PATH=liba1 ldd appa1/appa1
        linux-vdso.so.1 =>  (0x00007fff71ffe000)
        liba.so.1 => liba1/liba.so.1 (0x00007f2669ab8000)
        libc.so.6 => /lib/libc.so.6 (0x00007f2669756000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f2669cb9000)

# LD_LIBRARY_PATH=liba1 appa1/appa1
appa1: ...
liba1: libafunc1(1) = 0
appa1: ... 0

# LD_LIBRARY_PATH=liba2-compat1:liba2 ldd appa1/appa1
        linux-vdso.so.1 =>  (0x00007fff4d9fe000)
        liba.so.1 => liba2-compat1/liba.so.1 (0x00007f8a454a4000)
        libc.so.6 => /lib/libc.so.6 (0x00007f8a45142000)
        liba.so.2 => liba2/liba.so.2 (0x00007f8a44f41000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f8a456a5000)

# LD_LIBRARY_PATH=liba2-compat1:liba2 appa1/appa1
appa1: ...
liba2-compat1: libafunc1(1) ...
liba2: _libafunc1(1, liba2-compat1-location) = -1
liba2-compat1: ... -1 => 0
appa1: ... 0

# LD_LIBRARY_PATH=liba2 ldd appa2/appa2
        linux-vdso.so.1 =>  (0x00007fff009fe000)
        liba.so.2 => liba2/liba.so.2 (0x00007f09f8570000)
        libc.so.6 => /lib/libc.so.6 (0x00007f09f820e000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f09f8771000)

# LD_LIBRARY_PATH=liba2 appa2/appa2
appa2: ...
liba2: _libafunc1(2, app.c:10) = -1
appa2: ... -1
appa2: ...
liba2: libafunc2(2) = 0
appa2: ... 0

# LD_LIBRARY_PATH=liba2:libb1:liba2-compat1 ldd appa2b1/appa2b1
        linux-vdso.so.1 =>  (0x00007fff725fe000)
        liba.so.2 => liba2/liba.so.2 (0x00007fa46a096000)
        libb.so.1 => libb1/libb.so.1 (0x00007fa469e95000)
        libc.so.6 => /lib/libc.so.6 (0x00007fa469b33000)
        liba.so.1 => liba2-compat1/liba.so.1 (0x00007fa469932000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fa46a297000)

# LD_LIBRARY_PATH=liba2:libb1:liba2-compat1 appa2b1/appa2b1
appa2: ...
liba2: _libafunc1(21, app.c:11) = -1
appa2: ... -1
appa2: ...
liba2: libafunc2(21) = 0
appa2: ... 0
appa2: ...
libb1: libbfunc1(21) ...
liba2-compat1: libafunc1(21) ...
liba2: _libafunc1(21, liba2-compat1-location) = -1
liba2-compat1: ... -1 => 0
libb1: ... 0
appa2: ... 0




signature.asc (260 bytes) Download Attachment

Re: the sorry saga of the talloc soname 'fix'

by Michael Brown-7 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wednesday 08 July 2009 09:15:52 Sam Liddicott wrote:

> <snip>
>
> And, (here I really do start to repeat myself) we get all of this merely
> by:
>
>    1. preventing the internal promotion of reference to parent
>    2. using a "no_owner" place holder as parent when talloc_free is
>       called while references are still held. This no_owner does not
>       have the power to prevent destruction when the references all go.
>       This no_owner can be stolen from.
>
> The patches for this, including patches to the test suite can be found in:
>    http://lists.samba.org/archive/samba-technical/2009-July/065483.html

Score: +5, Actually solves problem.

I've been following this thread with interest, discovering a world of shared
library pain that I never knew existed, all of which seems tangential to the
real solution of making the API unambiguous.  I wouldn't normally jump into a
discussion on a project I don't actively contribute to, but it would be very
sad to see such an elegant fix drown in the noise.

Michael

Re: the sorry saga of the talloc soname 'fix'

by Jeremy Allison :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wed, Jul 08, 2009 at 06:03:58PM +0200, Stefan (metze) Metzmacher wrote:

> I think talloc, tdb (and later tevent and ldb) are special
> and we should treat them more as standalone libraries with a fixed API.
> And there standalone build systems are independent from samba4.

+1. This is precisely the point I was trying to make.
talloc and tdb are external libraries from Samba now, and
the rules change (IMHO). I think trying to preserve a stable
ABI in this case is very important.

Jeremy

Re: the sorry saga of the talloc soname 'fix'

by Jelmer Vernooij :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

tridge@... wrote:

> Hi Jelmer,
>
>  > FWIW, these issues were fixed in Debian experimental a couple of
>  > months ago, so I would expect Karmic to ship with properly working
>  > packages.
>
> excellent!
>
> If we can get the updated talloc API in too then we'll be in much
> better shape than we are now.
>  
Yeah, there should be enough time for that if such a change was deemed
necessary upstream (sorry for being deliberately vague but I haven't
read all of the huge thread yet).
> One thing I wanted to ask you. If we add a talloc_set_log_fn() call to
> replace the stderr hack I have in there now, then how do I ensure this
> is called from all Samba code? Can we associate an init function with
> lib/util/debug.c ?
>  
We could probably call this function as part of the initialization of
the debug system.

Cheers,

Jelmer

Re: the sorry saga of the talloc soname 'fix'

by Michael Adam-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Sam Liddicott wrote:

> * Michael Adam wrote, On 08/07/09 10:17:
> > Sam Liddicott wrote:
> >  
> >> * Michael Adam wrote, On 08/07/09 09:42:
> >>    
> >>> The remedy for all our problems is to just get rid of
> >>> talloc_reference/talloc_unreference, the source of all
> >>> evil and grief.
> >>>
> >>> Frankly, I personally can't imagine why one would use them at all...
> >>>  
> >>>      
> >> Excellent a joke!
> >
> > In fact, I am not joking. :-)
> >
> > talloc_(un)reference turns the neat, simple tree of memory
> > allocations into a graph with cycles and whatnot. This is
> > the origin of the problems we are currently discussing.
> >
> > And I really think that this additional complexity is avoidable.
> >  
> The additional complexity can be managed quite simply (won't someone
> please ready my solution instead of just arguing about whether or not
> there is a real problem). We can even solve the cycles problem, I have
> talked about it, but if no-one will read this solution there's no point
> in talking about the solution to the cycles.
> > Maybe this is due to the fact that I have up to now mainly worked
> > on the samba3 codebase where there is virtually no use of
> > talloc_reference, as opposed to samba4 which makes rather heavy
> > use of talloc_reference. I myself have never used it and I have
> > never felt the need to do so.
> >
> > And given the problems that this creates, I really cannot see
> > why one would not try to avoid using talloc_reference by all
> > means (only because of sheer laziness.. ;-).
> >  
>
> Of course you can't see why, because as you say you don't use it.
>
> > Sam, please excuse my ignorance, can you give me an example of a
> > situation that you could not have solved without talloc_reference?
> >  
> Isn't the fact that I've debugged the problem, the fact that I've
> written the patches AND the test cases AND tried for around 8 months to
> have someone pay attention to this enough for you to take me seriously?
Sam, I am taking you 100% seriously!

But because I have not really been concerned with the
talloc_reference problem yet, I had not tightly followed the
discussion earlier.

> Now, rather than have the fix reviewed, I'm to have a critical review of
> my code by people who don't use or need to use talloc_reference to see
> if I could possibly have worked around this feature that the talloc API
> offers?

Hey, I am not questioning _your_ use of talloc_reference
especially, but any use.

> Why not just look at the fix? I'm using it, I have the benefitd, I'm
> trying one more time to give it to Samba because could ALSO solves the
> immediate soname bumping issue.

I am aware of the fact that you have raised the problem some time
ago and also proposed a solution and provided patches. I am also
aware of the fact that Tridge basically turned your proposal down
or at least the discussion stopped rather suddenly.

> [I predict that someone will now talk about this rather than about the
> actual value of the suggested fix]

I still think that the value of having reference counting in
talloc is questionable.

But when we want talloc to offer reference counting, it should
not put extra burden onto the users (programmers). This means
that the user should not need to carry any extra knowledge about
the inner workins around and should not need to check any other
potential users of a given allocation context in order to
determine which kind of free should be used. My naive idea is
that it should always be safe to use talloc_free on a context,
no matter if it is an originally allocated context or a
reference. The last free-er really frees.

I think this is exactly the opposit to your solution.
This is what I would wish for quite naively from a caller's
perspective. It might well be that I am ignoring intrinsic
problems here. I have to look at the previous discussions,
at the talloc code and at your patches more thoroughly to
give a more detailed and possibly more qualified answer.
(And I will...)

But this is as much as I can say now.

Cheers - Michael



attachment0 (214 bytes) Download Attachment

Re: the sorry saga of the talloc soname 'fix'

by Kai Blin-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wednesday 08 July 2009 11:30:56 Sam Liddicott wrote:

> > talloc_(un)reference turns the neat, simple tree of memory
> > allocations into a graph with cycles and whatnot. This is
> > the origin of the problems we are currently discussing.
> >
> > And I really think that this additional complexity is avoidable.
>
> The additional complexity can be managed quite simply (won't someone
> please ready my solution instead of just arguing about whether or not
> there is a real problem). We can even solve the cycles problem, I have
> talked about it, but if no-one will read this solution there's no point
> in talking about the solution to the cycles.
I have to admit that I don't understand your proposed solution. I have only
been skimming this (and related) discussions as I've got more important
things on my agenda, but to quote from your recent post on the subject:

-----8<------
Before I explain, some points must be acknowledged:

   1. There exists much code that uses talloc_steal/talloc_free
      ambiguously.
   2. This code needs fixing anyway.
   3. It is being fixed with the aid of tridge's patch
-----8<------

I'm happy to acknowledge this, but it is my understanding that the ambiguity
comes from mixing talloc_reference and the two "culprits".

-----8<------
Before I explain, these points must be understood:

   1. talloc_reference, talloc_reparent, talloc_unlink are pure and holy
      and have zero ambiguity
   2. talloc_steal and talloc_free are said to be ambiguous (herein lies
      the entire problem).
-----8<------

I read this, but as mentioned above I'm not quite sure I understand. This was
the point where I decided that given I don't fulfill the invariants you set
for reading the explanation, reading it is not really that useful.

[...]

> Of course you can't see why, because as you say you don't use it.
>
> > Sam, please excuse my ignorance, can you give me an example of a
> > situation that you could not have solved without talloc_reference?
>
> Isn't the fact that I've debugged the problem, the fact that I've
> written the patches AND the test cases AND tried for around 8 months to
> have someone pay attention to this enough for you to take me seriously?

Sorry, but that's not helpful. I was wondering about the same thing as Michael
here, and I could even claim I'm a Samba4 developer. I've never run into a
problem where the obvious solution to my problem was using references.
Also, "Trust me, I know what I'm talking about" is not very convincing. I'd
trust the senior devs to know what they're talking about, but recent evidence
shows that the trouble is that everybody seems to be talking about different,
contradicting things. An example would help to clarify for people like me,
who don't even fully see what is being talked about.

I've looked at the patches you posted, and none of the test cases answers
Michael's question about a use case for talloc_reference. So for me all that
the fact that you've written the patches AND test cases AND tried for 8
months to have someone pay attention to this enough tells me is that you
really care about talloc_reference semantics. From the tone of your email I
also seem to be some kind of idiot to not care about it.

> Now, rather than have the fix reviewed, I'm to have a critical review of
> my code by people who don't use or need to use talloc_reference to see
> if I could possibly have worked around this feature that the talloc API
> offers?

Not at all. We're just curious what this would be used for. If it's so
complicated that only reading all of your code can illustrate it, then I've
got to admit that I don't care enough. As I've mentioned before, I've got
other things to work on as well. If your use case is very complicated and
there is no easy example, please stop being condescending to people who don't
immediately understand what you're talking about.

Given that I've happily used talloc without ever having to learn about
talloc_reference tells me that it's probably a pretty advanced feature, which
just judging from the constant discussions on using it is very hard to use
correctly. It now seems to me that in order to understand why the calls to
talloc_steal and talloc_free I've used so far are likely to be wrong, because
talloc_steal and talloc_free are bad. However, it seems impossible for me to
tell if my specific use of those calls is bad because I need to wrap my head
around a more complicated model of talloc first. Personally I think if you
need to spend two weeks to learn talloc before being able to start working on
Samba tells me the memory allocation library is too complicated.


> [I predict that someone will now talk about this rather than about the
> actual value of the suggested fix]

There, now I'm annoyed in addition to confused because you've made it
impossible to respond to email without looking even more silly.
Kai
--
Kai Blin
WorldForge developer  http://www.worldforge.org/
Wine developer        http://wiki.winehq.org/KaiBlin
Samba team member     http://www.samba.org/samba/team/
--
Will code for cotton.


signature.asc (196 bytes) Download Attachment

Re: the sorry saga of the talloc soname 'fix'

by Sam Liddicott :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

* Michael Adam wrote, On 09/07/09 10:37:
>
> Hey, I am not questioning _your_ use of talloc_reference
> especially, but any use.
>
>  

It's an interesting discussion, but not one I attempt to address.
I'm just offering a fix to existing talloc functionality.

>> Why not just look at the fix? I'm using it, I have the benefitd, I'm
>> trying one more time to give it to Samba because could ALSO solves the
>> immediate soname bumping issue.
>>    
>
> I am aware of the fact that you have raised the problem some time
> ago and also proposed a solution and provided patches. I am also
> aware of the fact that Tridge basically turned your proposal down
> or at least the discussion stopped rather suddenly.
>  

Yes. Tridge said he couldn't see that it made any difference because he
had one test case where externally it appeared to behave the same as
behaviour without the patch.

>  
>> [I predict that someone will now talk about this rather than about the
>> actual value of the suggested fix]
>>    
>
> I still think that the value of having reference counting in
> talloc is questionable.
>
> But when we want talloc to offer reference counting, it should
> not put extra burden onto the users (programmers). This means
> that the user should not need to carry any extra knowledge about
> the inner workins around and should not need to check any other
> potential users of a given allocation context in order to
> determine which kind of free should be used.

Precisely, and my patch makes it more like this than before.

However talloc offers reference tracking and reference counting.

If you just want reference counting, then talloc_reference(NULL, object)
and talloc_unlink(NULL, object) will add and remove a reference "count".

If you want to use reference tracking then you have to remove the
specific reference that you added.

> My naive idea is
> that it should always be safe to use talloc_free on a context,
> no matter if it is an originally allocated context or a
> reference.
precisely. It is not so with or without tridges patch.

But "safe" doesn't mean it works. If two tracked references have been
made against an object, you can call talloc_free all you like and it
won't make any difference - if it did it would leave dangling pointers
to whoever took the reference. The actual free won't occur until the
references go away.

If you want to be able to keep calling free and have the object go away
while referenced, then they would have to be weak references which I
posted an implementation of recently in responses to one of Ronnie's posts.

> The last free-er really frees.
>
>  

precisely, except talloc_free is no longer the only mechanism to free.

In the olden days "talloc_free" meant "free". Now it means "free when
the other references have all gone away because we don't want dangling
pointers" (the point of references is to prevent dangling pointers to
free memory)

> I think this is exactly the opposit to your solution.
>  

I think it is pretty much what the talloc API promises and what my
solution fixes

> This is what I would wish for quite naively from a caller's
> perspective. It might well be that I am ignoring intrinsic
> problems here. I have to look at the previous discussions,
> at the talloc code and at your patches more thoroughly to
> give a more detailed and possibly more qualified answer.
> (And I will...)
>
> But this is as much as I can say now.
>  

I appreciate your continued attention, busy as you are.

I also want a sane talloc from a users point of view.

Sam

Re: the sorry saga of the talloc soname 'fix'

by tridge@samba.org :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi Sam,

My apologies for not having done a proper analysis of your proposal.

I have just tried to apply your patches to a test tree, but they fail
to apply, and the fixups do not look trivial. Could you send me a
clean copy of talloc.[ch] and testsuite.c for me to look at?

I do have some comments based on your description of the patch though,
although whether those comments are useful depends on my correct
interpretation of your text description of the changes. I always
prefer to work from real code, as descriptions are rarely sufficient,
especially with such a complex change.

ok, so some wild specualation based on your description:

 1) you propose that talloc_free() be allowed with a reference is
 present, but that it should always free the 'original' or 'real'
 parent. That is exactly the opposite behaviour of what talloc has
 done since talloc_reference() was added, so changing it would be
 rather a surprise to existing code. The existing behaviour, before my
 recent changes, was to remove the most recent reference. I am always
 reluctant to completely invert existing behaviour. To prevent
 ambiguous calls is one thing, especially when a clear warning is
 given, but to do the exact opposite of what has been documented seems
 to be asking for trouble. I certainly wouldn't call it a ABI (or
 API) compatible change.

 2) I think, from your description, that talloc_steal would change to
 become a function that sometimes changes the effective reference
 count and sometimes doesn't. Let's look at an example:

    int *p1 = talloc(NULL, int);
    int *p2 = talloc(NULL, int);
    int *p3 = talloc(NULL, int);
    int *c1 = talloc(p1, int);
    int *r1 = talloc_reference(p2, c1);

at this point the effective reference count of c1 is 2. It will take 2
free calls for it to go away.

Now we do this:

    talloc_free(r1);

and if I have understood your patches correctly (a big if, given I
can't run them!) then we end at this point with p1 having a reference
to c1, but c1 also having a new type of pseudo-parent.

At this point c1 has a effective reference count of 1, so a single
free will get rid of it.

Now we do this:

    talloc_steal(p3, r1);

and now, if I have understood your patches correctly, p3 becomes the
'real' parent, but the effective reference count of c1 has increased!
It will now take two free operations for it to go away. So in total we
will need 3 frees to get rid of this pointer if you use a steal, and
two frees if you don't use a steal.

So that means that talloc_steal() now will, sometimes, change the
reference count of a pointer. That is not a property it has had
before, and it is not a desirable property I think.

Have I interpreted the patches correctly, or should I wait until I can
actually run the code?

Cheers, Tridge

Re: the sorry saga of the talloc soname 'fix'

by Stefan (metze) Metzmacher :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Sam Liddicott schrieb:

> Although the soname debate is important (and how we handle soname
> bumping and ABI changes), I want to draw some attention back to the
> talloc fix.
>
> I believe these comments will also provide some relief for the immediate
> soname problem.
>
> * Andrew Bartlett wrote, On 07/07/09 08:10:
>> I think Volker is right, that given we can easily (for some definition
>> of easily) avoid changing the ABI, then it's an easier way out of this
>> mess.  Had you proposed a more radical solution, there would perhaps
>> have been less resistance :-)
>>
>> But I wonder if we really have preserved the ABI.  Given that
>> talloc_free() and talloc_steal() now does a different thing, won't it
>> now allow a use-after-free?  This would be an ABI change (even if not a
>> signature change), and justify the change.
>>
>> This is more than just a cleanup of possible memory leaks, isn't it?
>>  
>
> Andrew; you make a good point.
>
> This change only means that talloc_free merely has a different conflict
> with talloc_reference.
>
> Before this change talloc_reference/talloc_unreference had to be avoided
> for fear of runtime talloc errors.
>
> After this change, talloc_free must be avoided for the same fear.
>
> I don't know about others views, but I don't think this is any kind of fix.
>
> talloc_free is not safe to use, so we may as well get rid of it
> altogether and thus fully justify the .so version bump.
>
>
> However I have a suggestion, which if followed will avoid this problem,
> and the immediate issues of this "sorry soname saga".
>
> I really really think it is the solution to the current drawn out debate.
>
> Really really.
>
> As no-one has /shown/ any evidence of understanding what I've said
> before or of how it helps or even if it makes a difference, I'm offering
> a solution here. (The only objection I ever had to it was from tridge
> who felt that it was no different from what we had before).
I tried (the 2nd patch in your list was written by me).

> Please make the effort to understand what I suggest here, and then
> please make a judgement on whether or not this true talloc tip will
> solve the whole problem including the sorry soname saga in this instance.
>
> [From my own point of view I'm going to let this go now, but it concerns
> me to see this needless long debate, and a new talloc which I cannot
> ship with it's current talloc_free restrictions]
>
> *The true talloc tip...*
>
> Before I explain, some points must be acknowledged:
>
>    1. There exists much code that uses talloc_steal/talloc_free
>       ambiguously.
>    2. This code needs fixing anyway.
>    3. It is being fixed with the aid of tridge's patch
>
> Before I explain, these points must be understood:
>
>    1. talloc_reference, talloc_reparent, talloc_unlink are pure and holy
>       and have zero ambiguity
>    2. talloc_steal and talloc_free are said to be ambiguous (herein lies
>       the entire problem).
>
> These are the features of the solution:
>
>    1. To make talloc_steal and talloc_free unambiguous
>    2. To preserve the utility of talloc
>    3. To preserve the API signatures and ABI
>    4. To avoid a soname bump
>
> To understand the problem these points must be understood:
>
>    1. talloc_steal and talloc_free are like talloc_reparent and
>       talloc_unlink except that they have an implicit "parent" argument
>    2. the current ambiguity comes from knowing which parent to use as
>       the implicit argument
>    3. the solution must remove ambiguity of the implicit argument
>
> The solution is to:
>
>     *make the implicit argument of talloc_steal and talloc_free be the
>     original allocating parent*
>
> if we do this, then there is absolutely no ambiguity [see point 3 below]
>
> Implications of this solution
>
>    1. The solution can remove ambiguity while preserving the API
>       signatures and ABI
>       Really really
>    2. It does NOT make buggy use of talloc_free suddenly become bug free
>       Faulty use of talloc_free and talloc_steal is will still be faulty
>    3. Just because talloc_steal and talloc_free are not ambigious
>       doesn't mean your program is not ambiguous.
>       Use of talloc/talloc_free is no more error-prone than use
>       malloc/free - which doesn't say a lot, but it is a benchmark.
>       If your code is too complicated for malloc/free then you should
>       not be using talloc/talloc_free but talloc/talloc_unlink.
>    4. If you don't want the original allocating parent to have any
>       special treatment then don't use talloc_free/talloc_steal
>       you can't use them now anyway unless you can swear that no-one
>       else is taking a reference to your object
>    5. Arguably we didn't preserve the ABI as we slightly changed the
>       meaning of talloc_steal/talloc_unfree
>       We did remove an ambiguity though, one which caused many uses of
>       talloc_steal/talloc_free to be broken
>
> And, (here I really do start to repeat myself) we get all of this merely by:
>
>    1. preventing the internal promotion of reference to parent
>    2. using a "no_owner" place holder as parent when talloc_free is
>       called while references are still held. This no_owner does not
>       have the power to prevent destruction when the references all go.
>       This no_owner can be stolen from.
>
> The patches for this, including patches to the test suite can be found in:
>    http://lists.samba.org/archive/samba-technical/2009-July/065483.html
>
> The specific "no_owner" patch is the third attachment. Much old code in
> that patch is commented out rather than removed because I was
> anticipating some discussion on the way I implemented it. [I shall
> remove the commented out code in my next rebase anyway as it "works for me"]
None of this 4 patches passes make test in the a standalone build,
so there's nothing that can be judged :-(

I've played with it here (cherry picked, reverted, cherry-picked
combinations of the 4 patches)
http://gitweb.samba.org/?p=metze/samba/wip.git;a=shortlog;h=refs/heads/master4-talloc-sam

It's based on a version of talloc before tridge change it see:
http://gitweb.samba.org/?p=metze/samba/wip.git;a=shortlog;h=refs/heads/master4-talloc-old

I would be really interested in a working patch
(without old and ugly formated comments in it).
Please provide the minimal patch to talloc.c to implement
your desired behavior and then maybe a patch to fix existing tests
(hopefully not needed) and a patch that adds new tests.

I need something that I can apply, test and review without
having to dig through unrelated changes, comments.
And it should actually work at least for the standalone build,
but also in the source4 and source3 'make test' cases.

metze



signature.asc (260 bytes) Download Attachment
< Prev | 1 - 2 - 3 - 4 | Next >