Re: [SCM] Samba Shared Repository - branch master updated - release-4-0-0alpha8-205-g7119241

Re: the sorry saga of the talloc soname 'fix'

by tridge@samba.org

Hi Jelmer,

 > FWIW, these issues were fixed in Debian experimental a couple of
 > months ago, so I would expect Karmic to ship with properly working
 > packages.

excellent!

If we can get the updated talloc API in too then we'll be in much
better shape than we are now.

One thing I wanted to ask you. If we add a talloc_set_log_fn() call to
replace the stderr hack I have in there now, then how do I ensure this
is called from all Samba code? Can we associate an init function with
lib/util/debug.c ?

Cheers, Tridge
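
[Editor's sketch] One way to picture the "init function" question above is
a log hook that is installed before main() runs. The following is only a
minimal sketch under stated assumptions: demo_set_log_fn(), demo_log() and
demo_debug_init() are hypothetical stand-ins (not the talloc or Samba API),
and the constructor attribute is a GCC/Clang extension rather than
something the thread settles on.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the proposed log hook; not the real talloc API. */
typedef void (*demo_log_fn)(const char *message);

static demo_log_fn demo_current_log_fn; /* NULL means no hook installed */

/* Install (or clear, with NULL) the callback that receives messages. */
void demo_set_log_fn(demo_log_fn fn)
{
    demo_current_log_fn = fn;
}

/* What the library would call where it currently writes to stderr. */
void demo_log(const char *message)
{
    assert(message != NULL);
    if (demo_current_log_fn != NULL) {
        demo_current_log_fn(message);
    } else {
        fprintf(stderr, "%s\n", message); /* fallback: the "stderr hack" */
    }
}

/* Hypothetical application-side sink, standing in for a debug subsystem. */
static char demo_last_message[128];

static void demo_debug_log(const char *message)
{
    strncpy(demo_last_message, message, sizeof(demo_last_message) - 1);
    demo_last_message[sizeof(demo_last_message) - 1] = '\0';
}

/* Runs before main() on GCC/Clang, so every program linking this
 * object gets the hook without calling an init function explicitly. */
__attribute__((constructor))
static void demo_debug_init(void)
{
    demo_set_log_fn(demo_debug_log);
}
```

With this in place, a call such as demo_log("bad free") made from main()
ends up in demo_last_message rather than on stderr. Whether a constructor
is acceptable on every platform Samba targets is a separate question.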

Re: the sorry saga of the talloc soname 'fix'

by Volker Lendecke

On Mon, Jul 06, 2009 at 11:10:11PM +1000, tridge@... wrote:
>  > So if we skipped the "location" piece of the change, the
>  > error messages would get a lot less useful, and the only
>  > real change would be to create memory leaks in already
>  > broken code. Right?
>
> yep. I did think about doing that, but I really disliked the idea of
> creating a bunch of very difficult to track down leaks.

These leaks would come along with the not-so-easily-readable
error messages.

From my point of view this is a valid compromise: In the
shipped versions of talloc just warn that something is
wrong. The message should carry a warning that potentially a
leak was created. If a developer wants to reproduce this
[s]he is free to use a "checked build" version of the
application, compiled against the "location"-aware talloc
headers and .so.

This way we don't have to break anything and just increment
the minor version number due to talloc_reparent being added.

Volker


Re: the sorry saga of the talloc soname 'fix'

by Christian Perrier

Quoting Jelmer Vernooij (jelmer@...):

> FWIW, these issues were fixed in Debian experimental a couple of months
> ago, so I would expect Karmic to ship with properly working packages.


That however needs what we have in experimental to go to
unstable.....which, by chance is exactly what's planned very soon, as
I will upload 3.4.0 packages to Debian unstable very soon.

Re: the sorry saga of the talloc soname 'fix'

by tridge@samba.org

Hi Volker,

 > From my point of view this is a valid compromise: In the
 > shipped versions of talloc just warn that something is
 > wrong. The message should carry a warning that potentially a
 > leak was created. If a developer wants to reproduce this
 > [s]he is free to use a "checked build" version of the
 > application, compiled against the "location"-aware talloc
 > headers and .so.

That would mean we're back to the problem of two versions of talloc in
the same binary, one with 'checked build', one without.

 > This way we don't have to break anything and just increment
 > the minor version number due to talloc_reparent being added.

nope, I am not going to let the shared library force me into a
situation where I can't improve talloc when I think that improvement
is worthwhile.

Adding the error messages is a worthwhile improvement. I'm not going
to compromise on code quality over this silly focus on a set of shared
libs used by a handful of apps, all of which currently use the libs in
a completely broken way.

I am also going to continue to make further improvements to talloc as
they come up. If that means we need to change the .so number to 3, 4
or even 37 in the future then so be it. That is what the .so number is
for.

I think the same principle should apply to all the shared libs that
are currently being exported from Samba by various people. We are
certainly going to be making improvements to libtdb, libdcerpc,
libndr*, libldb and all the other libs over the years. To start
compromising on the quality of the code in Samba in order to keep
those libs from changing version number is a classic case of the tail
wagging the dog.

The people who package the libs, such as Simo does for RedHat, are of
course free to put whatever hacks they like in their own versions. If
he wants to add symbol versioning or backwards compatibility hacks
then that is completely up to him. That is part of the life of library
maintainers. It is not reasonable to push those hacks on the upstream
source.

Cheers, Tridge

Re: the sorry saga of the talloc soname 'fix'

by simo-7

[long mail ahead, take your time]

On Mon, 2009-07-06 at 20:21 +1000, tridge@... wrote:
> Hi Simo,
>
> I've now spent all day looking at libtalloc and how it interacts with
> what is currently in Ubuntu Jaunty. I have downloaded a Fedora image
> but haven't yet installed it to see if Fedora is as badly placed as
> Ubuntu is.
>
> The result of my investigation is that libtalloc is a complete mess.

Nope, it's not libtalloc that is a mess, it's bugs in the samba4 build
dependencies that are a mess. libtalloc standalone is fine.

For the record I found that the build problem is present also in the
F-11 samba4-libs package, I am working on fixing the problem in Fedora
too, and will push fixed libraries asap.

Also for the record, I pointed out these build problems no less than 2
SambaXPs ago when I found the horrible mess with the event library
globals. The mess was caused by the fact libevent was statically built
multiple times within samba4 itself so that 2 parts of the same binary
were actually using 2 different sets of libevent symbols duplicating
globals which were no more globals and making debugging almost
impossible.

Then a year ago, when we started working on providing libraries for
openchange, I pointed out to Jelmer that libdcerpc & co again had build
problems, as they were including static copies of talloc, tdb, tevent and
ldb (and in some cases not all symbols were statically built in; also,
they were not made static to the library, so the library was exposing
them as public symbols).

This should have been fixed but apparently there were still some build
issues with alpha7.

> It turns out that with current Ubuntu we cannot completely avoid
> having both the old talloc and the new talloc in the same process at
> the same time. However, if we bump the .so number then at least the
> developers will get a warning about the mess.
>
> I've put together some files for you to look at to give you some idea
> of just how bad this whole mess is. See
>
>   http://samba.org/tridge/talloc_mess/
>
> The files are:
>
>  - a libtalloc 1.2.0 (dev and lib) matching what is currently in Ubuntu
>    Jaunty
>
>  - a libtalloc 1.4.0 (dev and lib) matching what is produced if we
>    followed your suggested course of action
>
>  - a libtalloc 2.0.0 (dev and lib) matching what is produced if we
>    follow my preferred choice of using a new .so number
>
>  - source tar balls with debian build rules for all of the above
>
>  - a sample 'testtalloc' package that demonstrates the problems (deb
>    plus source)
>
> The testtalloc package produces two binaries. One is called test_ldb,
> and it creates a ldb then tries to free it with talloc_free() which is
> about as simple a ldb program as you can have. The other is called
> test_mapi which initialises the MAPI subsystem from openchange then
> uses talloc_report_full() to show the memory that has been used.
>
> I chose these two binaries as they demonstrate different types of
> brokenness in the way that talloc/ldb/mapi/samba/openchange etc have
> all been packaged. For example:
>
>   - The libldb-samba4-0 package provides a libldb.so.0 which has a
>     built in static copy of talloc.
>
>   - the libmapi.so package links to a dynamic libtalloc.so, but also
>     links to libdcerpc.so
>
>   - libdcerpc.so has a statically linked talloc built in
>
>   - etc etc
>
> The same type of brokenness is rife through all the various packages
> that use talloc currently.

I am sorry you wasted time building this testtalloc binary: including
talloc statically in another library is evidently broken, and there is
no need to prove that.

The real problem I see is that you don't recognize that the soname
version is totally orthogonal to this problem.

> If we used the approach you are advocating, then all of these packages
> (ldb, openchange, mapi, samba etc) won't be marked as needing to be
> rebuilt. Yet they will all abort with no error message when you
> actually use them, because they will mix the two incompatible
> ABIs. Try the test_ldb and test_mapi binaries to see the abort.

Sorry Tridge,
but so far you strongly advocated the soname bump because that way you
could install both versions of the library.

Having 2 versions of the library in the same process is exactly the same
kind of brokenness as having a static copy of talloc in a library that
lives in a process that also loads a dynamic version of the library.

> If we use the approach that I prefer, which is to change the .so
> number to 2, then at least the developers get a nice warning like
> this:
>
>   /usr/bin/ld: warning: libtalloc.so.1, needed by /usr/lib/gcc/x86_64-linux-gnu/4.3.3/../../../../lib/libmapi.so, may conflict with libtalloc.so.2
>
> So at least someone gets told that it won't work at build time, which
> gives some hope that it might get fixed.

This will happen only if talloc.so.1 is not available on the system. So
far you advocated having both as a solution to avoid rebuilding all
packages that depend on talloc.

But if you remove talloc.so.1 you will have to either remove all
dependencies, or rebuild all dependencies against the new library.

> If we up the .so number to 2 then you can also see the brokenness by
> looking at the dependencies, because we are explicitly marking the ABI
> as having changed. It is easy to see the brokenness using ldd, or by
> using dpkg.

No, for libraries that compiled talloc statically ldd will tell you
nothing. As for dpkg, you may be lucky if someone explicitly marked
libtalloc as a dependency. But then it depends on how it was done.

Normally manual dependencies are of the form: libtalloc >= 1.2.0

This will not trigger any check in dpkg if you want to install libtalloc
= 2.0.0 as 2.0.0 is > 1.2.0
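
[Editor's sketch] The point about ">=" relations can be illustrated with a
toy version comparison. This is a simplified sketch, not dpkg's actual
comparison algorithm: it assumes plain MAJOR.MINOR.PATCH strings, and the
demo_* function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Parse "MAJOR.MINOR.PATCH" into v[0..2]; returns 1 on success. */
static int demo_parse_version(const char *s, int v[3])
{
    return sscanf(s, "%d.%d.%d", &v[0], &v[1], &v[2]) == 3;
}

/* Compare two dotted versions; returns <0, 0 or >0 like strcmp(). */
int demo_version_cmp(const char *a, const char *b)
{
    int va[3], vb[3];
    int ok = demo_parse_version(a, va) && demo_parse_version(b, vb);
    assert(ok);
    (void)ok;
    for (int i = 0; i < 3; i++) {
        if (va[i] != vb[i]) {
            return va[i] < vb[i] ? -1 : 1;
        }
    }
    return 0;
}

/* A ">= minimum" relation only sets a lower bound. */
int demo_satisfies_at_least(const char *installed, const char *minimum)
{
    return demo_version_cmp(installed, minimum) >= 0;
}

/* A stricter relation that also requires the same major version. */
int demo_satisfies_same_major(const char *installed, const char *minimum)
{
    int vi[3], vm[3];
    int ok = demo_parse_version(installed, vi) && demo_parse_version(minimum, vm);
    assert(ok);
    (void)ok;
    return vi[0] == vm[0] && demo_version_cmp(installed, minimum) >= 0;
}
```

Under these assumptions demo_satisfies_at_least("2.0.0", "1.2.0") is true,
which is why a bare lower bound lets 2.0.0 in, while
demo_satisfies_same_major("2.0.0", "1.2.0") is false.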

> If we don't do this then we're saying "the ABI is the same" when it
> isn't. This is clearly shown by the abort in the test programs above,
> regardless of whether you install the 1.4.0 libtalloc or the 2.0.0
> libtalloc.

No, the abort above doesn't say anything about the ABI, it just tells you
that building talloc statically into those libraries is completely
broken as it is. Whether the dynamic version of talloc is called 1.4.0
or 2.0.0 makes no difference for that kind of brokenness, it's an
orthogonal problem.

> So even with your attempts to make the ABIs more similar by putting
> backward compatibility code into talloc.c we get aborts because the
> internal structures are not compatible (which is nicely caught by
> Metze's patch).

This happens only for the already broken libraries. For sane
binaries/libraries there is no problem at all. Try yourself to build
against libtalloc 1.2.0 and then install libtalloc 1.4.0, the
application will be just fine because the ABI *is* compatible.

>   Your attempts to make the ABIs compatible are not
> enough, and would pollute the code with a lot of cruft that serves no
> purpose, plus it will remove the warnings that developers would
> otherwise get when things are going to go wrong with some of the
> libraries.

No, you are mixing ABI problems with broken libs problems.
If you mix these two things you can argue any solution, but they will
all be equally wrong, those libraries are simply broken.

If you want to argue about what soname version to use you have to use a
non-broken system. In a non-broken system what you have is that all
those libraries depend on libtalloc.so.1 and they do not statically link
copies of talloc.


Please pay attention to the following specific example, because it
explains my perspective on why, on a system with non-broken libraries, a
soname bump requires rebuilding all packages and is more of a problem for
*developers* or people who build their own packages from sources than
for package users.

When you suggested the soname bump you said multiple times that you
wanted to do so because this way you could install both libtalloc.so.1
and libtalloc.so.2 at the same time. That is when I got extremely
worried about all this business and is part of the reason why I proposed
the patch to keep the ABI compatible and not bump the soname.
The reason is quite simple, if you think about the following scenarios
where you have a non-broken system.

---
Situation A)
Talloc with soname = 2.0.0:

Assume we fix a minor bug in ldb and release a new version. A developer
fetches the new ldb and finds out it now requires talloc 2.0.0; he
happily builds and installs talloc.so.2, a new tevent and the new ldb
with fixes.

All builds fine and libldb now depends on libtalloc.so.2

And here comes the problem.  In a non-broken system libdcerpc will tell
the dynamic linker it needs libtalloc.so.1 and libldb.so.
When the linker loads both you will end up with libtalloc.so.1 and
libtalloc.so.2 (via libldb.so) in the same process; aborts will be
everywhere and the developer will not have seen anything at build time
or at program start-up time, as all dependencies are fine.

---
Situation B)
Talloc with soname = 1.4.0:

Assume we fix a minor bug in ldb and release a new version. A developer
fetches the new ldb and finds out it now requires talloc 1.4.0 or the
build fails. He happily builds and installs talloc.so.1 (NOTE: this
overwrites the original talloc library), a new tevent and the new ldb
with fixes.

All builds fine and libldb still depends on libtalloc.so.1

In a non-broken system libdcerpc will tell the dynamic linker it needs
libtalloc.so.1 and libldb.so, when the loader will load both you will
end up only with the new libtalloc.so.1 library. No problems, no aborts
of sorts, all just works as expected. The only issue you may have is
that the old libdcerpc may leak some memory when using the old
interfaces, this is no worse than before, it is actually exactly the
same behavior as before, and will be fixed as soon as libdcerpc is
upgraded (as it will be built against the new talloc).
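
[Editor's sketch] Situations A and B can be modelled as a tiny dependency
walk that counts how many distinct libtalloc sonames end up in one
process. This is a toy model, not how a real dynamic loader is
implemented; the struct and function names are hypothetical, and the
dependency graph is the one described in the two scenarios above.

```c
#include <assert.h>
#include <string.h>

#define DEMO_MAX_LOADED 16
#define DEMO_MAX_NEEDED 4

struct demo_lib {
    const char *soname;                              /* e.g. "libtalloc.so.1" */
    const struct demo_lib *needed[DEMO_MAX_NEEDED];  /* NULL-terminated */
};

struct demo_process {
    const char *loaded[DEMO_MAX_LOADED];
    size_t count;
};

/* Load a library and, recursively, everything it needs; each soname once. */
static void demo_load(struct demo_process *p, const struct demo_lib *lib)
{
    for (size_t i = 0; i < p->count; i++) {
        if (strcmp(p->loaded[i], lib->soname) == 0) {
            return; /* already mapped into the process */
        }
    }
    assert(p->count < DEMO_MAX_LOADED);
    p->loaded[p->count++] = lib->soname;
    for (size_t i = 0; i < DEMO_MAX_NEEDED && lib->needed[i] != NULL; i++) {
        demo_load(p, lib->needed[i]);
    }
}

/* Count loaded sonames of the form "<base>.so.<N>". */
static size_t demo_count_versions(const struct demo_process *p, const char *base)
{
    size_t n = strlen(base), hits = 0;
    for (size_t i = 0; i < p->count; i++) {
        if (strncmp(p->loaded[i], base, n) == 0 &&
            strncmp(p->loaded[i] + n, ".so.", 4) == 0) {
            hits++;
        }
    }
    return hits;
}

/* bump_soname != 0 models situation A, bump_soname == 0 situation B. */
size_t demo_talloc_copies(int bump_soname)
{
    static const struct demo_lib talloc1 = { "libtalloc.so.1", { NULL } };
    static const struct demo_lib talloc2 = { "libtalloc.so.2", { NULL } };
    /* The rebuilt ldb links against whichever talloc was just installed. */
    struct demo_lib ldb = { "libldb.so", { bump_soname ? &talloc2 : &talloc1, NULL } };
    /* The old, not yet rebuilt libdcerpc still asks for libtalloc.so.1. */
    struct demo_lib dcerpc = { "libdcerpc.so", { &talloc1, &ldb, NULL } };
    struct demo_process proc = { { NULL }, 0 };

    demo_load(&proc, &dcerpc);
    return demo_count_versions(&proc, "libtalloc");
}
```

In this model demo_talloc_copies(1) yields two talloc copies in the
process (situation A) and demo_talloc_copies(0) yields one (situation B).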


----

I hope you see the striking difference between bumping and non-bumping
in a non-broken system.

What I want to know is if you understand what it means to have a library
in 2 versions when it is included in so many other libraries.

Do you think that situation A is better or worse than situation B ?

Whether 1.4.0 or 2.0.0 is right or wrong ultimately depends on which of
the 2 situations above we agree is better or worse.

Obviously I think A is much worse than B, but we can discuss the 2
scenarios and come to an agreement.

If we keep the discussion on technical grounds and do not accuse people
of plotting, lying or whatever, and we avoid petty flames on who owns
some piece of code I think we will use our time in a lot more productive
way.


> So Simo, please look at the above examples, then please revert your
> commit. Also, in future, please don't revert a maintainer's commits
> without checking with the maintainer.
>
> Also, Metze, you were right, your abort() check on version really is
> needed, and really does happen with real examples. Thanks!

Yes, we should probably even think of automatically changing the magic at
every build or at the very least at every release (it might include the
version number).
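
[Editor's sketch] The idea of folding the version number into the magic
could look roughly like this. It is only an illustration: the base
constant, the bit layout and the demo_* names are made up for this
example and are not taken from talloc.c.

```c
#include <assert.h>
#include <stdint.h>

/* Arbitrary illustrative base value; the low byte is left free. */
#define DEMO_MAGIC_BASE 0x7a110c00u

/* Fold major/minor into the low byte so each release has its own magic. */
uint32_t demo_magic_for(unsigned major, unsigned minor)
{
    assert(major < 16 && minor < 16);
    return DEMO_MAGIC_BASE + (major << 4) + minor;
}

/* Stand-in for an allocator chunk header that carries the magic. */
struct demo_chunk {
    uint32_t magic;
};

/* Stamp a chunk as allocated by library version major.minor. */
void demo_chunk_init(struct demo_chunk *c, unsigned major, unsigned minor)
{
    c->magic = demo_magic_for(major, minor);
}

/* Returns 1 if the chunk was stamped by the given library version.
 * A real allocator would abort() on mismatch; returning 0 keeps the
 * sketch easy to exercise. */
int demo_chunk_is_ours(const struct demo_chunk *c, unsigned major, unsigned minor)
{
    return c->magic == demo_magic_for(major, minor);
}
```

A chunk stamped by a 1.2 build then fails the check in a 2.0 build, so a
pointer crossing between two copies of the library is caught on first
use instead of corrupting memory silently.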

> To prevent this happening in future we have to stop mixing statically
> linked libraries with shared versions of the same libs. That will mean
> a lot of changes to the way that lots of libs are produced by the
> Samba project and how they are linked into projects like openchange.

These changes have been advocated by me for long, starting more than 2
years ago, for exactly these reasons, glad that finally someone else
realizes this, it only took 2 years ...

> I hope I don't have to spend another day like today tracing shared
> library problems. As I have said several times previously when proposals
> of Samba shared libs come up, getting shared libs right is really
> _really_ hard. We have come nowhere near to getting it right yet, and
> the work required to get it right is quite substantial. I'm not
> volunteering to do the work.

To be honest, no, getting shared libs right is not really hard, it only
requires a bit of care for shared libraries specific needs, and clear
code dependencies. The problem with samba is that we have quite a bit of
spaghetti dependencies, and the only way to make useful libraries is to
untangle some of this code. I've been working slowly with Andrew to try
to unravel some of that, unfortunately I've been busy in the last year
or so, so my progress in this area has been very slow.

I've been working (asking mostly) for 2 years with Jelmer and Metze to
help me fix the samba4 build system so that code wouldn't be statically
linked to libraries, or even (horror) linked multiple times within the
samba4 binaries themselves. Unfortunately the build system is so
complicated that only Jelmer and Metze seem to understand how it works,
so any fix depends on them getting involved, and potentially on the
person who wrote and understands some deep down code to provide hooks or
to break some internal dependency. It's long and tedious work, but not
conceptually hard.


The other issue, also not very hard but important, is that basic
libraries should try to avoid breaking the ABI as much as possible,
because a soname bump is not a solution if different inter-dependent
libraries can be rebuilt at different times. Any incompatible change of
a basic library requires rebuilding all packages linking to the previous
version, and an upgrade of the library. This is a real issue when the
number of libraries and packages dependent on a basic library starts
growing.

Talloc, being a memory allocation library, is one such basic library
(but tdb, ldb, tevent are also there to a lesser extent), that is why I
strongly believe we should do all we can to keep the ABI stable.
If we are not willing to make the promise that we will do *all* we
possibly can to keep library ABIs stable we are going to cause a lot
of problems for all users down the road.
Unfortunately we do not have much choice, we encouraged other people to
use these libraries. Openchange is one of the projects that totally
depends on these libraries, and any change in API/ABI is going to
negatively impact them. I am using them in sssd. Other people have said
or expressed the desire to use them.

So, where do we stand?
Can we take some responsibility not to break our users unless really
necessary ?


Simo.

--
Simo Sorce
Samba Team GPL Compliance Officer <simo@...>
Principal Software Engineer at Red Hat, Inc. <simo@...>


Re: the sorry saga of the talloc soname 'fix'

by tridge@samba.org

Hi Simo,

If you want the binary backward compatibility stuff in your RedHat
packages of talloc, then please add it there. Keep it for as many
years as you think it is needed. Maintain those changes in whatever
way you want.

I will try to keep API compatibility in lib/talloc in Samba as far as
is reasonably possible without compromising the code. I don't promise
the API won't ever change. I certainly don't promise the ABI won't
change.

A huge proportion of the Samba codebase is now being exported as
shared libs and used by projects like OpenChange. I'm delighted that
they are using it, and I'm sure they will be able to continue to use
it in the future without us having to commit to no ABI changes across
all the code they are using.

A lot of the changes we make to pidl end up breaking the ABI. Each
time we edit an IDL file we break the ABI. Numerous tdb changes over
the years have broken the ABI. If we had committed to the ABI never
changing on those a few years back we'd now be up to our necks in
backwards compatibility hacks.

If you think that is the only way we can do Samba libs then I
disagree, and I'd also say that if it was the price, then it is not a
price the Samba project should pay. Shared libs are a nice sideline,
they are not the core purpose of the Samba project. They are the
responsibility of those who decide to produce the shared libs. I
admire the efforts of those who choose to do it, but I do not accept
that the rest of the project must be driven by the imperatives of
those shared library efforts.

 > This will happen only if talloc.so.1 is not available on the system, So
 > far you advocated having both as a solution to avoid rebuilding all
 > packages that depend on talloc.

nope, you only get this warning from the loader when both
libtalloc.so.1 and libtalloc.so.2 are on the system.

 > > If we up the .so number to 2 then you can also see the brokenness by
 > > looking at the dependencies, because we are explicitly marking the ABI
 > > as having changed. It is easy to see the brokenness using ldd, or by
 > > using dpkg.
 >
 > No for libraries that compiled talloc statically ldd will tell you
 > nothing.

It does tell you what is wrong. The loader gives you the warning I
showed, then when you run ldd on the binary and the library that the
warning pointed you at it shows you that you have a binary that
depends on two incompatible versions of libtalloc.

We also get a runtime abort. Once we have talloc_set_log_fn() then we
should make that abort print a useful message as well.

Cheers, Tridge

Re: the sorry saga of the talloc soname 'fix'

by Volker Lendecke

On Tue, Jul 07, 2009 at 01:17:20PM +1000, tridge@... wrote:
>  > This way we don't have to break anything and just increment
>  > the minor version number due to talloc_reparent being added.
>
> nope, I am not going to let the shared library force me into a
> situation where I can't improve talloc when I think that improvement
> is worthwhile.

If we were talking about a significant improvement, I would
be with you. But IMO improved error messages don't justify
this.

Volker


Re: the sorry saga of the talloc soname 'fix'

by tridge@samba.org

Hi Volker,

 > If we were talking about a significant improvement, I would
 > be with you. But IMO improved error messages don't justify
 > this.

If you had spent time trying to track down the problems that this
patch fixed without the use of those error messages (as I tried to do
at first), then you might rate the importance of the messages.

We've fixed a lot of bugs thanks to those new error messages, I
have thus become rather fond of them!

Cheers, Tridge

Re: the sorry saga of the talloc soname 'fix'

by Sam Liddicott

tridge wrote:

> Hi Volker,
>
>  > If we were talking about a significant improvement, I would
>  > be with you. But IMO improved error messages don't justify
>  > this.
>
> If you had spent time trying to track down the problems that this
> patch fixed without the use of those error messages (as I tried to do
> at first), then you might rate the importance of the messages.
>
> We've fixed a lot of bugs thanks to those new error messages, I
> have thus become rather fond of them!
>  
I think they are great, and the tracking of point-of-reference in talloc
is one of its distinct advantages.

In this case the end-user has already had the advantages of this change
by means of the bugs you were able to discover.

The benefit being already delivered, shipping this change will now bring
mostly pain to the end user.

I think this is the point at which we would start a new branch of
talloc, but sharing the same repo as the rest of Samba makes it hard.

Sam

Re: the sorry saga of the talloc soname 'fix'

by Andrew Bartlett

On Tue, 2009-07-07 at 15:46 +1000, tridge@... wrote:

> Hi Volker,
>
>  > If we were talking about a significant improvement, I would
>  > be with you. But IMO improved error messages don't justify
>  > this.
>
> If you had spent time trying to track down the problems that this
> patch fixed without the use of those error messages (as I tried to do
> at first), then you might rate the importance of the messages.
>
> We've fixed a lot of bugs thanks to those new error messages, I
> have thus become rather fond of them!

I think Volker is right, that given we can easily (for some definition
of easily) avoid changing the ABI, then it's an easier way out of this
mess.  Had you proposed a more radical solution, there would perhaps
have been less resistance :-)

But I wonder if we really have preserved the ABI.  Given that
talloc_free() and talloc_steal() now do a different thing, won't it
now allow a use-after-free?  This would be an ABI change (even if not a
signature change), and justify the change.

This is more than just a cleanup of possible memory leaks, isn't it?

Andrew Bartlett

--
Andrew Bartlett
http://samba.org/~abartlet/
Authentication Developer, Samba Team           http://samba.org
Samba Developer, Cisco Inc.


Re: the sorry saga of the talloc soname 'fix'

by Andrew Bartlett

On Tue, 2009-07-07 at 07:43 +0100, Sam Liddicott wrote:

> tridge wrote:
> > Hi Volker,
> >
> >  > If we were talking about a significant improvement, I would
> >  > be with you. But IMO improved error messages don't justify
> >  > this.
> >
> > If you had spent time trying to track down the problems that this
> > patch fixed without the use of those error messages (as I tried to do
> > at first), then you might rate the importance of the messages.
> >
> > We've fixed a lot of bugs thanks to those new error messages, I
> > have thus become rather fond of them!
> >
> I think they are great, and the tracking of point-of-reference in talloc
> is one of its distinct advantages.
>
> In this case the end-user has already had the advantages of this change
> by means of the bugs you were able to discover.
>
> The benefit being already delivered, shipping this change will now bring
> mostly pain to the end user.

Except the benefit is not already delivered, and I'm still finding new
issues.  The remaining ones will be the painful bugs, only found in the
real world, that I know we all love to try and debug on the unmodified
binaries of a customer site.

Andrew Bartlett

--
Andrew Bartlett
http://samba.org/~abartlet/
Authentication Developer, Samba Team           http://samba.org
Samba Developer, Cisco Inc.


Re: the sorry saga of the talloc soname 'fix'

by Stefan (metze) Metzmacher

tridge@... schrieb:

> Hi Simo,
>
> If you want the binary backward compatibility stuff in your RedHat
> packages of talloc, then please add it there. Keep it for as many
> years as you think it is needed. Maintain those changes in whatever
> way you want.
>
> I will try to keep API compatibility in lib/talloc in Samba as far as
> is reasonably possible without compromising the code. I don't promise
> the API won't ever change. I certainly don't promise the ABI won't
> change.

I'll try to explore some alternatives like compat libraries as we've
discussed while deciding not to use symbol versioning for libwbclient.

Maybe we can change the so name to .so.2 and provide a .so.1 which
relies on the .so.2 one and doesn't cause duplicate symbols when
they're loaded together.
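
[Editor's sketch] A compat library of this kind would essentially be a
set of thin forwarders with no allocator state of their own. The
following is a rough sketch of that forwarding idea only, kept in a
single file; the demo_free_v1()/demo_free_v2() names and signatures are
hypothetical, and how the two halves would be split across a .so.1 and a
.so.2 is a build-system question not covered here.

```c
#include <assert.h>
#include <stdlib.h>

/* Records the most recent caller location, for illustration only. */
static const char *demo_last_location;

/* "v2" entry point: hypothetical new signature that also takes the
 * caller's location for better error messages. */
int demo_free_v2(void *ptr, const char *location)
{
    assert(location != NULL);
    demo_last_location = location;
    free(ptr);
    return 0;
}

/* "v1" compat entry point: keeps the old signature, holds no state of
 * its own, and forwards to the single real implementation. */
int demo_free_v1(void *ptr)
{
    return demo_free_v2(ptr, "unknown location (v1 compat caller)");
}
```

Because the v1 wrapper owns no data structures, old and new callers end
up sharing one allocator implementation instead of two incompatible ones.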

metze



Re: the sorry saga of the talloc soname 'fix'

by tridge@samba.org

Hi Andrew,

 > I think Volker is right, that given we can easily (for some definition
 > of easily) avoid changing the ABI, then it's an easier way out of this
 > mess.

The whole argument of not changing the ABI is based on saving our
'users'. As I have demonstrated, all of our users are completely
broken whatever happens. All of them mix in static and dynamic
versions of talloc. The only place it is a bit saner is in debian
experimental, and the whole point of experimental is that it breaks
all the time - it offers no promises whatsoever.

There is nothing I can do to fix that. There is no point in taking on
an extra maintenance burden for zero benefit, and I fundamentally
reject the idea of the core of Samba taking on extra maintenance
burdens like this due to the needs of the shared libs. That approach
will lead to us drowning in ABI hacks very quickly.

 > This is more than just a cleanup of possible memory leaks, isn't it?

right - it is fixing a fundamental bug in the talloc API. Keeping the
old semantics for the existing 'users' is the approach Simo has
taken. I don't _want_ the old semantics to be kept, as they are
broken.

The 'users' need to recompile anyway, as they are all mixing static
and dynamic linking. At least with a soname bump they get a warning
from the linker if there is some lib they use that is not recompiled.

So no, I am not going to accept this patch from Simo. I still want him
to revert his patch. If he doesn't then I will revert it instead.

Cheers, Tridge

Why shared libraries have version numbers (was: the sorry saga ...)

by David Collier-Brown-2

  The purpose of shared library version numbers is to allow
controlled, continuous change.  The implementer of
the called functions produces a new implementation with
some changed APIs or ABIs, but the consumers of the
old API/ABI continue to work without change.

  At some time convenient to them, they change to the new
API calls and the new library, test it and continue.

  This is entirely consistent with what Tridge is doing:
I've bumped versions of apptrace(8) code with wild
abandon, and the consumers of it don't care, until
such time as they choose to upgrade and get a change
database (a flat file containing sed and m4 scripts)
from me and do the upgrade.

  This is nothing new: it dates back to Multics, where
I learned it at about the time Unix v7 was coming out.
See also http://www.multicians.org/stachour.html and
http://datacenterworks.com/stories/port.html

--dave
 

tridge@... wrote:

> Hi Andrew,
>
>  > I think Volker is right, that given we can easily (for some definition
>  > of easily) avoid changing the ABI, then it's an easier way out of this
>  > mess.
>
> The whole argument of not changing the ABI is based on saving our
> 'users'. As I have demonstrated, all of our users are completely
> broken whatever happens. All of them mix in static and dynamic
> versions of talloc. The only place it is a bit saner is in debian
> experimental, and the whole point of experimental is that it breaks
> all the time - it offers no promises whatsoever.
>
> There is nothing I can do to fix that. There is no point in taking on
> an extra maintenance burden for zero benefit, and I fundamentally
> reject the idea of the core of Samba taking on extra maintenance
> burdens like this due to the needs of the shared libs. That approach
> will lead to us drowning in ABI hacks very quickly.
>
>  > This is more than just a cleanup of possible memory leaks, isn't it?
>
> right - it is fixing a fundamental bug in the talloc API. Keeping the
> old semantics for the existing 'users' is the approach Simo has
> taken. I don't _want_ the old semantics to be kept, as they are
> broken.
>
> The 'users' need to recompile anyway, as they are all mixing static
> and dynamic linking. At least with a soname bump they get a warning
> from the linker if there is some lib they use that is not recompiled.
>
> So no, I am not going to accept this patch from Simo. I still want him
> to revert his patch. If he doesn't then I will revert it instead.
>
> Cheers, Tridge
>
>  


--
David Collier-Brown,         | Always do right. This will gratify
System Programmer and Author | some people and astonish the rest
davecb@...           |                      -- Mark Twain
(416) 223-8968


Re: Why shared libraries have version numbers (was: the sorry saga ...)

by Volker Lendecke

On Tue, Jul 07, 2009 at 10:42:02AM -0400, David Collier-Brown wrote:
>   The purpose of shared library version numbers is to allow
> controlled, continuous change.  The implementer of
> the called functions produces a new implementation with
> some changed APIs or ABIs, but the consumers of the
> old API/ABI continue to work without change.

How do you cope with a program that requires version .2 that
also links in a library that requires version .1?

Volker


Re: Why shared libraries have version numbers

by David Collier-Brown-2

Volker Lendecke wrote:

> On Tue, Jul 07, 2009 at 10:42:02AM -0400, David Collier-Brown wrote:
>  
>>   The purpose of shared library version numbers is to allow
>> controlled, continuous change.  The implementer of
>> the called functions produces a new implementation with
>> some changed APIs or ABIs, but the consumers of the
>> old API/ABI continue to work without change.
>>    
>
> How do you cope with a program that requires version .2 that
> also links in a library that requires version .1?
>
> Volker
>  
   You speak unkindly to the programmer (;-))

  Joking aside, if the program in question is very large, like
samba itself, you might have to use a tactic like running both
samba 3 and samba 4 as cooperating processes.  If it's smaller, I'd try
backing out to the previous version by looking in old GIT
versions and adding the old code back with #ifdefs.

  Having competing versions suggests that either two people were
working at cross-purposes, or someone accidentally did only half
the job of upgrading to the new version.  This should fail utterly on
Solaris, by the way: if it doesn't, could you tell me and I'll speak to
the ld.so developers, who used to be my office neighbors.

--dave



Re: Why shared libraries have version numbers

by Volker Lendecke

On Tue, Jul 07, 2009 at 11:56:10AM -0400, David Collier-Brown wrote:
> > How do you cope with a program that requires version .2 that
> > also links in a library that requires version .1?
> >
> > Volker
> >  
>    You speak unkindly to the programmer (;-))

Sorry, I was not aware of the rudeness of my language. I'm
not a native speaker, sorry.

Volker


Re: Why shared libraries have version numbers

by David Collier-Brown-2

   I was being humorous: your English is perfectly fine!

   I was suggesting the first thing you do is tell the programmer
using two versions of the same library to not do that!

--dave

Volker Lendecke wrote:

> On Tue, Jul 07, 2009 at 11:56:10AM -0400, David Collier-Brown wrote:
>  
>>> How do you cope with a program that requires version .2 that
>>> also links in a library that requires version .1?
>>>
>>> Volker
>>>  
>>>      
>>    You speak unkindly to the programmer (;-))
>>    
>
> Sorry, I was not aware of the rudeness of my language. I'm
> not a native speaker, sorry.
>
> Volker
>  




Re: the sorry saga of the talloc soname 'fix'

by Steve Langasek

On Tue, Jul 07, 2009 at 01:17:20PM +1000, tridge@... wrote:
> The people who package the libs, such as Simo does for RedHat, are of
> course free to put whatever hacks they like in their own versions. If
> he wants to add symbol versioning or backwards compatibility hacks
> then that is completely up to him. That is part of the life of library
> maintainers. It is not reasonable to push those hacks on the upstream
> source.

Symbol versioning is not a hack.  Shared library implementations that lack
support for symbol versioning are deficient because of precisely the problem
being described in this thread.  Use of symbol versions should be the
standard for all shared libraries on systems that support them, and should
be *mandatory* for any libraries which are used by other libraries.

Which means that the symbol versioning belongs in the upstream library, not
pushed down on the packagers where a dozen different distributions will have
to reimplement the same thing, with a high risk of introducing binary
incompatibilities (by picking different symbol version names).  If it comes
to it, I will gladly coordinate with Simo to ensure Debian and Ubuntu wind
up with the same symbol versioning implementation as Red Hat, but symbol
versions are part of a library's ABI and are best dealt with upstream, not
by packagers.
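
To make the idea concrete, here is a toy model in plain C of the lookup
that symbol versioning performs. It is not the real mechanism: actual
ELF symbol versioning is handled by the toolchain and the dynamic
loader, not by application code. All names here (demo_pad, DEMO_1,
DEMO_2, resolve_symbol) are hypothetical. The sketch only shows why two
callers bound to different versions of the same symbol name can coexist
in one process.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*demo_fn)(int);

/* Two incompatible implementations of one hypothetical function. */
int demo_pad_v1(int n) { return n; }      /* old behaviour */
int demo_pad_v2(int n) { return n + 8; }  /* new behaviour */

struct versioned_symbol {
    const char *name;     /* symbol name as the caller spells it */
    const char *version;  /* hypothetical version tag */
    demo_fn fn;           /* implementation bound to that pair */
};

static const struct versioned_symbol symbol_table[] = {
    {"demo_pad", "DEMO_1", demo_pad_v1},
    {"demo_pad", "DEMO_2", demo_pad_v2},
};

/*
 * Look up a symbol by (name, version). Returns NULL when no entry
 * matches both, which models an unresolvable versioned reference.
 */
demo_fn resolve_symbol(const char *name, const char *version)
{
    assert(name != NULL && version != NULL);

    size_t count = sizeof(symbol_table) / sizeof(symbol_table[0]);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(symbol_table[i].name, name) == 0 &&
            strcmp(symbol_table[i].version, version) == 0) {
            return symbol_table[i].fn;
        }
    }
    return NULL;
}
```

Because the version tag is part of the lookup key, every distribution
has to use the same tags for binaries to stay interchangeable, which is
the argument for defining them once upstream.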

--
Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                   to set it on, and I can move the world.
Ubuntu Developer                                    http://www.debian.org/
slangasek@...                                     vorlon@...

Re: the sorry saga of the talloc soname 'fix'

by Steve Langasek

On Mon, Jul 06, 2009 at 11:45:32PM -0400, simo wrote:

> > If we up the .so number to 2 then you can also see the brokenness by
> > looking at the dependencies, because we are explicitly marking the ABI
> > as having changed. It is easy to see the brokenness using ldd, or by
> > using dpkg.

> No, for libraries that compiled talloc statically, ldd will tell you
> nothing. As for dpkg you may be lucky if someone explicitly marked
> libtalloc as a dependency. But then it depends on how it was done.

> normally manual dependencies are of the form: libtalloc >= 1.2.0

Er, definitely not.  The convention with dpkg-based distributions is to
include the sover as part of the package name; there is no 'libtalloc'
package in Debian, only 'libtalloc1'.

But this does *not* solve the problem that libtalloc1 and libtalloc2 may be
installed, and the dependency tree for a given binary may include both of
them.  We can make libtalloc2 Conflict: with libtalloc1 to force an atomic
upgrade of all related packages, but symbol versioning is the more elegant
solution.
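
A rough sketch of how the mixed case could be spotted, assuming you
already have the list of sonames in a binary's dependency tree (for
example, collected by hand from ldd output). The helpers split_soname
and has_mixed_majors are hypothetical names, and the parsing assumes
sonames of the form "<stem>.so.<major>". As noted above, this cannot
see a copy of talloc that was linked in statically.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Split a soname such as "libtalloc.so.1" into its stem ("libtalloc")
 * and major number (1). Returns false if the name has no
 * ".so.<digits>" part or the stem does not fit in the buffer.
 */
bool split_soname(const char *soname, char *stem, size_t stem_size,
                  long *major)
{
    assert(soname != NULL && stem != NULL && major != NULL);

    const char *marker = strstr(soname, ".so.");
    if (marker == NULL || marker == soname) {
        return false;
    }
    size_t stem_len = (size_t)(marker - soname);
    if (stem_len >= stem_size) {
        return false;
    }
    const char *digits = marker + 4;  /* skip past ".so." */
    if (!isdigit((unsigned char)*digits)) {
        return false;
    }
    memcpy(stem, soname, stem_len);
    stem[stem_len] = '\0';
    *major = strtol(digits, NULL, 10);
    return true;
}

/*
 * Return true if any library stem appears in `sonames` with two
 * different major numbers. Names that do not parse are skipped.
 */
bool has_mixed_majors(const char *const *sonames, size_t count)
{
    char stem_a[128], stem_b[128];
    long major_a = 0, major_b = 0;

    for (size_t i = 0; i < count; i++) {
        if (!split_soname(sonames[i], stem_a, sizeof(stem_a), &major_a)) {
            continue;
        }
        for (size_t j = i + 1; j < count; j++) {
            if (split_soname(sonames[j], stem_b, sizeof(stem_b), &major_b) &&
                strcmp(stem_a, stem_b) == 0 && major_a != major_b) {
                return true;
            }
        }
    }
    return false;
}
```

Passing a list containing both "libtalloc.so.1" and "libtalloc.so.2"
makes has_mixed_majors return true, which is the situation a Conflicts:
relationship or symbol versioning is meant to deal with.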
