APT2, APT2, where are you?



by Julian Andres Klode

APT2, APT2, where are you?
================================

This e-mail gives an overview of my recent thoughts and activities on
APT2. Yes, I know, it's progressing very slowly.


Repositories
------------
While working on APT2, I looked at three repository formats, each for a
different package manager: Debian, RPM, Slackware. From what I have
seen, I can conclude that there are many similarities in the data.

First of all, each of these repositories contains some kind of
meta-index, which is optionally signed. For Debian, this is Release (or
the new InRelease files); for RPM it is repomd.xml; and for Slackware
one can use CHECKSUMS.md5.

Secondly, all repositories provide some kind of package index. For
Debian, these are the Packages files; for RPM it is primary.xml; and
for Slackware it is PACKAGES.TXT. In the common dists/ and pool/
structure, Debian package indexes are split by architecture and into
multiple components. The package manager can thus restrict the needed
files to the current architectures and the requested components. RPM
package indexes are shared by all architectures, and there is no
concept of components. For Slackware repositories, we can assume that
there is only one architecture.

The package indexes provide us with a list of packages, their versions,
dependencies, provided features and a human-readable description. For
Slackware, dependency support is completely optional and not even
supported by the distribution itself (by design). RPM package indexes
also provide file lists for certain files in the packages (e.g.
configuration files).

Thirdly, there are source indexes. For Debian, they are like the
package indexes, but not architecture-specific. SRPM repositories are
exactly like RPM ones; we just have to deal with the source packages at
extraction time. Slackware is harder: there is a CHECKSUMS.md5 file in
the sources directory, and we may be able to calculate the location of
a source using it and the PACKAGES.TXT file.

Fourthly, there are file lists. Debian has per-architecture
Contents-*.gz files, RPM has filelists.xml and Slackware has MANIFEST
files in the subdirectories.

One last word on Slackware: It might make sense to use this
components/section approach of Debian and apply it to Slackware as
well, treating each directory (extras, slackware) as one section.

All in all we have four elements:
    Metaindex
    Package indexes
    Source indexes
    File indexes
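
As a rough illustration of the four elements, the file names mentioned
above can be collected into one lookup table. This is only a sketch: the
dictionary layout and the helper name files_for are made up for this
example, and "Sources" is the usual name of the Debian source index
rather than something stated above.

```python
# Illustrative table: which repository file plays which role.
# The structure and names here are assumptions, not APT2 code.
REPOSITORY_ELEMENTS = {
    "debian": {
        "metaindex": ["Release", "InRelease"],
        "package_index": ["Packages"],
        "source_index": ["Sources"],
        "file_index": ["Contents-*.gz"],
    },
    "rpm": {
        "metaindex": ["repomd.xml"],
        "package_index": ["primary.xml"],
        "source_index": ["primary.xml"],  # SRPM repositories use the same format
        "file_index": ["filelists.xml"],
    },
    "slackware": {
        "metaindex": ["CHECKSUMS.md5"],
        "package_index": ["PACKAGES.TXT"],
        "source_index": ["CHECKSUMS.md5"],  # combined with PACKAGES.TXT
        "file_index": ["MANIFEST"],
    },
}


def files_for(repo_format, element):
    """Return the file names that provide `element` for `repo_format`."""
    return REPOSITORY_ELEMENTS[repo_format][element]
```

A front end could then ask for files_for("rpm", "metaindex") without
knowing anything else about the repository format.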

Resolving dependencies, etc.
----------------------------

When resolving dependencies, we need 3 pieces of information:
    (a) the list of installed packages; reported by the low-level
        package manager.
    (b) the available packages, as reported by the repositories.
    (c) the request, i.e. which packages to install or remove.

From this information, we can create a changeset. A changeset includes
all actions which have to be done to satisfy the request. It can thus
be expressed using the same data structures as a request, i.e. a tuple
(package, version, type of comparison, action). There is no need to
modify any kind of cache like in APT; instead we simply carry out the
actions and reload the list of installed packages afterwards.
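
A minimal sketch of such a changeset, assuming that every entry of a
finished changeset names an exact version (comparison "=") and that the
only actions are "install" and "remove". The class and function names
are hypothetical; the function only models what the installed list
would look like after the actions were carried out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One (package, version, type of comparison, action) tuple."""
    package: str
    version: str
    comparison: str  # assumed to be "=" in a finished changeset
    action: str      # assumed to be "install" or "remove"


def apply_changeset(installed, changeset):
    """Return the set of (package, version) pairs after all actions."""
    result = set(installed)
    for change in changeset:
        if change.comparison != "=":
            raise ValueError("changeset entries must name exact versions")
        key = (change.package, change.version)
        if change.action == "install":
            result.add(key)
        elif change.action == "remove":
            result.discard(key)
        else:
            raise ValueError("unknown action: " + change.action)
    return result
```

An upgrade of foo from 1.0 to 1.1 is then simply a "remove" of
(foo, 1.0) followed by an "install" of (foo, 1.1).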

APT treats a package as a set of versions. This kind of handling has
problems when it comes to things like multi-arch and multiple versions
installed at the same time (e.g. in RPM). Thus we treat each version as
a single package; and can allow multiple versions or packages from
multiple architectures to be installed at the same time.
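
To illustrate the difference, a package can be identified by the whole
triple of name, version and architecture instead of by name alone. The
type name PackageId and the version strings below are invented for the
example.

```python
from collections import namedtuple

# Each (name, version, architecture) triple is a package of its own.
PackageId = namedtuple("PackageId", ["name", "version", "architecture"])

installed = {
    PackageId("libc6", "2.10.1-7", "amd64"),
    PackageId("libc6", "2.10.1-7", "i386"),     # same version, other arch
    PackageId("kernel", "2.6.31-1", "x86_64"),
    PackageId("kernel", "2.6.32-1", "x86_64"),  # two versions side by side
}


def installed_instances(packages, name):
    """Return all installed packages sharing `name`, in sorted order."""
    return sorted(p for p in packages if p.name == name)
```

With this model nothing special is needed for multi-arch or for
parallel kernel versions: they are just different members of the set.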

External dependency solvers will be supported. I had an e-mail
conversation with zack on this topic, and he asked me whether this
would be possible. I'm just waiting for his final proposal on this
topic. This should probably be coordinated with cupt as well, so we can
share external solvers. The exchange format could be CUDF[0].

Caching
-------
The mmap()'able binary cache in APT has become a problem with growing
repository sizes because it is practically not possible to resize it on
the fly. APT gained support for using mremap(), but this does not work
in practice. You also have to work around all the pointers, converting
them to offsets relative to the beginning of the file when storing them
and back to positions in memory when using them. That's why I don't
plan to use such a cache format.

Instead, the cache format should be a subset of the Debian package
information files, which only includes basic information needed to
resolve dependencies. This solution is still multiple times faster than
using no such cache at all.
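
A sketch of how such a reduced cache could be produced from a Packages
file. Which fields count as "basic information needed to resolve
dependencies" is an assumption here (see RESOLVER_FIELDS), and the
function names are hypothetical.

```python
# Assumed set of fields a resolver needs; not a decided APT2 list.
RESOLVER_FIELDS = ("Package", "Version", "Architecture", "Depends",
                   "Pre-Depends", "Conflicts", "Provides")


def parse_stanzas(text):
    """Yield one dict per blank-line-separated stanza."""
    stanza, last = {}, None
    for line in text.splitlines():
        if not line.strip():
            if stanza:
                yield stanza
            stanza, last = {}, None
        elif line[0] in " \t" and last is not None:
            # Continuation line: keep it attached to the previous field.
            stanza[last] += "\n " + line.strip()
        else:
            key, _, value = line.partition(":")
            last = key.strip()
            stanza[last] = value.strip()
    if stanza:
        yield stanza


def reduce_for_cache(text):
    """Return the same stanzas, restricted to RESOLVER_FIELDS."""
    reduced = []
    for stanza in parse_stanzas(text):
        kept = [f"{key}: {stanza[key]}" for key in RESOLVER_FIELDS
                if key in stanza]
        reduced.append("\n".join(kept))
    return "\n\n".join(reduced) + "\n"
```

The output is still a valid tag file, so the same parser can read both
the full index and the cache.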

Handling file acquisition
--------------------------
File acquisition could be handled by multiple worker threads, with the
workers written as shared libraries and loaded using GModule. The
question here is how we shall handle graphical platforms (asynchronous
methods?) and how to know which module supports which protocol. For the
latter problem, there are two ways:

    (a) identify the protocol using the filename, e.g. libhttp.so
    (b) identify the protocol using information contained in the
        library, and have optional priority, e.g.:

            struct protocol {
                string protocol;  // protocol handled by the module
                int priority;     // optional priority of the module
            }
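
Option (b) could be sketched as a small registry that asks each module
for its protocol information and then picks the module with the highest
priority. This is written in Python for brevity; the registry class, the
module file names and the rule "higher priority wins" are assumptions,
not part of the proposal above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protocol:
    """Mirror of the struct above: protocol name plus optional priority."""
    protocol: str
    priority: int = 0


class WorkerRegistry:
    """Maps protocols to worker modules (hypothetical helper)."""

    def __init__(self):
        self._workers = {}  # protocol name -> list of (priority, module)

    def register(self, module_name, info):
        self._workers.setdefault(info.protocol, []).append(
            (info.priority, module_name))

    def lookup(self, uri):
        """Return the best module for `uri`, or None if none is known."""
        scheme = uri.split(":", 1)[0]
        candidates = self._workers.get(scheme)
        if not candidates:
            return None
        # Assumed rule: the highest priority wins; ties keep the
        # module that was registered first.
        return max(candidates, key=lambda entry: entry[0])[1]
```

Compared to option (a), two modules can then offer the same protocol
without fighting over one file name.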


Handling platform-specific stuff
--------------------------------
We can use modules to provide implementations of abstract classes which
provide common functionality like updating repository data. But we
might also want to allow developers to write platform-specific
programs. How should this be done if there is no access to those
classes?

Alternatively, we could link in the platform-specific parts and only
allow one platform in one installation. This has the advantage of
providing the specific parts, but the disadvantage that you cannot use
APT2 on distribution X to create a chroot of distribution Y which uses
a different package manager.

Another possible option would be to export the specific parts into
libraries named e.g. libapt-debian and libapt-rpm and install them to
/usr/lib.

Target platforms for APT2
-------------------------
The primary target platform of APT2 is Debian and distributions derived
from it. Once this platform works, support for others may be added.

Communicating messages and errors
---------------------------------
The libapt library will use the logging facilities provided by GLib to
output information. Errors will be handled by GError where useful;
otherwise they will be sent to the log at level CRITICAL and the
function returns false/null.

Applications have to set up the display of the logging domain "apt",
e.g. by printing it on the screen. It's their task to format the
messages in an appropriate manner. There may be a library "libapt-gtk"
providing widgets for graphical applications.

An exception to this rule could be added for the acquire subsystem,
which could store the error message in a field of the item.
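
The division of labour described here can be imitated with Python's
standard logging module, used below purely as an analogy for the GLib
facilities: the library logs to a logger named "apt" and returns None
on failure, while the application decides how messages are displayed.
The function parse_priority and the handler class are hypothetical.

```python
import logging

# Library side: log to the "apt" domain, never print directly.
log = logging.getLogger("apt")


def parse_priority(text):
    """Return the priority as int, or None after logging at CRITICAL."""
    try:
        return int(text)
    except ValueError:
        log.critical("invalid priority: %r", text)
        return None


# Application side: decide how messages of the "apt" domain are shown.
class ListHandler(logging.Handler):
    """Collects formatted messages, e.g. for display in a widget."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(
            f"[{record.name}] {record.levelname}: {record.getMessage()}")
```

A graphical front end would attach its own handler in the same way and
render the collected messages in a dialog instead of a list.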

Bindings to other languages
---------------------------
The recommended language for development of APT2 applications is Vala.
I will also support C (of course, it's done automatically by valac) and
Python. Other languages may be supported using GObject-introspection,
e.g. JavaScript.

Licensing
---------
APT2 could be licensed under the terms of the GNU Lesser General Public
License, version 2.1 or (at your option) any later version. This
license is widely used in the GNOME world for libraries and since we
are using a lot of technologies coming from there, this seems to be a
good choice.

Another option is the Apache License 2.0, but its incompatibility with
version 2 of the GPL is not very helpful.

Progress and ToDo
-----------------

APT2 is moving very slowly. At the moment, the status is:

WORKING
    - Parser for /etc/apt/apt.conf and other configuration files
    - Parser for RFC 822-style tag files (although it will be rewritten)
    - Single-threaded file acquisition using GIO and libsoup, no
      support for authentication
PROGRESS
    - Parser for /etc/apt/sources.list and similar files
    - Repository handling, e.g. apt-get update can be done using my
      local branch.
TODO
    - Multi-threaded file acquisition using modules.
    - Progress reporting for file acquisition.
    - Support for PDiffs.
    - Taking care of integration with the GLib mainloop.
    - PackageCache, SourceCache, FileCache
    - Dependency solvers.
    - Final license decision

GOALS for 0.0.1:
    - COMMAND: apt-get install
    - COMMAND: apt-get source --download-only
    - COMMAND: apt-get update
    - External dependency solvers, possibly using the CUDF format. This
      was a feature request by zack.

LONG TERM GOALS:
    - Replace APT and aptitude on the command-line, and APT's libraries
    - Replace the D-Bus server provided by aptdaemon
    - Replace synaptic/gnome-app-install/software-center, or port
      software-center over from apt.
    - Get a native built-in SAT resolver.

Links
-----
[0] http://upsilon.cc/~zack/research/publications/mooml-iwoce-2009.pdf




Re: APT2, APT2, where are you?

by Stefano Zacchiroli

On Wed, Nov 04, 2009 at 06:18:05PM +0100, Julian Andres Klode wrote:
> Resolving dependencies, etc.
> ----------------------------
<snip>
> External dependency solvers will be supported. I had an e-mail
> conversation with zack on this topic, and he asked me whether this
> would be possible. I'm just waiting for his final proposal on this
> topic. This should probably be coordinated with cupt as well, so we can
> share external solvers. The exchange format could be CUDF[0].

I confirm that I'm active on this; we've mostly finished the spec for
CUDF. Once that is done (no more than a couple of weeks), I'll get back
to you with a format specification based on CUDF syntax (which is plain
text / RFC 822, don't worry about XML or similar stuff), but better
suited for Debian and with a connector to use it with any CUDF-based
solver.

If the interest in this topic is more general, we can also move the
discussion about this kind of stuff to this list.

Cheers.

--
Stefano Zacchiroli -o- PhD in Computer Science \ PostDoc @ Univ. Paris 7
zack@{upsilon.cc,pps.jussieu.fr,debian.org} -<>- http://upsilon.cc/zack/
Dietro un grande uomo c'è ..|  .  |. Et ne m'en veux pas si je te tutoie
sempre uno zaino ...........| ..: |.... Je dis tu à tous ceux que j'aime



Re: APT2, APT2, where are you?

by Michael Vogt

On Thu, Nov 05, 2009 at 09:39:36AM +0100, Stefano Zacchiroli wrote:

> On Wed, Nov 04, 2009 at 06:18:05PM +0100, Julian Andres Klode wrote:
> > Resolving dependencies, etc.
> > ----------------------------
> <snip>
> > External dependency solvers will be supported. I had an e-mail
> > conversation with zack on this topic, and he asked me whether this
> > would be possible. I'm just waiting for his final proposal on this
> > topic. This should probably be coordinated with cupt as well, so we can
> > share external solvers. The exchange format could be CUDF[0].
>
> I confirm that I'm active on this, we've mostly finished the spec for
> CUDF. Once that is done (no more than a couple of weeks), I'll get back
> to you with a format specification based on CUDF syntax (which is plain
> text / RFC 822, don't worry about XML or similar stuff), but better
> suited for Debian and with a connector to use it with any CUDF-based
> solvers.
>
> If the interest on this topic is more general, we can also move the
> discussion about this kind of stuff on this list.

Please do, I'm certainly interested in this. Having support for
external resolvers is something I would like to see in libapt as
well.

Cheers,
 Michael


--
To UNSUBSCRIBE, email to deity-REQUEST@...
with a subject of "unsubscribe". Trouble? Contact listmaster@...


RFC interaction with external dependency solver: Debian-CUDF

by Stefano Zacchiroli

[ Explicit Cc to the people who expressed interest in this topic.
  R-T/M-F-T set to deity@... as the most appropriate forum in which to
  discuss this. Let me know if I should keep the CC. ]

On Thu, Nov 05, 2009 at 01:19:21PM +0100, Michael Vogt wrote:

> On Thu, Nov 05, 2009 at 09:39:36AM +0100, Stefano Zacchiroli wrote:
> > On Wed, Nov 04, 2009 at 06:18:05PM +0100, Julian Andres Klode wrote:
> > > Resolving dependencies, etc.
> > > ----------------------------
> > > External dependency solvers will be supported. I had an e-mail
> > > conversation with zack on this topic, and he asked me whether this
> > > would be possible. I'm just waiting for his final proposal on this
> > > topic. This should probably be coordinated with cupt as well, so we can
> > > share external solvers. The exchange format could be CUDF[0].
> > I confirm that I'm active on this, we've mostly finished the spec for
> > CUDF. Once that is done (no more than a couple of weeks), I'll get back
> > to you with a format specification based on CUDF syntax (which is plain
> > text / RFC 822, don't worry about XML or similar stuff), but better
> > suited for Debian and with a connector to use it with any CUDF-based
> > solvers.
> I'm certainly interessted in this. Having support for external
> resolvers is something I would like to see in libapt as well.
I'm finally ready to propose a basic format and protocol for interaction
between package managers and (external) dependency solvers. The format
is called "Debian-CUDF" [4] and can describe solver input and
output. The protocol, with a bit of imagination, can be called
"Debian-CUDF protocol".

First some pointers and explanation.

CUDF itself [1] (which is not the same thing as "Debian-CUDF") is a
format to describe upgrade scenarios in a distro-independent way. As
such it takes care of complex stuff like the different dependency
semantics in dpkg vs RPM and version number normalization. The full
specification of CUDF is a rather lengthy PDF document [3] (not for the
faint of heart!), but there is now a very short primer available on the
web [2], which I recommend reading to anyone interested in discussing
this.

  [1] http://www.mancoosi.org/cudf/
  [2] http://www.mancoosi.org/cudf/primer/
  [3] http://www.mancoosi.org/reports/tr3.pdf

Within Debian (and derivatives) we know that the low-level package
manager is dpkg, hence to reuse solvers within Debian we can use a
format which more closely resembles our package list syntax and
semantics. Such a format is Debian-CUDF [4].

  [4] http://www.mancoosi.org/cudf/debian/
      (assumes basic CUDF knowledge, I suggest reading [2] first)

The idea is that package managers, as soon as they face a dependency
solving problem, will spawn an external process which reads a
Debian-CUDF document from stdin and returns on stdout a proposed
solution describing the new set of installed packages.
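
From the package manager's side, the batch case could look roughly like
the following sketch. The function name run_solver and the timeout are
assumptions, the document passed in is a placeholder rather than valid
Debian-CUDF, and a trivial stand-in process that echoes its input takes
the place of a real solver.

```python
import subprocess
import sys


def run_solver(command, document, timeout=60):
    """Feed `document` to the solver's stdin; return (status, stdout)."""
    proc = subprocess.run(
        command,
        input=document,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,  # the exit status is inspected by the caller
    )
    return proc.returncode, proc.stdout


# Stand-in "solver" for demonstration: copies stdin to stdout.
ECHO_SOLVER = [sys.executable, "-c",
               "import sys; sys.stdout.write(sys.stdin.read())"]

if __name__ == "__main__":
    status, solution = run_solver(ECHO_SOLVER, "placeholder: document\n")
    print(status, solution, end="")
```

Since only pipes are involved, any solver that follows the same
stdin/stdout convention can be swapped in by changing the command.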

Beyond that, several aspects still need to be fleshed out. A few that
come to mind:

- Error reporting: currently there is no description of how the
  solver reports errors to the package manager (e.g. unsolvable
  dependencies). It's rather easy to do, but requires consideration of
  the possible causes of errors and how the package manager wants to
  explain them to the user.

- Interactive solving: above I've implicitly assumed that solving is
  always batch, which is not the case. I've a basic interaction protocol
  that addresses this, but I would first like to agree on the batch case
  as most of the above will be reused in the interactive case.

- Performance: I have the feeling that a big pipe of plain text can be
  a bottleneck; we can imagine a binary equivalent, but first it'd be
  better to have numbers about the actual impact of the pipe.

- Implementation status:
  * Eugene used to have a preliminary implementation of this in CUPT,
    but due to my delay it has not been maintained. Eugene: do you
    think it would be easy to port the implementation to the above
    format?
  * Julian was waiting for me on a format proposal: here we go! :-)

- Solution description: the above returns the solution in "here is the
  new package status" style. Julian was instead proposing something like
  "here is a diff from the previous package status". Mine looks cleaner
  to me, but is obviously more bloated.

Comments on any of the above will be much appreciated!
Cheers.




Re: RFC interaction with external dependency solver: Debian-CUDF

by Julian Andres Klode

On Tuesday, 08.12.2009, at 12:04 +0100, Stefano Zacchiroli wrote:
>
> - Error reporting: currently there is no description about how the
>   solver reports error to the package manager (e.g. unsolvable
>   dependencies). It's rather easy to do, but requires consideration of
>   the possible cause of errors and how the package manager wants to
>   explain them to the user.
>
I recommend a simple text message in the response: a preamble with an
additional property describing the cause of the error.

>
> - Solution description: the above returns solution in "here is the
>   new package status" style. Julian was instead proposing something like
>   "here is a diff from the previous package status". Mine looks cleaner
>   to me, but obviously more bloated.
It might look cleaner, but obviously requires more data. It also does
not really match my plans which are based on changes instead of target
state (as partially described in 20091207172049.GA26830@...).

>
> Comments on any of the above will be much appreciated!
> Cheers.
>
What I am missing is multi-arch support, which is important for the
future. I don't really know about the status of the multiarch spec, but
I would like to see it supported. A package stanza in the solver output
would currently identify more than one exact package (it would identify
the package with the given version on all architectures).

Another thing is that there can be two packages with different
dependencies, but the same name and version. I propose an optional id
field which identifies a unique package. This field would be added to
the input and the output (and it could be a string, so we can pass
hashsums).

Both also apply to CUDF itself, as far as I can tell.
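
One way such an id could be derived is to hash the stanza contents, so
that two packages with the same name and version but different
dependencies get different ids. What exactly goes into the hash, and
the use of SHA-256, are assumptions of this sketch; the function name
stanza_id is hypothetical.

```python
import hashlib


def stanza_id(fields):
    """Return a hex id computed over all fields of a package stanza."""
    # Sort by lower-cased key so that field order and case of the
    # field names do not influence the id.
    canonical = "\n".join(
        f"{key.lower()}: {fields[key]}"
        for key in sorted(fields, key=str.lower))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

The solver would only need to copy the id from the input to the
output; it never has to interpret it.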


--
Julian Andres Klode  - Debian Developer, Ubuntu Member

See http://wiki.debian.org/JulianAndresKlode and http://jak-linux.org/.





Re: RFC interaction with external dependency solver: Debian-CUDF

by David Kalnischkies

Hi Stefano & deity@,

2009/12/8 Stefano Zacchiroli <zack@...>:
> [ Explicit Cc to the people which expressed interest in this topic.
>  R-T/M-F-T set to deity@... as the most appropriate forum where to
>  discuss this. Let me know if I should keep the CC. ]
I guess the three are subscribed to deity as this is the official
dependency hell here where we search for the deity of dep-resolving. ;)

> I'm finally ready to propose a basic format and protocol for interaction
> between package managers and (external) dependency solvers. The format
> is called "Debian-CUDF" [4] and can describe solver input and
> output. The protocol, with a bit of fantasy, can be called "Debian-CUDF
> protocol".
A bit off-topic:
A while back I read something about the "mancoosi" project and (I guess)
version 1 (the project now seems to be at v2). In this "old" version two
different formats were proposed: DUDF and CUDF. Am I right that
"Debian-CUDF" is more or less the Debian implementation of DUDF?

> The idea is that package managers, as soon as they face a dependency
> solving problem, will spawn an external process which reads from stdin a
> Debian-CUDF document and return on stdout a proposed solution describing
> the new set of installed packages.
So all the package manager has to do is ask the user for his wishes,
generate a Debian-CUDF document, let the resolver do its job,
understand its solution and apply it, right?

The problem is therefore reduced to choosing the right sources for inclusion,
but is this not already a problem the resolver should take care of?
I mean, the CUDF specification does not seem to include a way to say:
"Hey resolver, you can choose between 2 different versions while resolving,
but (if possible) please prefer this version over the other."
(basically what we love and hate here as apt_preferences, or pinning for short)
So a package manager would need to choose wisely which versions it should
pass to the resolver.
Example: A user who explicitly says he wants to install package A from
unstable should also have the right to say "resolve dependency B at all
costs, but if possible, choose the version from stable." Without resolving
on its own, a package manager can't tell whether it should include the
stable version of B, the unstable version of B, or both in the (Debian-)CUDF.
(Imagine A would be satisfied by both, but a new version of C requires
a specific one.)

I can think of a few more ways to influence the resolver (e.g. prefer hold
over remove) which would be great to include in a "standard", as the
whole concept of exchangeable resolvers suffers if I need to pass different
options to different resolvers to get the "same" result. Not to forget the
difference between people who want to install recommends/suggests and those
who don't. (Sorry if this is what Stefano means by "Interactive solving"
further down in his mail.)

> - Error reporting: currently there is no description about how the
>  solver reports error to the package manager (e.g. unsolvable
>  dependencies). It's rather easy to do, but requires consideration of
>  the possible cause of errors and how the package manager wants to
>  explain them to the user.
I think we should agree on some sort of exit-status error reporting:
the package manager can then display a common error message
depending on the exit status and optionally display a free-form description
which the resolver provided instead of the solution.

As far as I can imagine, we have in theory:
- an unavailable package
- a dependency on a too high/low version (not available?)
- an attempt to install two conflicting packages
and in practice a wild mixture of all these problems at once, resulting in
the uninstallability of one or more packages. So optionally we could also
agree on some sort of format to tell the package manager which
package could not be installed. The reason why would be interesting, but I
think it is in general too complex to be easily describable and
will not help many users anyway, so it could be included in the
free-form description.
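
On the package manager side, this could be sketched as a table from
exit status to a common message, with the resolver's free-form text
appended when present. The numeric codes below are invented for the
example; nothing in this thread fixes them.

```python
# Hypothetical exit codes for the three theoretical cases above.
EXIT_OK = 0
EXIT_UNAVAILABLE = 1
EXIT_VERSION = 2
EXIT_CONFLICT = 3

COMMON_MESSAGES = {
    EXIT_UNAVAILABLE: "A requested package is not available.",
    EXIT_VERSION: "A dependency needs a version that is not available.",
    EXIT_CONFLICT: "Two conflicting packages would have to be installed.",
}


def describe_failure(status, free_form=""):
    """Return the text to show to the user, or None on success."""
    if status == EXIT_OK:
        return None
    message = COMMON_MESSAGES.get(
        status, "The resolver failed for an unknown reason.")
    detail = free_form.strip()
    return f"{message}\n{detail}" if detail else message
```

The common messages can be translated by the package manager, while
the free-form part is passed through untouched.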

> - Solution description: the above returns solution in "here is the
>  new package status" style. Julian was instead proposing something like
>  "here is a diff from the previous package status". Mine looks cleaner
>  to me, but obviously more bloated.
I think ideally the resolver already knows what it has done to
achieve the "new" state (at least if it is not as greedy as APT ;) ),
therefore it should be easier for it to provide a status diff than
for the package manager to generate the diff and apply it.

> Comments on any of the above will be much appreciated!
I think Debian-CUDF should include an Architecture flag for Multi-Arch.
I am pretty sure Multi-Arch can be modeled in pure CUDF without
such a field, but I think this would involve quite a few conflict/break
dependencies and maybe a package rename to pkg-arch, and we
already agree here on a simplification...


Best regards,

David "DonKult" Kalnischkies




Re: RFC interaction with external dependency solver: Debian-CUDF

by Julian Andres Klode

On Tuesday, 08.12.2009, at 15:03 +0100, David Kalnischkies wrote:

> Hi Stefano & deity@,
>
> 2009/12/8 Stefano Zacchiroli <zack@...>:
> > [ Explicit Cc to the people which expressed interest in this topic.
> >  R-T/M-F-T set to deity@... as the most appropriate forum where to
> >  discuss this. Let me know if I should keep the CC. ]
> I guess the three are subscribed to deity as this is the official
> dependency hell here where we search for the deity of dep-resolving. ;)
>
> > I'm finally ready to propose a basic format and protocol for interaction
> > between package managers and (external) dependency solvers. The format
> > is called "Debian-CUDF" [4] and can describe solver input and
> > output. The protocol, with a bit of fantasy, can be called "Debian-CUDF
> > protocol".
> A bit off:
> A while back i read something about the "mancoosi" project and (i guess)
> version 1 (The project seems to be now at v2). In this "old" version two
> different formats where proposed DUDF and CUDF. Am i right that
> "Debian-CUDF" is more or less the debian implementation of DUDF?
>
> > The idea is that package managers, as soon as they face a dependency
> > solving problem, will spawn an external process which reads from stdin a
> > Debian-CUDF document and return on stdout a proposed solution describing
> > the new set of installed packages.
> So everything the Packagemanager has to do is asking the user for his
> wishes, generate a Debian-CUDF, let the resolver do his job,
> understand his solution and apply it, right?
>
> The problem is therefore reduced to choosing the right sources for inclusion,
> but is this not already a problem the resolver should take care of?
> I mean, the CUDF specification seems to not include a way to say:
> "Heh resolver, you can choose between 2 different versions while resolving,
> but (if possible) please prefer this version over the other."
> (basically what we love and hate here as apt_preferences or pinning in short)
> So a packagemanager would need to choose wisely which versions he should
> pass to the resolver.
> Example: A user who explicitly say he want to install package A from
> unstable should have the right to say also "resolve dependency B at all
> costs, but if possible, choose the version from stable." Without resolving
> on it's own a packagemanager can't say if he should include now the stable
> version of B or the unstable version of B or both in the (Debian-)CUDF.
> (imagine A would be satisfied by both, but a new Version of C requires
> a specific.)
>
> I can think of a few more ways to influence the resolver (e.g. prefer hold
> over remove) which would be great to be included in a "standard" as the
> whole concept of exchangeable resolvers suffers if i need to pass different
> options to different resolvers to get the "same" result. Note to forget the
> difference between people who want to install recommends/suggests or not.
> (sry if this was what Stefano means with "Interactive solving" further
> down in his mail).
I guess that's what MooML is for:
   http://www.mancoosi.org/papers/iwoce-2009-prefs.pdf
Creating MooML is easy, I think, but writing solvers that support it
could be hard.

Interactive resolving means proposing a solution to the user, who can
accept it or request a new one (similar to aptitude); it could also
be enhanced to ask the user what he wants, e.g. "Do you want package A
or B?".

>
> > - Error reporting: currently there is no description about how the
> >  solver reports error to the package manager (e.g. unsolvable
> >  dependencies). It's rather easy to do, but requires consideration of
> >  the possible cause of errors and how the package manager wants to
> >  explain them to the user.
> I think we should agree over some sort of exit-status error reporting:
> The packagemanager than can display a common error message
> depending on the exit-status and optional display a free form description
> directly from the resolver what he provided instead of the solution.
I don't need to differentiate between kinds of failures; my solver plans
have boolean return values.

>
> > - Solution description: the above returns solution in "here is the
> >  new package status" style. Julian was instead proposing something like
> >  "here is a diff from the previous package status". Mine looks cleaner
> >  to me, but obviously more bloated.
> I think ideally the resolver already knows what he has done to
> achieve the "new" state (at least if he is not as greedy as APT ;) ),
> therefore it should be easier for him to provide a status diff than
> for the packagemanager to generate the diff and apply the diff.
Exactly, especially since package managers normally work on a concept of
changesets; except for APT which has a more or less global
change-the-state-to-what-you-want system.

>
> > Comments on any of the above will be much appreciated!
> I think Debian-CUDF should include an Architecture flag for Multi-Arch.
> I am pretty sure Multi-Arch can be modeled in pure CUDF without
> such a field but i think this will include quite a few conflict/break
> dependencies and maybe a package rename to pkg-arch and we
> already agree here over a simplification...
As I wrote. But we also have to take into account that according to the
multi-arch spec, only certain packages support it.







Re: RFC interaction with external dependency solver: Debian-CUDF

by Stefano Zacchiroli

On Tue, Dec 08, 2009 at 02:48:47PM +0100, Julian Andres Klode wrote:
> > - Error reporting: currently there is no description about how the
> >   solver reports error to the package manager (e.g. unsolvable
> >   dependencies). It's rather easy to do, but requires consideration of
> >   the possible cause of errors and how the package manager wants to
> >   explain them to the user.
> I recommend as simple text message response encoded in the preamble
> I suggest a response consisting of a preamble with an additional
> property describing the cause of the error.

In fact, it would be a bit more complex than that. Or, better, we can
start like that, but then we will need some more structured information
(in particular for the interactive part). In particular, when a request
cannot be satisfied, we will at least need to distinguish among the
following cases (there might be more):

- a package is missing
- a conflict among n packages is in place

Anyhow, let's say we start with something like the following, which is
returned by the solver in case of errors:

  error:
  message: some explanatory text
   possibly continuing there

That way we preserve the structure of the 822-like format and we can add
more structured information later on.

Note that CUDF strings at the moment do not contain newlines (continuation
lines are joined together, suppressing newlines and leading white
space). So to structure messages we would need some different kind of
convention.
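
A reader for such an error reply could be sketched as follows. The
function name is hypothetical, and one detail is an assumption: when a
continuation line is joined to the previous one, this sketch inserts a
single space so that words do not run together, which is not something
stated above.

```python
def parse_error_stanza(text):
    """Parse an 822-like reply into a dict of lower-cased field names."""
    fields, last = {}, None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t" and last is not None:
            # Continuation line: drop the newline and leading white
            # space, then join (with one space, see the note above).
            fields[last] = (fields[last] + " " + line.strip()).strip()
        else:
            key, _, value = line.partition(":")
            last = key.strip().lower()
            fields[last] = value.strip()
    return fields


def is_error(fields):
    """A reply is an error if it carries the (empty) "error" field."""
    return "error" in fields
```

For the example above, the result has an empty "error" field and a
"message" field holding the explanatory text on a single line.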

> > - Solution description: the above returns solution in "here is the
> >   new package status" style. Julian was instead proposing something like
> >   "here is a diff from the previous package status". Mine looks cleaner
> >   to me, but obviously more bloated.
> It might look cleaner, but obviously requires more data. It also does
> not really match my plans which are based on changes instead of target
> state (as partially described in 20091207172049.GA26830@...).

I agree it will require more data and I was aware of your plans: still,
what I posted is what I had ready, so I posted it nevertheless :-)
Let's try to look at the actual merits.

The required data is anyhow comparable in size to the data sent as
solver input; whether it will be a bottleneck or not is something
we don't know yet. If it turns out to be very cheap, the diff might be a
useless optimization.

Also, regarding the matching with your plans, AFAIU APT2 will anyhow
play man-in-the-middle between the solver and the low-level package
manager. Are you trying to use an isomorphic format so that you can
simply pipe solver output to low-level package manager?

Final note (that partly explains my proposal): for most solving
techniques, the output I proposed is closer to what they internally
have. Having a diff there would mean that they (or an intermediate
pipeline component) have to compute the "diff" to generate the output
you want. The question is then who should better compute the "diff": the
solver or the package manager?

Anyhow, I don't have a strong opinion on this, I'd be fine with a diff
too, I'd just like to reach an agreement among involved parties. Can you
please suggest a 822-based diff format more or less in line with the
syntax I've shown?

> I am missing multi-arch support, which is important for the future. I
> don't really know about the status of the multiarch spec, but I would
> like to see it supported. A package stanza in the solver output would
> currently identify more than one exact package (it would identify the
> package with the given version on all architectures).

Unfortunately, I'm not familiar with the status of the multiarch spec
either. Can someone point us to what impact multiarch will have in
terms of dependency solving?

> Another thing is that there can be two packages with different
> dependencies, but the same name and version. I propose an optional id
> field which identifies an unique package. This field would be added to
> the input and the output (and it could be string, so we can pass
> hashsums).

Right, good point.  This can be a bit tricky as currently the only (and
forcibly necessary) package identifier in CUDF model is <package,
version>. The id would not work though, consider the following example:

  package: a
  version: 1
  depends: x

  package: a
  version: 1
  depends: y

  package: a
  version: 2
  depends: z

  package: b
  version: 1
  depends: a

In expanding the dependencies of b, solvers will internally consider
them as something like "depends: a = 1 | a = 2", but here such a spec
would not be enough because which <a,1> is chosen makes a difference.
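
To make the ambiguity concrete, here is a small Python sketch over the four
stanzas above. The tuple layout and the function names are assumptions made
for this illustration only; no real solver is claimed to work this way.

```python
from collections import defaultdict

# The four stanzas from the example, as (package, version, depends) tuples.
UNIVERSE = [
    ("a", 1, "x"),
    ("a", 1, "y"),
    ("a", 2, "z"),
    ("b", 1, "a"),
]


def expand_dependency(name, universe):
    """Expand an unversioned dependency into <package, version> alternatives."""
    versions = sorted({ver for pkg, ver, _ in universe if pkg == name})
    return " | ".join(f"{name} = {ver}" for ver in versions)


def ambiguous_identifiers(universe):
    """Return the <package, version> pairs that name more than one stanza."""
    seen = defaultdict(list)
    for pkg, ver, dep in universe:
        seen[(pkg, ver)].append(dep)
    return {ident: deps for ident, deps in seen.items() if len(deps) > 1}


if __name__ == "__main__":
    print(expand_dependency("a", UNIVERSE))
    print(ambiguous_identifiers(UNIVERSE))
```

The expansion only ever mentions "a = 1" once, while the second function
shows that this identifier covers two stanzas with different dependencies.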

So, my actual solution is to use an id as you propose, but to mangle the
package name or the version with it, e.g.: a%id1, a%id2, ...

How often does that happen in practice?
How is it currently handled in apt and friends?


Thanks for your comments!
Cheers.

--
Stefano Zacchiroli -o- PhD in Computer Science \ PostDoc @ Univ. Paris 7
zack@{upsilon.cc,pps.jussieu.fr,debian.org} -<>- http://upsilon.cc/zack/
Dietro un grande uomo c'è ..|  .  |. Et ne m'en veux pas si je te tutoie
sempre uno zaino ...........| ..: |.... Je dis tu à tous ceux que j'aime



Re: RFC interaction with external dependency solver: Debian-CUDF

by Stefano Zacchiroli

On Tue, Dec 08, 2009 at 03:03:14PM +0100, David Kalnischkies wrote:
> > [ Explicit Cc to the people which expressed interest in this topic.
> >  R-T/M-F-T set to deity@... as the most appropriate forum where to
> >  discuss this. Let me know if I should keep the CC. ]
> I guess the three are subscribed to deity as this is the official
> dependency hell here where we search for the deity of dep-resolving. ;)

OK, let's drop all Cc-s then :-)

> A while back i read something about the "mancoosi" project and (i guess)
> version 1 (The project seems to be now at v2). In this "old" version two
> different formats were proposed, DUDF and CUDF. Am i right that
> "Debian-CUDF" is more or less the debian implementation of DUDF?

In principle yes, but the purposes are a bit different.
Short story: ignore DUDF, we won't use it here.

Long story: DUDF is meant to submit problems from user machines to a
central repository over the net (a la popcon). So, in that format, we do
support optimizations like sending a checksum of the APT package lists,
so that the receiver looks up the checksums in a historical database and
expands them. Here we don't care about this, because the solvers really
need the full, bloated data.  Also, the DUDF format is in fact very
under-defined and does not have clear semantics; it is just an XML
skeleton and not much more.

> So everything the Packagemanager has to do is asking the user for his
> wishes, generate a Debian-CUDF, let the resolver do his job,
> understand his solution and apply it, right?

Yes, written this way it looks easy :-), but the details are still
tricky (as I'm sure this thread will show :-)).

> The problem is therefore reduced to choosing the right sources for inclusion,
> but is this not already a problem the resolver should take care of?

Julian here is right that that role will be fulfilled by MooML (which we
don't have yet). Ideally, with CUDF you just express the user request
and the format semantics ensure you don't return unsound solutions
(i.e. solutions that violate dependency constraints).

Additionally (by using another property in the "request:" stanza I've
already shown) you can specify extra constraints (e.g. don't install
packages maintained by "Foo Bar" because I don't trust him) or
optimization criteria (e.g. among all possible solutions, choose the one
which minimizes the installed-size).
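
Purely as an illustration of what such a constraint and criterion mean, here
is a Python sketch that picks one solution from an explicit list of already
sound solutions. Real solvers do not enumerate solutions like this; the
function name, the dict keys and the parameters are all assumptions made for
the example.

```python
def choose_solution(solutions, forbidden_maintainers=(), key="installed-size"):
    """Pick the allowed solution with the smallest total value of `key`.

    `solutions` is a list of solutions, each a list of package dicts.
    A solution is discarded if any of its packages is maintained by
    someone in `forbidden_maintainers`. Returns None if nothing is left.
    """
    allowed = [
        sol for sol in solutions
        if not any(pkg.get("maintainer") in forbidden_maintainers for pkg in sol)
    ]
    if not allowed:
        return None
    return min(allowed, key=lambda sol: sum(pkg[key] for pkg in sol))


if __name__ == "__main__":
    candidates = [
        [{"package": "a", "installed-size": 100, "maintainer": "Foo Bar"}],
        [{"package": "d", "installed-size": 200, "maintainer": "Jane Doe"}],
    ]
    print(choose_solution(candidates, forbidden_maintainers={"Foo Bar"}))
```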

Pinning can be expressed in such a way.

You might wonder whether this divide and conquer approach is a bit too
complex, but in fact proper dependency solving is *terribly* complex and
I'm trying to address (from a package manager engineering point of view)
one concern at a time. Still, we've already thought about these kinds of
issues and we haven't simply ignored them. We believe they can be
plugged in one step at a time. The link about MooML given by Julian is a
good pointer to get the "big picture" all at once.

> I think we should agree over some sort of exit-status error reporting:
> The packagemanager than can display a common error message
> depending on the exit-status and optional display a free form description
> directly from the resolver what he provided instead of the solution.

With my proposal of returning an "error: " stanza, I believe that we can
consider that "ordinary" error scenarios will be handled irrespective
of the exit code (or, if you want, by relying on an exit code that is
always 0). If we agree on that, all other exit codes indicate very
exceptional situations that can be described as "can't start the
dependency solver".
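
On the package manager side, this convention could be handled roughly as in
the following Python sketch. The exception classes and the function name are
hypothetical, the error stanza is assumed to start in the first column, and
only the first line of the message is read.

```python
class SolverStartError(Exception):
    """The solver could not be run at all (non-zero exit code)."""


class UnsolvableRequest(Exception):
    """The solver ran and answered with an "error:" stanza."""


def interpret_solver_result(exit_code, output):
    """Return the solver output, or raise according to the convention above."""
    if exit_code != 0:
        raise SolverStartError(f"solver exited with status {exit_code}")
    lines = [ln for ln in output.splitlines() if ln.strip()]
    if lines and lines[0].strip() == "error:":
        # Take the text of the first "message:" property, if there is one.
        message = next(
            (ln.partition(":")[2].strip()
             for ln in lines[1:] if ln.startswith("message:")),
            "",
        )
        raise UnsolvableRequest(message)
    return output


if __name__ == "__main__":
    try:
        interpret_solver_result(0, "error:\nmessage: package foo is missing\n")
    except UnsolvableRequest as exc:
        print("unsolvable:", exc)
```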

> - an unavailable package
> - a dependency on a too high/low version (not available ?)
> - a try to install two conflicting packages
> and in practice a wild mixture of all these problems at once resulting in
> the uninstallability of one or more packages, so optional we could also
> agree over some sort of format to say the packagemanager which
> package could not be installed. A why would be interesting but i

Exactly: the error response is in fact a complex object, so I believe we
should push most of its details in the return format rather than using a
plethora of different exit codes. YMMV.

> I think ideally the resolver already knows what he has done to
> achieve the "new" state (at least if he is not as greedy as APT ;) ),
> therefore it should be easier for him to provide a status diff than
> for the packagemanager to generate the diff and apply the diff.

Well, a lot of solvers we can apply here have in fact a global vision of
all constraints and don't work by changing the current status, but
rather by finding a satisfactory solution. That's why they are closer to
the "new state" approach than to the "diff" one, but that's not a big
deal: for such solvers we can compute the diff before sending it to APT
(or equivalent).

Anyone around who would actually prefer the "new status" approach?  If
there isn't anyone, I guess the consensus is quite clear already :-)

Cheers.

--
Stefano Zacchiroli -o- PhD in Computer Science \ PostDoc @ Univ. Paris 7
zack@{upsilon.cc,pps.jussieu.fr,debian.org} -<>- http://upsilon.cc/zack/
Dietro un grande uomo c'è ..|  .  |. Et ne m'en veux pas si je te tutoie
sempre uno zaino ...........| ..: |.... Je dis tu à tous ceux que j'aime



Re: RFC interaction with external dependency solver: Debian-CUDF

by Julian Andres Klode-4

On Tuesday, 08.12.2009 at 16:02 +0100, Stefano Zacchiroli wrote:

> On Tue, Dec 08, 2009 at 02:48:47PM +0100, Julian Andres Klode wrote:
> > > - Error reporting: currently there is no description about how the
> > >   solver reports error to the package manager (e.g. unsolvable
> > >   dependencies). It's rather easy to do, but requires consideration of
> > >   the possible cause of errors and how the package manager wants to
> > >   explain them to the user.
> > I suggest a response consisting of a preamble with an additional
> > property describing the cause of the error.
>
> In fact, it would be a bit more complex than that. Or, better, we can
> start like that, but then we will need some more structured information
> (in particular for the interactive part). In particular, when a request
> is not addressable, we will at list need to distinguish among the
> following cases (there might be more):
>
> - a package is missing
> - a conflict among n packages is in place
>
> Anyhow, let's say we start with something like the following, which is
> returned by the package in case of errors:
>
>   error:
>   message: some explanatory text
>    possibly continuing there
>
> That way we preserve the structure of 822-like format and we can add
> more structured information later on.
>
> Note that CUDF strings at the moment do not contain newlines (line
> continuation are joined together suppressing newlines and leading white
> spaces). So to structure message we would need some different kind of
> convention.
Like for debian/control: a line with a space and a dot. We can use exactly
the same format as we use for package descriptions.

>
> > > - Solution description: the above returns solution in "here is the
> > >   new package status" style. Julian was instead proposing something like
> > >   "here is a diff from the previous package status". Mine looks cleaner
> > >   to me, but obviously more bloated.
> > It might look cleaner, but obviously requires more data. It also does
> > not really match my plans which are based on changes instead of target
> > state (as partially described in 20091207172049.GA26830@...).
>
> I agree it will require more data and I was aware of your plans: still
> what I posted is what I had ready and then I posted it nevertheless :-)
> Let's try to look at the actual merits.
>
> The required data are anyhow comparable in size to the data sent as
> solver input, whether it will be a bottleneck or not is something which
> we don't know yet. If it turns out to be very cheap, it might be an
> useless optimization.
>
> Also, regarding the matching with your plans, AFAIU APT2 will anyhow
> play man-in-the-middle between the solver and the low-level package
> manager. Are you trying to use an isomorphic format so that you can
> simply pipe solver output to low-level package manager?
No, but I create the new objects for each change, using a subset of the
attributes needed for the request.

>
> Final note (that partly explain my proposal): for most solving
> techniques, the output I proposed is closer to what they internally
> have. Having a diff there would mean that they (or an intermediate
> pipeline component) have to compute the "diff" to generate the output
> you want. The question is then who should better compute the "diff": the
> solver or the package manager?
Creating the diff in the solver would be more efficient in the cases
where the solver already has a diff, and as efficient as computing the
diff in the package manager if the solver works with the state approach.
This means it is at least as efficient as computing it in the package
manager.

>
> Anyhow, I don't have a strong opinion on this, I'd be fine with a diff
> too, I'd just like to reach an agreement among involved parties. Can you
> please suggest a 822-based diff format more or less in line with the
> syntax I've shown?
Just return something similar to the request stanza or several stanzas
in the form:

    package: name
    version: version
    action: <install/remove/upgrade..>

In this case, "action" would give the choice between values like
install, remove, upgrade (just like the enum I wrote in the other
message I referred to).
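
A minimal Python sketch of how a package manager might read such stanzas and
apply them to its view of the installed packages. The Action enum here is a
stand-in with only the three values named above (the enum from the other
message is not reproduced), the function names are made up, and the sketch
assumes at most one installed version per package.

```python
import re
from enum import Enum


class Action(Enum):
    INSTALL = "install"
    REMOVE = "remove"
    UPGRADE = "upgrade"


def parse_actions(text):
    """Parse blank-line separated package/version/action stanzas."""
    actions = []
    for block in re.split(r"\n\s*\n", text.strip()):
        fields = {}
        for line in block.splitlines():
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        actions.append(
            (fields["package"], fields["version"], Action(fields["action"]))
        )
    return actions


def apply_actions(installed, actions):
    """Return a new {package: version} mapping after applying the actions."""
    result = dict(installed)
    for package, version, action in actions:
        if action is Action.REMOVE:
            result.pop(package, None)
        else:
            # install and upgrade both leave `version` installed
            result[package] = version
    return result


if __name__ == "__main__":
    diff = "package: foo\nversion: 2\naction: upgrade\n\n" \
           "package: bar\nversion: 1\naction: remove\n"
    print(apply_actions({"foo": "1", "bar": "1"}, parse_actions(diff)))
```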

>
> > I am missing multi-arch support, which is important for the future. I
> > don't really know about the status of the multiarch spec, but I would
> > like to see it supported. A package stanza in the solver output would
> > currently identify more than one exact package (it would identify the
> > package with the given version on all architectures).
>
> Unfortunately, I'm not familiar with the status of the multiarch spec
> either. Can someone point us at which impact will multiarch have in
> terms of dependency solving?
The thing is specified on https://wiki.ubuntu.com/MultiarchSpec, if you
want to have a look at it.

>
> > Another thing is that there can be two packages with different
> > dependencies, but the same name and version. I propose an optional id
> > field which identifies an unique package. This field would be added to
> > the input and the output (and it could be string, so we can pass
> > hashsums).
>
> Right, good point.  This can be a bit tricky as currently the only (and
> forcibly necessary) package identifier in CUDF model is <package,
> version>. The id would not work though, consider the following example:
>
>   package: a
>   version: 1
>   depends: x
>
>   package: a
>   version: 1
>   depends: y
>
>   package: a
>   version: 2
>   depends: z
>
>   package: b
>   version: 1
>   depends: a
>
> In expanding the dependencies of b, solvers will internally consider
> them as something like "depends: a = 1 | a = 2", but here such a spec
> would not be enough because which <a,1> is chosen makes a difference.
I just want to know the exact package afterwards, so I can download the
correct one. And with an ID, I would know which <a,1> to install.

>
> So, my actual solution is to use an id as you propose, but to mangle the
> package name or the version with it, e.g.: a%id1, a%id2, ...
>
> How often does that happen in practice?
> How is it currently handled in apt and friends?
PackageKit uses this one:
http://www.packagekit.org/gtk-doc/concepts.html#introduction-ideas-packageid
but I guess this is also too incomplete for those cases.

--
Julian Andres Klode  - Debian Developer, Ubuntu Member

See http://wiki.debian.org/JulianAndresKlode and http://jak-linux.org/.




Re: RFC interaction with external dependency solver: Debian-CUDF

by Eugene V. Lyubimkin-2

Hello thread.

Stefano Zacchiroli wrote:
> [ Explicit Cc to the people which expressed interest in this topic.
>   R-T/M-F-T set to deity@... as the most appropriate forum where to
>   discuss this. Let me know if I should keep the CC. ]
Thanks. Yes, I'm subscribed to deity- :)

> - Error reporting: currently there is no description about how the
>   solver reports error to the package manager (e.g. unsolvable
>   dependencies). It's rather easy to do, but requires consideration of
>   the possible cause of errors and how the package manager wants to
>   explain them to the user.
Some error stanzas were proposed in the thread; they look fine to me.

> - Interactive solving: above I've implicitly assumed that solving is
>   always batch, which is not the case. I've a basic interaction protocol
>   that addresses this, but I would first like to agree on the batch case
>   as most of the above will be reused in the interactive case.
Skipping that before other things got agreed on...

> - Performances: I've the feeling that a big pipe of plain text can be a
>   bottleneck; we can image a binary equivalent, but first it'd be better
>   to have numbers about the actual impact of the pipe.
My position on this is the same as before: not a problem :)

> - Implementations status:
>   * Eugene used to have a preliminary implementation of this in CUPT,
>     then due to my delay it has not been maintained. Eugene: do you
>     think it would be easy to port the implementation to the above
>     format?
I maintained that external resolver wrapper in Cupt, so it should work with
the copy of the CUDF protocol you had when I wrote my last mail in our
thread some months ago. Can you provide a formal or informal list of changes
in CUDF, so I don't need to read both documents fully to find the protocol
changes since that time?

>   * Julian was waiting for me on a format proposal: here we go! :-)
>
> - Solution description: the above returns solution in "here is the
>   new package status" style. Julian was instead proposing something like
>   "here is a diff from the previous package status". Mine looks cleaner
>   to me, but obviously more bloated.
I disagree. In case we upgrade something like 2/3 of the system or more, the
diff is even bigger, not to mention that it is more complicated,
without any practical gain IMO.

--
Eugene V. Lyubimkin aka JackYF, JID: jackyf.devel(maildog)gmail.com
C++/Perl developer, Debian Developer




Re: RFC interaction with external dependency solver: Debian-CUDF

by Eugene V. Lyubimkin-2

Stefano Zacchiroli wrote:
>> Another thing is that there can be two packages with different
>> dependencies, but the same name and version. I propose an optional id
>> field which identifies an unique package. This field would be added to
>> the input and the output (and it could be string, so we can pass
>> hashsums).
>
> Right, good point.  This can be a bit tricky as currently the only (and
> forcibly necessary) package identifier in CUDF model is <package,
> version>. The id would not work though, consider the following example:
[...]
> How is it currently handled in apt and friends?
Cupt uniquely identifies the version entry by the pair (package_name,
version). Hence, I disagree with adding some additional 'id's or something to
the spec, it's ugly and not necessary IMO.

--
Eugene V. Lyubimkin aka JackYF, JID: jackyf.devel(maildog)gmail.com
C++/Perl developer, Debian Developer

Re: RFC interaction with external dependency solver: Debian-CUDF

by Eugene V. Lyubimkin-2

Stefano Zacchiroli wrote:
> Anyone around which actually would prefer the "new status" approach?  If
> there isn't anyone, I guess the consensus is quite clear already :-)
As I already said in past private discussions, I prefer the "new status"
approach to the diff. The external resolver provides the right (in the
sense of dependencies) target set of packages. So, the result should IMO
be a set of packages, not something else.

--
Eugene V. Lyubimkin aka JackYF, JID: jackyf.devel(maildog)gmail.com
C++/Perl developer, Debian Developer




Re: RFC interaction with external dependency solver: Debian-CUDF

by Stefano Zacchiroli

On Tue, Dec 08, 2009 at 06:21:09PM +0100, Julian Andres Klode wrote:
> > Note that CUDF strings at the moment do not contain newlines (line
> > continuation are joined together suppressing newlines and leading white
> > spaces). So to structure message we would need some different kind of
> > convention.
> Like for debian/control, line with space and dot. We can use exactly the
> same format as we use for package descriptions.

No, not really: deb822 has a notion of multi-line stanza AFAIR; without
that it would be impossible, for instance, to distinguish between short
and long descriptions. cudf822 does not have anything like that, to KISS,
but it's not a big deal.

> > Final note (that partly explain my proposal): for most solving
> > techniques, the output I proposed is closer to what they internally
> > have. Having a diff there would mean that they (or an intermediate
> > pipeline component) have to compute the "diff" to generate the output
> > you want. The question is then who should better compute the "diff": the
> > solver or the package manager?
> Creating the diff in the solver would be more efficient for some cases
> where the solver somehow has a diff, but as efficient as computing the
> diff in the package manager if the solver works with the state approach.
> This means  it is at least as efficient as computing it in the package
> manager.
Agreed. Still, I believe we really want consensus here, Eugene has
spoken for the "new state" format, and I'm not sure I want to end up
with an initial message of the solver requesting which output it wants.

What is the take of current apt-get/aptitude developers on this?
Would you prefer diff or "new state" approach?

> Just return something similar to the request stanze or several stanzas
> in the form:
>
>     package: name
>     version: version
>     action: <install/remove/upgrade..>
>
> In this case, "action" would give the choice between values like
> install, remove, upgrade (just like the enum I wrote in the other
> message I refered to).
That's a reasonable possibility. Still, I've a few additional remarks
that should be clear *if* we go that way.

The only meaningful actions for a CUDF-based diff are install/remove;
an upgrade at the CUDF level is just a combination of a removal and an
install (the difference is important, as CUDF also supports package
managers where you can install multiple versions of the same package, and
there "upgrade" has no clear semantics).

The *order* in which install/remove should be performed is not something
CUDF knows about, i.e. the dependency loop breaking logic is not
currently handled. That will remain in the package manager anyhow. Are
we OK with that?
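
As a sketch of what an install/remove-only diff could look like, assuming
both states are given as sets of (package, version) pairs, here is some
Python. The function names are invented for this example; the order of the
resulting list is just removals before installs, sorted by name, and says
nothing about the order in which the package manager should act.

```python
def compute_diff(old, new):
    """Diff two sets of (package, version) pairs into install/remove actions.

    An upgrade shows up as a removal of the old pair plus an install of
    the new one, so several installed versions of one package work too.
    """
    removals = [("remove", pkg, ver) for pkg, ver in sorted(old - new)]
    installs = [("install", pkg, ver) for pkg, ver in sorted(new - old)]
    return removals + installs


def format_diff(diff):
    """Render the actions as blank-line separated 822-like stanzas."""
    return "\n\n".join(
        f"package: {pkg}\nversion: {ver}\naction: {action}"
        for action, pkg, ver in diff
    )


if __name__ == "__main__":
    old = {("a", 1), ("b", 1)}
    new = {("a", 2), ("b", 1), ("c", 1)}
    print(format_diff(compute_diff(old, new)))
```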

> > Unfortunately, I'm not familiar with the status of the multiarch spec
> > either. Can someone point us at which impact will multiarch have in
> > terms of dependency solving?
> The thing is specified on https://wiki.ubuntu.com/MultiarchSpec, if you
> want to have a look at it.

OK, will do and get back to that as soon as I'm more knowledgeable on
the topic. In the meantime I suggest we continue considering the
mono-arch case.

> > > Another thing is that there can be two packages with different
> > > dependencies, but the same name and version. I propose an optional id
<snip>
> I just want to know the exact package afterwards, so I can download the
> correct one. And with an ID, I would know which <a,1> to install.

Yep, still the numerical impact / frequency of such cases is important
to understand how to deal with this. If, as I hope, it is not that relevant,
I would prefer to handle it by package name mangling, performed
in the package manager *before* feeding data into the solver.

For instance a package manager can use "a,1" as the identifier for the
first package a at version 1 it encounters, and then "a%id1,1" and so on
for the subsequent ones. Note that this opens a whole lot of problems that
the package manager will need to solve anyhow, e.g.:

- which reference to "a,1" should be rewritten?
  - all those in the same package universe?
  - all of them using | ?

even considering all "a,1" as equal has problems: which policy do you
use to choose which one in the end you want to install? is that policy
clear to the user?
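
The mangling step itself is small; a Python sketch under the naming scheme
above could look like this. The function names are hypothetical, and the
sketch deliberately leaves dependencies untouched, since which references
to rewrite is exactly the open question listed above.

```python
def mangle_duplicates(stanzas):
    """Give every repeated <package, version> pair a distinct package name.

    The first stanza with a given pair keeps its name; later ones become
    name%id1, name%id2, ... Dependencies are not rewritten here.
    """
    counts = {}
    mangled = []
    for stanza in stanzas:
        ident = (stanza["package"], stanza["version"])
        n = counts.get(ident, 0)
        counts[ident] = n + 1
        name = stanza["package"] if n == 0 else f"{stanza['package']}%id{n}"
        mangled.append({**stanza, "package": name})
    return mangled


def unmangle(name):
    """Recover the original package name from a possibly mangled one."""
    return name.split("%id", 1)[0]


if __name__ == "__main__":
    universe = [
        {"package": "a", "version": 1, "depends": "x"},
        {"package": "a", "version": 1, "depends": "y"},
        {"package": "a", "version": 2, "depends": "z"},
    ]
    for stanza in mangle_duplicates(universe):
        print(stanza["package"], stanza["version"])
```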

In fact, I believe this requirement is very ugly, after all we are
trying to build coherent distributions where it is reasonable to expect
package name/versions are unique. The only reasonable case is where the
locally installed package is different from one from the distro with the
same version. But AFAICT current apt-get handles this case by always
reinstalling such a package with the same name/version package from the
archive, which is a reasonable behavior.

> > How often does that happen in practice?
> > How is it currently handled in apt and friends?
> PackageKit uses this one:
> http://www.packagekit.org/gtk-doc/concepts.html#introduction-ideas-packageid
> , but I guess this is also to incomplete for those cases.

Well, in general I believe PackageKit is very naive about a lot
(i.e. too many) issues related to dependency handling, for instance the
whole idea of "we don't need conflicts" looks very naive to me. Anyhow
what that page does not tell is what dependencies mean with respect to
that "packageid". For instance is a dependency on package "a" matched by
all packages called "a" no matter their data?

Cheers.

--
Stefano Zacchiroli -o- PhD in Computer Science \ PostDoc @ Univ. Paris 7
zack@{upsilon.cc,pps.jussieu.fr,debian.org} -<>- http://upsilon.cc/zack/
Dietro un grande uomo c'è ..|  .  |. Et ne m'en veux pas si je te tutoie
sempre uno zaino ...........| ..: |.... Je dis tu à tous ceux que j'aime



Re: RFC interaction with external dependency solver: Debian-CUDF

by Stefano Zacchiroli

On Tue, Dec 08, 2009 at 08:33:49PM +0200, Eugene V. Lyubimkin wrote:
> Cupt uniquely identifies the version entry by the pair (package_name,
> version). Hence, I disagree with adding some additional 'id's or something to
> the spec, it's ugly and not necessary IMO.

So, how does cupt currently handle locally installed packages which
might have been rebuilt by users (who forgot to change the version
number)?  Are you able to decide whether they should be kept or not?  Do
you reinstall them anyhow with the version from the archive?

Cheers.

--
Stefano Zacchiroli -o- PhD in Computer Science \ PostDoc @ Univ. Paris 7
zack@{upsilon.cc,pps.jussieu.fr,debian.org} -<>- http://upsilon.cc/zack/
Dietro un grande uomo c'è ..|  .  |. Et ne m'en veux pas si je te tutoie
sempre uno zaino ...........| ..: |.... Je dis tu à tous ceux que j'aime



Re: RFC interaction with external dependency solver: Debian-CUDF

by Stefano Zacchiroli

On Tue, Dec 08, 2009 at 08:33:15PM +0200, Eugene V. Lyubimkin wrote:
> I maintained that external resolver wrapper in Cupt so it should work for the
> that copy of CUDF protocol you had when I wrote my last mail in those our
> thread some months ago. Can you provide a formal or informal list of changes
> in CUDF so I don't need to read all two documents fully to find the protocol
> changes since that time?

Ah, well, the CUDF pdf spec [1] has a full "changelog" section already,
but it is a diff between CUDF 1.0 and CUDF 2.0. I suggest starting with a
more pragmatic approach: send me (in private mail) a sample output of
what cupt would currently send to the external package manager and I'll
pinpoint needed adjustments. Deal? :)

Cheers.

[1] http://www.mancoosi.org/reports/tr3.pdf

--
Stefano Zacchiroli -o- PhD in Computer Science \ PostDoc @ Univ. Paris 7
zack@{upsilon.cc,pps.jussieu.fr,debian.org} -<>- http://upsilon.cc/zack/
Dietro un grande uomo c'è ..|  .  |. Et ne m'en veux pas si je te tutoie
sempre uno zaino ...........| ..: |.... Je dis tu à tous ceux que j'aime



Re: RFC interaction with external dependency solver: Debian-CUDF

by Julian Andres Klode-4

On Tuesday, 08.12.2009 at 20:33 +0200, Eugene V. Lyubimkin wrote:

>
> >   * Julian was waiting for me on a format proposal: here we go! :-)
> >
> > - Solution description: the above returns solution in "here is the
> >   new package status" style. Julian was instead proposing something like
> >   "here is a diff from the previous package status". Mine looks cleaner
> >   to me, but obviously more bloated.
> I disagree. In case we upgrade something like 2/3 of the system and more, the
> diff approach is even bigger, even without a fact it's more complicated
> without any practical gain IMO.
We have to tell dpkg remove this package/install this package, and not
"bring us to the following state". Thus, I think that receiving those
actions from the solver makes things easier, and not more complicated.

And if one uses a database as the cache in the package manager,
generating the diff in the PM is certainly less efficient than
generating it in the solver which should have the information in-memory,
whereas the PM has to do a database lookup first, selecting all
installed packages and passing this forward.

I have not decided about cache storage yet, candidates are:
    1. (bin) SQLite3 database backend
    2. (bin) custom binary format (or a binary implementation of CUDF)
    3. (txt) storage in CUDF-derived format
    4. (txt) storage in Debian format
    5. (txt) storage in JSON (JavaScript Object Notation)

Obviously, option 3 seems to be the best for external solvers, as we
could just pass them the cache directly. Option 5 also seems
interesting, given that there already is libpersistence on
git.freesmartphone.org[1]. But it currently takes about 2 seconds to
load a JSON cache for Debian sid, which is not very efficient, and all
data would be kept in memory and dumped to disk on exit. The SQLite
version is best suited for internal solvers, as they can use the
power of the database to find the packages they are looking for (e.g.
SELECT * FROM packages WHERE name='%s' AND architecture='%s').
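
A minimal sketch of such a lookup, using Python's standard sqlite3 module
with bound parameters instead of string formatting. The table layout and
the helper name find_packages are assumptions for this example; no cache
schema has been decided.

```python
import sqlite3

# In-memory database standing in for the package cache.
# The schema is hypothetical; it only mirrors the query quoted above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE packages (name TEXT, version TEXT, architecture TEXT)"
)
conn.executemany(
    "INSERT INTO packages VALUES (?, ?, ?)",
    [
        ("apt", "0.7.25", "amd64"),
        ("apt", "0.7.25", "i386"),
        ("dpkg", "1.15.5", "amd64"),
    ],
)


def find_packages(conn, name, architecture):
    """Look up packages by name and architecture using bound parameters."""
    cur = conn.execute(
        "SELECT name, version, architecture FROM packages "
        "WHERE name = ? AND architecture = ?",
        (name, architecture),
    )
    return cur.fetchall()


rows = find_packages(conn, "apt", "amd64")
```

Binding the values with "?" placeholders avoids the quoting problems that
building the statement with '%s' substitution would invite.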

But as I have already told some people, I believe that the interface
comes first; how to store the data can be decided afterwards.

[1] http://git.freesmartphone.org/?p=libpersistence.git;a=summary

--
Julian Andres Klode  - Debian Developer, Ubuntu Member

See http://wiki.debian.org/JulianAndresKlode and http://jak-linux.org/.




Re: RFC interaction with external dependency solver: Debian-CUDF

by Eugene V. Lyubimkin

Stefano Zacchiroli wrote:
> On Tue, Dec 08, 2009 at 08:33:49PM +0200, Eugene V. Lyubimkin wrote:
>> Cupt uniquely identifies the version entry by the pair (package_name,
>> version). Hence, I disagree with adding some additional 'id's or something to
>> the spec, it's ugly and not necessary IMO.
>
> So, how does cupt currently handle locally installed packages which
> might have been rebuilt by users (which forgot to change the version
> number)?
If they are binary-equivalent, they are merged. Otherwise the non-local
packages are discarded with a warning.
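
The rule above could be sketched roughly as follows. This is not Cupt's
actual code: the function add_version, the cache layout, and the use of a
content hash as a stand-in for "binary-equivalent" are all assumptions
made for illustration.

```python
import logging

log = logging.getLogger("cache-sketch")


def add_version(cache, name, version, content_hash, is_local):
    """Insert an entry keyed by (name, version), resolving collisions.

    content_hash is a stand-in for binary equivalence (an assumption).
    Returns a short string describing what happened.
    """
    key = (name, version)
    existing = cache.get(key)
    if existing is None:
        cache[key] = {"hash": content_hash, "local": is_local}
        return "added"
    if existing["hash"] == content_hash:
        # Same contents: merge the two entries into one.
        existing["local"] = existing["local"] or is_local
        return "merged"
    if is_local and not existing["local"]:
        # Contents differ: the local entry replaces the non-local one.
        log.warning("discarding non-local %s %s", name, version)
        cache[key] = {"hash": content_hash, "local": True}
        return "replaced-non-local"
    if existing["local"] and not is_local:
        # Contents differ: the incoming non-local entry is dropped.
        log.warning("discarding non-local %s %s", name, version)
        return "dropped-non-local"
    # Not covered by the description above; this sketch keeps the first.
    return "kept-first"


cache = {}
r1 = add_version(cache, "foo", "1.0", "aaa", is_local=False)
r2 = add_version(cache, "foo", "1.0", "bbb", is_local=True)
r3 = add_version(cache, "bar", "2.0", "ccc", is_local=False)
r4 = add_version(cache, "bar", "2.0", "ccc", is_local=True)
```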

--
Eugene V. Lyubimkin aka JackYF, JID: jackyf.devel(maildog)gmail.com
C++/Perl developer, Debian Developer



Re: RFC interaction with external dependency solver: Debian-CUDF

by Eugene V. Lyubimkin

Julian Andres Klode wrote:

> On Tuesday, 08.12.2009, at 20:33 +0200, Eugene V. Lyubimkin wrote:
>
>>>   * Julian was waiting for me on a format proposal: here we go! :-)
>>>
>>> - Solution description: the above returns solution in "here is the
>>>   new package status" style. Julian was instead proposing something like
>>>   "here is a diff from the previous package status". Mine looks cleaner
>>>   to me, but obviously more bloated.
>> I disagree. If we upgrade something like 2/3 of the system or more, the
>> diff approach is even bigger, not to mention that it is more complicated
>> without any practical gain IMO.
> We have to tell dpkg "remove this package" or "install this package", not
> "bring us to the following state". Thus, I think that receiving those
> actions from the solver makes things easier, not more complicated.
No. Consider the case where some package is in the 'unpacked' state. You
would have to pass the states of all installed packages before scheduling
actions anyway. And the external resolver has nothing to do with dpkg.

--
Eugene V. Lyubimkin aka JackYF, JID: jackyf.devel(maildog)gmail.com
C++/Perl developer, Debian Developer


