Release testing and the relationship between 'bzr selftest' and plugins

by Jelmer Vernooij

TL;DR: Nothing proves that plugins are compatible with a particular
   version of bzr like running the full testsuite, with all plugins
   installed. I think we should run the testsuite often - with all
   plugins installed - and make sure it passes.

We have recently had a few issues with Windows and Mac packages
bundling plugins that were incompatible. There have been a
fair number of similar issues with packaged (individual) plugins in
Debian/Ubuntu in the past. I'm wondering what we can do to reduce
this skew, and its impact.

The API versioning infrastructure doesn't work
----------------------------------------------

In the past, we've tried to give plugins the ability to declare
what bzr versions they support. The problem with this is that
it only uses a single tuple to represent the entire public bzrlib API,
of which plugins usually only use a tiny amount. We've tweaked the
way this works several times, and it doesn't get in the way too much
at the moment, but it also doesn't really help in detecting
incompatibilities.

If we change the version tuple for every change that could
potentially break backwards compatibility somewhere,
plugin authors are constantly updating the list of bzr versions
their plugin supports. If we only update it for major changes (what
are major changes?) we risk breaking plugins without noticing.

We can use separate version tuples for different bzr subsystems,
but that is a pain, both to keep up to date in bzrlib and for plugin
authors to check. http://pad.lv/742192

Similarly, plugin authors can't predict whether a future API change is
going to affect their plugin. It's impossible to say which future
bzrlib version is going to break your plugin. http://pad.lv/704238

As somebody (I think poolie?) said earlier, we should probably limit API
version checking in plugins to:

 * If applicable, is the bzrlib version known to be too old
 * If applicable, is the bzrlib version known to be too new
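
A minimal sketch of such a two-sided check, assuming the running version
is available as a tuple (for example bzrlib.version_info). The function
name and the bounds in the usage lines are hypothetical, not part of
bzrlib:

```python
def format_version(version):
    """Render a version tuple such as (2, 5, 0) as '2.5.0'."""
    return ".".join(str(part) for part in version)


def check_bzrlib_version(current, minimum=None, too_new=None):
    """Return None if the version is acceptable, else a reason string.

    current -- version tuple of the running bzrlib; only the first three
               fields (major, minor, micro) are compared
    minimum -- oldest version known to work, or None if not applicable
    too_new -- first version known to break the plugin, or None
    """
    current = tuple(current[:3])
    if minimum is not None and current < tuple(minimum):
        return "bzrlib %s is known to be too old (need >= %s)" % (
            format_version(current), format_version(minimum))
    if too_new is not None and current >= tuple(too_new):
        return "bzrlib %s is known to be too new (need < %s)" % (
            format_version(current), format_version(too_new))
    # Anything in between is assumed compatible until proven otherwise.
    return None


# Hypothetical usage: a plugin that needs at least 2.4.0 and has no
# known upper bound.
print(check_bzrlib_version((2, 5, 0, "final", 0), minimum=(2, 4, 0)))
print(check_bzrlib_version((2, 3, 1, "final", 0), minimum=(2, 4, 0)))
```

The point of the design is that a plugin only records breakage it
actually knows about, rather than guessing at future incompatibility.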

Plugins tightly coupled to bzr core
-----------------------------------

Some plugins have a fairly tight relationship with Bazaar because they
tie into it in so many places. It isn't always possible (or possible at
reasonable cost) to change the bzrlib core in a way that doesn't break
these plugins. This is especially true of the foreign branch plugins
(bzr-svn, bzr-git, bzr-hg).

It would also be nice to be able to blacklist certain versions
of plugins that we know we're breaking, when we make changes. This is
bug http://pad.lv/742196
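
One way such a blacklist could look, as a rough sketch only. The table,
the function names and the version numbers are all made up for
illustration; nothing here is an existing bzrlib API:

```python
# Hypothetical table shipped with bzr core: plugin name -> versions that
# are known to break with this bzr release. The numbers are invented.
KNOWN_BROKEN_PLUGINS = {
    "svn": {(1, 1, 0), (1, 1, 1)},
    "git": {(0, 6, 5)},
}


def is_blacklisted(name, version, known_broken=KNOWN_BROKEN_PLUGINS):
    """True if this exact plugin version is known to be broken."""
    return tuple(version[:3]) in known_broken.get(name, set())


def split_plugins(installed, known_broken=KNOWN_BROKEN_PLUGINS):
    """Split {name: version} into (loadable, skipped) lists of names."""
    loadable, skipped = [], []
    for name, version in sorted(installed.items()):
        if is_blacklisted(name, version, known_broken):
            skipped.append(name)
        else:
            loadable.append(name)
    return loadable, skipped


# Hypothetical set of installed plugins.
installed = {"svn": (1, 1, 1), "git": (0, 6, 6), "grep": (0, 4, 0)}
print(split_plugins(installed))
```

Skipped plugins could then be reported to the user instead of failing
with a traceback at load time.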

As the maintainer of three foreign branch plugins, I run their
testsuites regularly, and usually notice when there is an incompatible
change in bzrlib.  I think I've done a reasonable job of keeping
versions available that are compatible with all releases.

However, I do a pretty poor job of keeping everybody else informed
about these kinds of fixes, and when a new version should be shipped.
I'll try to be more vocal about new releases, and send more
announcements to the bazaar-announce list.

"bzr selftest" doesn't pass with a standard set of plugins installed
--------------------------------------------------------------------

At the moment, it is not feasible to run "bzr selftest" on a system
with some of the standard plugins installed, without getting lots of
test failures.

Plugins plug into bzr and register new formats, commands and hooks.
This affects other parts of Bazaar and can break core bzr tests.
Sometimes this is with good reason - tests
that verify the formatting of option descriptions break when plugins
register new commands with invalid descriptions. And sometimes it is
unintentional - a plugin that hooks into status and tweaks its output
can break the 'bzr status' blackbox tests.

I really think we should fix this.

No regular testing of plugins against bzr.dev
---------------------------------------------

"bzr selftest" gets run very often for "bzr" itself with just the bundled
plugins (launchpad, changelog_merge, ...) - on PQM, various platforms
in babune, on developer machines, etc.

However, we infrequently run the testsuites of plugins against
bzr.dev (usually only when we are actively changing the plugins), and
even less frequently we run the tests outside of
bzrlib.plugins.PLUGINNAME.tests that are relevant for the plugin.

Once we fix the previous issue, I'm sure more developers will start
running more of the tests. Perhaps it would also be possible to
have a babune slave run the tests for all plugin trunks against
bzr.dev?

No 'bzr selftest' run with the bzr and plugins shipped in an installer
---------------------------------------------------------------------

This is also related to the previous two points. At the
moment, we don't run the full testsuite, including plugins, for the
combination of bzr and plugins that is going to be shipped in an
installer.

Once we have a working "bzr selftest", this should be easy to do. And
we're less likely to find issues at install time if the full testsuite
is already being run regularly.  Of course, it will slow down the
release process somewhat, having to wait for the full testsuite for
bzr core and all plugins to pass and all.

Thoughts?

Cheers,

Jelmer


Re: Release testing and the relationship between 'bzr selftest' and plugins

by Vincent Ladeuil-2

>>>>> Jelmer Vernooij <jelmer@...> writes:

<snip/>

    > The API versioning infrastructure doesn't work
    > ----------------------------------------------

Sad but true.

This means the efforts we put into maintaining compatibility with
plugins are partly wasted though.

I can think of several ways for plugins to address the issue:

1 - merge plugins into core, making them core plugins: we've talked
    about that long ago but nothing happened. The main benefit is that
    such plugins will be better maintained since a change in core will
    be noticed more quickly. This adds maintenance costs to bzr core,
    which are mitigated by less effort spent on compatibility in both
    bzrlib and the plugins.

2 - push plugin authors to create series targeted at bzr releases: avoid
    many maintenance issues :) This will also help installer
    builders/packagers.

    >  * If applicable, is the bzrlib version known to be too old
    >  * If applicable, is the bzrlib version known to be too new

Fair enough, but it's up to the plugin authors to do that.

    > Plugins tightly coupled to bzr core
    > -----------------------------------

(1) applies there (as well as 2 to a lesser degree), but again, it's up
to the plugin authors.

    > It would be also be nice to be able to blacklist certain versions
    > of plugins that we know we're breaking, when we make changes. This
    > is bug http://pad.lv/742196

This sounds like something we can easily add support for and will be
valuable for users.

    > As the maintainer of three foreign branch plugins, I run their
    > testsuites regularly, and usually notice when there is an
    > incompatible change in bzrlib.  I think I've done a reasonable job
    > of keeping versions available that are compatible with all
    > releases.

Thanks for that !

A related issue there is the test IDs name space, some tests can be
inherited by plugins so that 'bzr selftest -s bp.<plugin>' will include
them, some don't.

There are ways to make our tests easier to be reused by plugins but
we're not there yet:

- make the test parametrization usable by plugins: either by having it
  rely on registries (like we do for formats but the plugin tests are
  still under the bzrlib.tests hierarchy) or by providing test-specific
  registries (I tried this approach for the config stuff but the results
  are far from perfect and the tests are still under the bzrlib.tests
  hierarchy),

- design focused test classes so that plugins can inherit from them only
  for the parts they care about (this requires some expertise about the
  test framework from the plugin authors and doesn't really scale well
  when tests are added/moved or new test classes introduced).

- have the plugin authors maintain a set of prefixes for 'selftest -s'
  to better define the plugin test coverage (requires good TDD expertise
  and hard to maintain too).

    > "bzr selftest" doesn't pass with a standard set of plugins installed
    > --------------------------------------------------------------------

This has been a known issue for years.

The root cause is a vicious circle: if tests start failing for plugins,
bzr devs tend to use 'BZR_PLUGIN_PATH=-site ./bzr selftest' (or, gosh,
even --no-plugins) which means additional failures are not seen...

Adding more plugin tests and keeping them passing is up to plugin
authors/maintainers...

<snip/>

    > No regular testing of plugins against bzr.dev
    > ---------------------------------------------

    > "bzr selftest" gets run very often for "bzr" itself with just the
    > bundled plugins (launchpad, changelog_merge, ...) - on PQM,
    > various platforms in babune, on developer machines, etc.

Yup, core plugins are... core plugins :)

<snip/>

    > Once we fix the previous issue, I'm sure more developers will
    > start running more of the tests. Perhaps it would also be possible
    > to have a babune slave run the tests for all plugin trunks against
    > bzr.dev?

It's been on babune's TODO list for quite a long time but doesn't make sense
until we get back to a point where all core tests are passing.

That's another vicious circle: a CI system is valuable only when 100% of
the tests are passing. As soon as you start having even a single
spurious failure, the S/N ratio goes down and there is no point adding
more tests (or rather expect much value out of the CI system, adding
tests in itself can't be bad, can it ? ;).

One way to mitigate that would be to define and maintain different test
suites that we can mix and match differently to suit our needs:

- a critical one for pqm, no exception accepted,

- a less critical one for babune: excluding known spurious failures to
  at least get to a point where babune can be relied upon

    > No 'bzr selftest' run with the bzr and plugins shipped in an installer
    > ---------------------------------------------------------------------

- a post-install targeted test suite for installer builders/packagers

From there we could envision a job running a full test suite on babune
for a set of plugins and a job for each plugin also running the full
test suite with only this additional plugin.

This should provide a basic means to identify the faulty plugin when the
test suite fails for the whole set.
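
The logic of those two kinds of jobs can be sketched as follows. The
run_suite callable is a stand-in for "run the full test suite with
exactly these plugins enabled"; it and the plugin names are assumptions
for illustration, not babune's actual configuration:

```python
def find_faulty_plugins(plugins, run_suite):
    """Narrow a failing full-set run down to individual plugins.

    plugins   -- list of plugin names making up the full set
    run_suite -- callable taking a list of enabled plugin names and
                 returning True if the full test suite passes
    """
    if run_suite(list(plugins)):
        # The combined job passes, so there is nothing to track down.
        return []
    # The combined job failed: look at each single-plugin job.
    faulty = [name for name in plugins if not run_suite([name])]
    # An empty result here would mean the failure only shows up when
    # several plugins are loaded together, which this sketch does not
    # try to diagnose any further.
    return faulty


# Fake runner for demonstration: the suite fails whenever the
# hypothetical plugin "broken" is enabled.
def fake_run_suite(enabled):
    return "broken" not in enabled


print(find_faulty_plugins(["git", "broken", "svn"], fake_run_suite))
```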

    > Once we have a working "bzr selftest",

I'll go further based on past evidence: 'once' is not strong enough
here, 'bzr selftest' should *always* pass or we go straight into vicious
circles.

    > this should be easy to do.

Unfortunately it's not. Getting to the point where selftest passes *once*
is easy but we've always failed to keep it running without dedicated
efforts. Granted, this was always for a negligible issue each time, but
since they add up, we're always reaching a point where getting back on
track is harder than it should be.

    > And we're less likely to find issues at install time if the full
    > testsuite is already being run regularly.  Of course, it will slow
    > down the release process somewhat, having to wait for the full
    > testsuite for bzr core and all plugins to pass and all.

Release time is not the right time to run heavy testing, this is
precisely what CI and time-based releases are targeting: cutting a
release should be just:

- check that tests have been passing lately,
- check that no critical issues are pending,
- tidy up the news,
- cut the tarball.

I.e. only administrative stuff, no last-minute rush for landing, no bug
fixes, no source changes :) The rationale is that any change requires
testing (which takes time) *and* can fail which delays the release. This
goes against time-based releases and as such should be avoided as much
as possible (common sense should be applied for exceptions as usual).

I'd go as far as saying that if we need to change the release process it
should be by *removing* tasks, never adding new ones.

  Vincent


Re: Release testing and the relationship between 'bzr selftest' and plugins

by Jelmer Vernooij

Hi Vincent,

On 15/03/12 13:57, Vincent Ladeuil wrote:

>>>>>> Jelmer Vernooij <jelmer@...> writes:
> <snip/>
>
>     > The API versioning infrastructure doesn't work
>     > ----------------------------------------------
>
> Sad but true.
>
> This means the efforts we put into maintaining compatibility with
> plugins are partly wasted though.

> I can think of several ways for plugins to address the issue:
>
> 1 - merge plugins into core, making them core plugins: we've talked
>     about that long ago but nothing happened. The main benefit is that
>     such plugins will be better maintained since a change in core will
>     be noticed more quickly. This add maintenance costs to bzr core
>     which is mitigated by less efforts on compatibility in both bzrlib
>     and the plugins.
I think this is just one of the ways we can make sure that the plugin
tests get run often during development. But it's not the only way to
make sure changes to the core are tested against the plugins.

Shipping the plugins with core also has a few issues:

 * lp:bzr  (AFAIK) falls under the contributor agreement, and all code
(with some exceptions) is (C) Canonical. Most plugins have different
copyright holders.
 * shipping plugins implies a certain level of support for them
 * plugins can have dependencies - would we start shipping the svn, apr
and mercurial sourcecode with bzr?
 * some plugins have a different landing mechanism than bzr.dev;
requiring review, for example

For some plugins (e.g. bzr-grep?) this might be a good option though,
indeed.

> 2 - push plugin authors to create series targeted at bzr releases: avoid
>     many maintenance issues :) This will also help installer
>     builders/packagers.
For most plugins, this doesn't scale with the number of release series
and the size of the plugins. It isn't worth the effort to maintain
separate release series if it's trivial to be compatible with more
versions of bzr.

For plugins that are tightly coupled with particular bzr versions, like
the foreign branch plugins, this is an option. But it still wouldn't
have prevented the problems we had with the 2.5 installers. Changes
between beta 4 and beta 5 broke the foreign branch plugins, and the
installers shipped with an outdated version of those plugins (from the
correct release series).

>     >  * If applicable, is the bzrlib version known to be too old
>     >  * If applicable, is the bzrlib version known to be too new
>
> Fair enough, but it's up to the plugin authors to do that.
True, but I think apart from working on bzr core we are involved in a
fair number of the plugins. We can also encourage changes to the way API
version checking is done, e.g. by deprecating bzrlib.api.require_any_api().

>     > Plugins tightly coupled to bzr core
>     > -----------------------------------
>
> (1) applies there (as well as 2 to a lesser degree), but again, it's up
> to the plugin authors.
>
>     > As the maintainer of three foreign branch plugins, I run their
>     > testsuites regularly, and usually notice when there is an
>     > incompatible change in bzrlib.  I think I've done a reasonable job
>     > of keeping versions available that are compatible with all
>     > releases.
>
> Thanks for that !
>
> A related issue there is the test IDs name space, some tests can be
> inherited by plugins so that 'bzr selftest -s bp.<plugin>' will include
> them, some don't.
>
> There are ways to make our tests easier to be reused by plugins but
> we're not there yet:
Is the test coverage of plugins really an issue? Speaking for the
foreign plugins, this doesn't really seem to be a problem.

bzrlib.tests.per_branch will run against all foreign branch
implementations too, or "bzr selftest
bzrlib.tests.per_branch.*SvnBranch" will run all svn branch
implementation related tests. This provides pretty good coverage.

>     > "bzr selftest" doesn't pass with a standard set of plugins installed
>     > --------------------------------------------------------------------
>
> This is a known issue for years.
>
> The root cause is a vicious circle: if tests start failing for plugins,
> bzr devs tends to use 'BZR_PLUGIN_PATH=-site ./bzr selftest' (or, gosh,
> even --no-plugins) which means additional failures are not seen...
>
> Adding more plugin tests and keeping them passing is up to plugin
> authors/maintainers...
If a lp:bzr author changes something that breaks a plugin, they should
be noticing and filing bugs. I agree plugin authors (or anybody) should
also be fixing problems in plugins when they come up, but that's a lot
easier if "bzr selftest" (without arguments) actually works.

>
> <snip/>
>
>     > No regular testing of plugins against bzr.dev
>     > ---------------------------------------------
>
>     > "bzr selftest" gets run very often for "bzr" itself with just the
>     > bundled plugins (launchpad, changelog_merge, ...) - on PQM,
>     > various platforms in babune, on developer machines, etc.
>
> Yup, core plugins are... core plugins :)
I don't think this is the magical answer. Bundling plugins is just one
of the ways in which we can encourage people to always run the plugin
tests too.

We don't have to bundle the plugins to make sure that various bits of
our infrastructure run selftest with the plugins. Neither does bundling
the plugins guarantee that developers won't start disabling some plugins
that slow down their test runs.

> <snip/>
>
>     > Once we fix the previous issue, I'm sure more developers will
>     > start running more of the tests. Perhaps it would also be possible
>     > to have a babune slave run the tests for all plugin trunks against
>     > bzr.dev?
>
> It's on babune's TODO list for quite a long time but doesn't make sense
> until we get back to a point where all core tests are passing.
>
> That's another vicious circle: a CI system is valuable only when 100% of
> the tests are passing. As soon as you start having even a single
> spurious failure, the S/N ratio goes down and there is no point adding
> more tests (or rather expect much value out of the CI system, adding
> tests in itself can't be bad, can it ? ;).
>
> One way to mitigate that would be to define and maintain different test
> suites that we can mix and match differently to suit our needs:
>
> - a critical one for pqm, no exception accepted,
>
> - a less critical one for babune: excluding known spurious failures to
>   at least get to a point where babune can be rely upon
Can't we perhaps just be more pro-active about spurious failures? I
think we should either fix or disable tests (and file bugs) with
spurious failures rather than keeping them enabled and stumbling over
them constantly.

Tests that flap aren't useful for either PQM or CI; I don't think we
should treat them differently.

>
>     > Once we have a working "bzr selftest",
>
> I'll go further based on past evidence: 'once' is not strong enough
> here, 'bzr selftest' should *always* pass or we go straight into vicious
> circles.
It doesn't at the moment, we have to get to that point first. Hence the
"once". :-)
>
>     > this should be easy to do.
>
> Unfortunately it's not. Getting to the point where selftest pass *once*
> is easy but we've always fail to keep it running without dedicated
> efforts. Granted, this was always for a negligible issue each time, but
> since they add up, we're always reaching a point where getting back on
> track is harder than it should.
If we stay on top of this, it *should* be easy to do. It's not like
there are hundreds of tests suddenly breaking. If we fix regressions in
the plugins as they are introduced, it should be easy to keep up. Once
we neglect the full selftest run, it becomes a lot harder to fix it again.

>     > And we're less likely to find issues at install time if the full
>     > testsuite is already being run regularly.  Of course, it will slow
>     > down the release process somewhat, having to wait for the full
>     > testsuite for bzr core and all plugins to pass and all.
>
> Release time is not the right time to run heavy testing, this is
> precisely what CI and time-based releases are targeting: cutting a
> release should be just:
>
> - check that tests have been passing lately,
> - check that no critical issues are pending,
> - tidy up the news,
> - cut the tarball.
>
> I.e. only administrative stuff, no last-minute rush for landing, no bug
> fixes, no source changes :) The rationale is that any change requires
> testing (which takes time) *and* can fail which delays the release. This
> goes against time-based releases and as such should be avoided as much
> as possible (common sense should be applied for exceptions as usual).
>
> I'd go as far as saying that if we need to change the release process it
> should be by *removing* tasks, never adding new ones.
I'm only saying there should be a final "bzr selftest" run to verify
everything is ok, not that this is a point to find and fix all
compatibility issues. If we have proper CI and run "bzr selftest" with
plugins regularly, then this will almost certainly pass. But a last
check like this will prevent brown paper bag releases of the installers,
as we had for 2.5.0. And that costs even more RM time.

Cheers,

Jelmer




Re: Release testing and the relationship between 'bzr selftest' and plugins

by Wichmann, Mats D



On Thu, Mar 15, 2012 at 6:57 AM, Vincent Ladeuil <vila+bzr@...> wrote:

> 1 - merge plugins into core, making them core plugins: we've talked
>     about that long ago but nothing happened. The main benefit is that
>     such plugins will be better maintained since a change in core will
>     be noticed more quickly. This add maintenance costs to bzr core
>     which is mitigated by less efforts on compatibility in both bzrlib
>     and the plugins.


I guess we could say this would mirror the kernel.org policy:  push your
code into core; if not, things will likely break.  If it's in core, things will
get tested.  This would also imply some quality requirements: to get it
accepted, code will have to be reviewed, tests have to be of high quality, etc.
My impression is a fair number of plugins are temporary itch-scratchers -
I've certainly run into several that did something, but then weren't really
needed by the author any more and sat around kind of rotting... it would
be nice if a process kind of pushed for the idea to be reviewed - is there
a better way to do this, does it belong in core, etc.

Sounds cool, but I'm not sure it completely applies.  core grows ever
bigger - is that a problem for packages?  Does it stretch the limited-resource
core team too much in dealing with an ever growing set of bits? etc.
In a sense, the whole idea of a plugin arch is that core doesn't have to
deal with everything, others can contribute to solve specific problems.


Re: Release testing and the relationship between 'bzr selftest' and plugins

by Gordon Tyler

Regarding running `bzr selftest` as part of the installer building process, I started `time ./bzr selftest --parallel=fork --no-plugins` this morning on the Macbook I use to build the OS X installers so I can see how long it would take. Building the installer itself already takes about 30 minutes for a completely clean build, although I can usually do an incremental build for minor plugin version changes. The primary culprit is compiling pyqt.

My suspicion is that it's going to take a very long time. The Macbook is kinda dinky in comparison to my Windows desktop machine and the desktop took a very long time to run the full suite.

Ciao,
Gordon


Re: Release testing and the relationship between 'bzr selftest' and plugins

by Vincent Ladeuil-2

>>>>> Jelmer Vernooij <jelmer@...> writes:

<snip/>

    >> 1 - merge plugins into core, making them core plugins: we've talked
    >> about that long ago but nothing happened. The main benefit is that
    >> such plugins will be better maintained since a change in core will
    >> be noticed more quickly. This add maintenance costs to bzr core
    >> which is mitigated by less efforts on compatibility in both bzrlib
    >> and the plugins.

    > I think this is just one of the ways we can make sure that the plugin
    > tests get run often during development. But it's not the only way to
    > make sure changes to the core are tested against the plugins.

Indeed, I wasn't implying anything else ;)

    > Shipping the plugins with core also has a few issues:

    >  * lp:bzr  (AFAIK) falls under the contributor agreement, and all code
    > (with some exceptions) is (C) Canonical. Most plugins have different
    > copyright holders.

Hmm, this one is hard.

    >  * shipping plugins implies a certain level of support for them

Yup, the balance may not be easy to find between time spent ensuring
backward compatibility in bzr-core, compatibility with various bzr
versions in the plugin itself and packaging the right versions as
opposed to a single code base.

    >  * plugins can have dependencies - would we start shipping the svn, apr
    > and mercurial sourcecode with bzr?
    >  * some plugins have a different landing mechanism than bzr.dev;
    > requiring review, for example

I'd say soft dependencies in bzr-core and build dependencies for
packages, or should those be recommendations instead?

    > For some plugins (e.g. bzr-grep?) this might be a good option though,
    > indeed.

Yup, I was thinking about bzr-webdav which is very stable but has been
broken with 2.5.

    >> 2 - push plugin authors to create series targeted at bzr releases: avoid
    >> many maintenance issues :) This will also help installer
    >> builders/packagers.
    > For most plugins, this doesn't scale with the number of release series
    > and the size of the plugins. It isn't worth the effort to maintain
    > separate release series if it's trivial to be compatible with more
    > versions of bzr.

Balance to be found again, some plugins may just want to tag specific
revisions for a given series if they don't evolve a lot between series.

    > For plugins that are tightly coupled with particular bzr versions,
    > like the foreign branch plugins, this is an option. But it still
    > wouldn't have prevented the problems we had with the 2.5
    > installers. Changes between beta 4 and beta 5 broke the foreign
    > branch plugins, and the installers shipped with an outdated
    > version of those plugins (from the correct release series).

Sure, but at least the packagers can subscribe to the tip of a given
branch and be done.

    >> >  * If applicable, is the bzrlib version known to be too old
    >> >  * If applicable, is the bzrlib version known to be too new
    >>
    >> Fair enough, but it's up to the plugin authors to do that.

    > True, but I think apart from working on bzr core we are involved
    > in a fair number of the plugins. We can also encourage changes to
    > the way API version checking is done, e.g. by deprecating
    > bzrlib.api.require_any_api().

Indeed.

    >> > Plugins tightly coupled to bzr core
    >> > -----------------------------------
    >>
    >> (1) applies there (as well as 2 to a lesser degree), but again, it's up
    >> to the plugin authors.
    >>
    >> > As the maintainer of three foreign branch plugins, I run their
    >> > testsuites regularly, and usually notice when there is an
    >> > incompatible change in bzrlib.  I think I've done a reasonable job
    >> > of keeping versions available that are compatible with all
    >> > releases.
    >>
    >> Thanks for that !
    >>
    >> A related issue there is the test IDs name space, some tests can be
    >> inherited by plugins so that 'bzr selftest -s bp.<plugin>' will include
    >> them, some don't.
    >>
    >> There are ways to make our tests easier to be reused by plugins but
    >> we're not there yet:

    > Is the test coverage of plugins really an issue? Speaking for the
    > foreign plugins, this doesn't really seem to be a problem.

Issue may be too strong a word; what I meant is that for a plugin author
there is a *big* difference between running:

  # runs only the plugin tests
  bzr selftest -s bp.<plugin>

and

  # run all tests including the plugin ones
  BZR_PLUGINS_AT=<plugin>@`pwd` bzr selftest

Ideally the former should be enough for the plugin author, but so far
that's true for only a handful of plugins.

    > bzrlib.tests.per_branch will run against all foreign branch
    > implementations too, or "bzr selftest
    > bzrlib.tests.per_branch.*SvnBranch" will run all svn branch
    > implementation related tests. This provides pretty good coverage.

Yup, that's what I was thinking with a list of known prefixes to run for
a given plugin.
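
The idea of a maintained prefix list can be sketched like this. The
table and the test IDs are hypothetical examples chosen to resemble
bzrlib test names; this is not how 'selftest -s' is implemented:

```python
# Hypothetical, plugin-maintained list of test ID prefixes that together
# define the coverage a plugin author cares about.
PLUGIN_TEST_PREFIXES = {
    "svn": [
        "bzrlib.plugins.svn.",
        "bzrlib.tests.per_branch.",
    ],
}


def select_tests(test_ids, prefixes):
    """Keep only the test IDs that start with one of the prefixes."""
    prefixes = tuple(prefixes)  # str.startswith accepts a tuple
    return [test_id for test_id in test_ids if test_id.startswith(prefixes)]


# Made-up test IDs for demonstration.
all_ids = [
    "bzrlib.plugins.svn.tests.test_branch.TestOpen.test_simple",
    "bzrlib.tests.per_branch.test_push.TestPush.test_push(SvnBranch)",
    "bzrlib.tests.test_config.TestStore.test_load",
]
for test_id in select_tests(all_ids, PLUGIN_TEST_PREFIXES["svn"]):
    print(test_id)
```

As noted earlier in the thread, the hard part is keeping such a list up
to date when tests are added or moved, not the filtering itself.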

    >> > "bzr selftest" doesn't pass with a standard set of plugins installed
    >> > --------------------------------------------------------------------
    >>
    >> This is a known issue for years.
    >>
    >> The root cause is a vicious circle: if tests start failing for plugins,
    >> bzr devs tends to use 'BZR_PLUGIN_PATH=-site ./bzr selftest' (or, gosh,
    >> even --no-plugins) which means additional failures are not seen...
    >>
    >> Adding more plugin tests and keeping them passing is up to plugin
    >> authors/maintainers...

    > If a lp:bzr author changes something that breaks a plugin, they
    > should be noticing and filing bugs.  I agree plugin authors (or
    > anybody) should also be fixing problems in plugins when they come
    > up, but that's a lot easier if "bzr selftest" (without arguments)
    > actually works.

Right. So, I did that for a long time and lost steam.

While working on a given fix, plugin test failures are disruptive...

Maybe I should say 'were disruptive with no easy opt-out mechanism'; I
think by the time BZR_DISABLE_PLUGINS was introduced I had already fallen
into the BZR_PLUGIN_PATH=-site'ly trigger-happy camp :-/

<snip/>

    >> Yup, core plugins are... core plugins :)

    > I don't think this is the magical answer. bundling plugins is just
    > one of the ways in which we can encourage people to always run the
    > plugin tests too.

Sure, there was a smiley there ;)

    > We don't have the bundle the plugins to make sure that various
    > bits of our infrastructure run selftest with the plugins. Neither
    > does bundling the plugins guarantee that developers won't start
    > disabling some plugins that slow down their test runs.

Hence we need a CI system but as mentioned, a CI system has high
requirements: failing tests should be dealt with asap before the S/N
ratio drops.

    >> <snip/>
    >>
    >> > Once we fix the previous issue, I'm sure more developers will
    >> > start running more of the tests. Perhaps it would also be possible
    >> > to have a babune slave run the tests for all plugin trunks against
    >> > bzr.dev?
    >>
    >> It's on babune's TODO list for quite a long time but doesn't make sense
    >> until we get back to a point where all core tests are passing.
    >>
    >> That's another vicious circle: a CI system is valuable only when 100% of
    >> the tests are passing. As soon as you start having even a single
    >> spurious failure, the S/N ratio goes down and there is no point adding
    >> more tests (or rather expect much value out of the CI system, adding
    >> tests in itself can't be bad, can it ? ;).
    >>
    >> One way to mitigate that would be to define and maintain different test
    >> suites that we can mix and match differently to suit our needs:
    >>
    >> - a critical one for pqm, no exception accepted,
    >>
    >> - a less critical one for babune: excluding known spurious failures to
    >> at least get to a point where babune can be relied upon

    > Can't we perhaps just be more pro-active about spurious failures?

As in tackling https://bugs.launchpad.net/bzr/+bugs?field.tag=babune and
https://bugs.launchpad.net/bzr/+bugs?field.tag=selftest you mean ?

    > I think we should either fix or disable tests (and file bugs) with
    > spurious failures rather than keeping them enabled and stumbling
    > over them constantly.

    > Tests that flap aren't useful for either PQM or CI, I don't think we
    > should treat them differently.

Right, maybe we have had enough of them to decorate them? I did exclude
tests on babune at one point, but this is not a good solution as I later
forgot about them, so we need some in-core tracking to get better
visibility.

Probably something along the lines of retrying once and warning if the
test fails twice, without letting selftest itself fail, and emitting a
final summary mentioning the number of such spurious failures.

    >>
    >> > Once we have a working "bzr selftest",
    >>
    >> I'll go further based on past evidence: 'once' is not strong enough
    >> here, 'bzr selftest' should *always* pass or we go straight into vicious
    >> circles.
    > It doesn't at the moment, we have to get to that point first. Hence the
    > "once". :-)

Hehe, yeah, what I meant is that we said 'once' several times in the
past, I think we should change *something* if we want to get out of this
habit ;)

    >>
    >> > this should be easy to do.
    >>
    >> Unfortunately it's not. Getting to the point where selftest passes *once*
    >> is easy but we've always failed to keep it running without dedicated
    >> efforts. Granted, this was always for a negligible issue each time, but
    >> since they add up, we're always reaching a point where getting back on
    >> track is harder than it should be.
    > If we stay on top of this, it *should* be easy to do. It's not like
    > there are hundreds of tests suddenly breaking. If we fix regressions in
    > the plugins as they are introduced, it should be easy to keep up. Once
    > we neglect the full selftest run, it becomes a lot harder to fix it again.

Exactly.

    >> > And we're less likely to find issues at install time if the full
    >> > testsuite is already being run regularly.  Of course, it will slow
    >> > down the release process somewhat, having to wait for the full
    >> > testsuite for bzr core and all plugins to pass and all.
    >>
    >> Release time is not the right time to run heavy testing, this is
    >> precisely what CI and time-based releases are targeting: cutting a
    >> release should be just:
    >>
    >> - check that tests have been passing lately,
    >> - check that no critical issues are pending,
    >> - tidy up the news,
    >> - cut the tarball.
    >>
    >> I.e. only administrative stuff, no last-minute rush for landing, no bug
    >> fixes, no source changes :) The rationale is that any change requires
    >> testing (which takes time) *and* can fail which delays the release. This
    >> goes against time-based releases and as such should be avoided as much
    >> as possible (common sense should be applied for exceptions as usual).
    >>
    >> I'd go as far as saying that if we need to change the release process it
    >> should be by *removing* tasks, never adding new ones.
    > I'm only saying there should be a final "bzr selftest" run to verify
    > everything is ok, not that this is a point to find and fix all
    > compatibility issues. If we have proper CI and run "bzr selftest" with
    > plugins regularly, then this will almost certainly pass. But a last
    > check like this will prevent brown paper bag releases of the installers,
    > as we had for 2.5.0. And that costs even more RM time.

Indeed.

So, if that wasn't clear, let me re-iterate: I'm in full agreement
about:

- spending more time on ensuring that the full test suite is always
  passing,

- tweaking the 'full test suite' definition so it matches what we really
  care about (this means tagging spurious failures in a way that ensures
  that they are addressed, adding whatever plugins we think are worth
  the maintenance effort and <other ideas>)

I think we agree far more than we disagree on most of the topics so
let's address the ones we agree on ;)

      Vincent


Re: Release testing and the relationship between 'bzr selftest' and plugins

by Vincent Ladeuil-2

>>>>> Wichmann, Mats D <mats.d.wichmann@...> writes:

    > On Thu, Mar 15, 2012 at 6:57 AM, Vincent Ladeuil <vila+bzr@...>wrote:
    >>
    >> 1 - merge plugins into core, making them core plugins: we've talked
    >> about that long ago but nothing happened. The main benefit is that
    >> such plugins will be better maintained since a change in core will
    >> be noticed more quickly. This add maintenance costs to bzr core
    >> which is mitigated by less efforts on compatibility in both bzrlib
    >> and the plugins.



    > I guess we could say this would mirror the kernel.org policy: push
    > your code into core; if not, things will likely break.  If it's in
    > core, things will get tested.
    > This would also imply some quality requirements: to get it
    > accepted, code will have to be reviewed, tests have to be of high
    > quality, etc.

Yup, this was mentioned as pre-requisites for inclusion when it was
discussed and still holds.

    > My impression is a fair number of plugins are temporary
    > itch-scratchers - I've certainly run into several that did
    > something, but then weren't really needed by the author any more
    > and sat around kind of rotting... it would be nice if a process
    > kind of pushed for the idea to be reviewed - is there a better way
    > to do this, does it belong in core, etc.


    > Sounds cool, but I'm not sure it completely applies.  core grows ever
    > bigger - is that a problem for packages?  Does it stretch the
    > limited-resource core team too much in dealing with an ever growing
    > set of bits? etc.  In a sense, the whole idea of a plugin arch is that
    > core doesn't have to deal with everything, others can contribute to
    > solve specific problems.

Yup, I'm not proposing to include everything and the core team resources
are indeed limited so we won't be able to maintain each and every plugin
anyway.

The idea is more to find which plugins add enough value to bzr itself
that our time is better spent fixing breakage as soon as it appears than
having to fix it anyway long afterwards, which is always more costly.

The issue with trying to maintain compatibility when the core changes is
that we are often too conservative because it's hard to know if a change
will really break some plugin and if maintaining compatibility is harder
than fixing the plugin itself.

       Vincent


Re: Release testing and the relationship between 'bzr selftest' and plugins

by Vincent Ladeuil-2

>>>>> Gordon Tyler <gordon@...> writes:

    > Regarding running `bzr selftest` as part of the installer building process,
    > I started `time ./bzr selftest --parallel=fork --no-plugins` this morning
    > on the Macbook I use to build the OS X installers so I can see how long it
    > would take. Building the installer itself already takes about 30 minutes
    > for a completely clean build, although I can usually do an incremental
    > build for minor plugin version changes. The primary culprit is compiling
    > pyqt.

    > My suspicion is that it's going to take a very long time. The Macbook is
    > kinda dinky in comparison to my Windows desktop machine and the desktop
    > took a very long time to run the full suite.

Tests are slower on windows to start with (unless this has changed
recently).

On OSX, using a ram disk is a serious boost, especially on laptops where
hard disks are often of the 5400rpm kind.

I think I had a recipe for that, let me dig...

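#!/bin/sh
# Create and mount an OS X ramdisk (usage: script [size-in-MB] [volume-name]).
# Size in MB
SIZE=${1:-512}
# hdiutil's ram:// URL expects a sector count, not a byte count
SIZE_IN_SECTORS=`expr ${SIZE} \* 2048`
# Volume name (mounted under /Volumes)
RAMDISK=${2:-ramdisk}
if [ ! -e "/Volumes/${RAMDISK}" ] ; then
    # attach the raw device without mounting it, then format it as HFS+
    RAW_DEV=`hdiutil attach -nomount ram://${SIZE_IN_SECTORS}`
    diskutil erasevolume HFS+ "${RAMDISK}" ${RAW_DEV}
fi
# to get rid of the ramdisk:
# hdiutil detach ${RAW_DEV}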

Then 'TMP=/Volumes/${RAMDISK} bzr selftest' should do.

Now, of course, --parallel=fork in itself consumes memory so you have to
find a balance to avoid swapping :)

     Vincent

P.S.: Re-reading the script makes me wonder why 2048 is used
      here... instead of the usual 1024 ;)
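
On the 2048 question: as far as I know, the number in hdiutil's ram:// URL
is a count of 512-byte sectors rather than bytes or kilobytes, so one
megabyte corresponds to 2048 sectors. A small sketch of the arithmetic (the
function name is made up for illustration):

```python
SECTOR_BYTES = 512  # assumed sector size used by the ram:// URL


def ramdisk_sectors(size_mb):
    """Convert a ramdisk size in megabytes to a count of 512-byte sectors."""
    return size_mb * 1024 * 1024 // SECTOR_BYTES


# One megabyte is 2048 sectors, which is the factor used in the script.
sectors_per_mb = ramdisk_sectors(1)
```

With the script's default of 512 MB this gives 1048576 sectors.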


Re: Release testing and the relationship between 'bzr selftest' and plugins

by Alexander Belchenko

Vincent Ladeuil writes:
>>>>>> Jelmer Vernooij <jelmer@...> writes:
>
> <snip/>
>
>     > The API versioning infrastructure doesn't work
>     > ----------------------------------------------
>
> Sad but true.

I think the problem is that bzrlib is too big and has a very big public
API: a lot of classes and methods. Multiply that by the possible changes
in the algorithms or main ideas behind those classes, e.g. whether some
code should use Inventory or not.

The main source of frustration is that changes in the ideas behind
classes are backward incompatible, and that breaks plugins.

Changes in the API itself are the lesser evil for me, but they are often
annoying, e.g. when some public method gets a new non-optional argument.

I'd like to share what I'm trying to do in qbzr and bzr-explorer (as the
prominent example of a client application built on top of qbzr) to
achieve better backward compatibility between different combinations of
their versions.  I'm not quite sure how much it would help here.

From my non-bzr experience, using version numbers does not work well when
you have too many moving parts. It is much better to have a set of
flags that indicate which features are supported or provided in the
current version. If a client needs some specific feature of the base
code, it should check the feature set; the client can then adjust its
behavior, and even support both the old and the new code path
(without/with the feature).

So in qbzr I made a special dictionary, FEATURES, which tells clients
about the availability of new features (compared to some arbitrarily
selected old version), and maybe even the version of a feature. The
feature name is the key in the FEATURES dictionary; the value
corresponding to this key can provide additional information.

I can't say how it works in the long run, because I've added only one
item to FEATURES: qignore. Using the qignore feature I was able to
implement proper backward-compatible support in bzr-explorer around
qignore.
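
To make the idea concrete, here is a minimal sketch of a feature check.
Only the FEATURES name and the qignore key come from the description
above; the value format (a version number) and the helper names are
assumptions for illustration, not qbzr's actual code.

```python
# Provider side (stands in for a plugin such as qbzr).
FEATURES = {
    "qignore": 1,  # assumed: the value is a feature version number
}


def has_feature(features, name, min_version=1):
    """Return True if `name` is advertised at `min_version` or later."""
    return features.get(name, 0) >= min_version


def choose_ignore_ui(features):
    """Client side: pick the new or the old code path."""
    if has_feature(features, "qignore"):
        return "qignore dialog"
    return "legacy ignore handling"
```

A client running against an older provider simply sees an empty (or
missing) entry and takes the old code path.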

I remember that in the past, when we were forced to update QBzr or
some other plugins because of small but annoying changes in bzrlib, I
wished for an intermediate proxy on top of the bzrlib API that could
adapt to minor API changes in bzrlib and provide a consistent interface
to the bzrlib internals that QBzr needs.

If you look at QBzr internals you can see that only a small number of
q-commands work with the bzrlib API directly, and all those commands are in
fact "browsers/viewers" (browse log, working tree/revision tree browser,
diff viewer, annotations viewer).

Most of the q-commands provided by QBzr are just front-ends that construct
a valid bzr command line and execute `bzr blah...` as a subprocess. This
approach has proved to be very successful - we cover most of the bzr CLI
required by bzr-explorer without any effort on our side to keep
compatibility with bzrlib internals.
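
As an illustration of the front-end approach, a sketch of building a
command line and running it as a subprocess. This is not QBzr's code; the
helper names are made up, and it assumes a `bzr` executable on the PATH.

```python
import subprocess


def build_bzr_command(subcommand, *args, **options):
    """Turn a subcommand, keyword options and arguments into an argv list."""
    cmd = ["bzr", subcommand]
    for name, value in sorted(options.items()):
        flag = "--" + name.replace("_", "-")
        if value is True:
            cmd.append(flag)                       # boolean switch
        elif value is not False and value is not None:
            cmd.append("%s=%s" % (flag, value))    # option with a value
    cmd.extend(args)
    return cmd


def run_bzr(subcommand, *args, **options):
    """Run bzr as a subprocess and return (exit code, stdout, stderr)."""
    cmd = build_bzr_command(subcommand, *args, **options)
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    out, err = proc.communicate()
    return proc.returncode, out, err
```

The front-end only depends on the command-line interface, so changes
inside bzrlib do not reach it as long as the CLI stays stable.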

So, maybe reducing the amount of public bzrlib API will help here? Or
providing some consistent set of methods as an adaptor between plugins'
needs and bzrlib internals? Of course plugin authors should understand
what they need from such an interface. Of course plugins like bzr-svn or
bzr-git won't need such an adaptor; they have to dive deep into bzrlib.

Maybe such an adaptor could be a core plugin? I'm not quite sure.
But such an adaptor should be very conservative, to be able to operate
with old versions of plugins. Maybe it could provide new bzrlib
features under new names, so plugins could benefit from newer
features if they are aware of them.

I know there are too many "if"s and "maybe"s, but this is what has been
sitting in my head for a long time.


Re: Release testing and the relationship between 'bzr selftest' and plugins

by Jelmer Vernooij

On 15/03/12 16:29, Alexander Belchenko wrote:

> Vincent Ladeuil writes:
>>>>>>> Jelmer Vernooij <jelmer@...> writes:
>>
>> <snip/>
>>
>>     > The API versioning infrastructure doesn't work
>>     > ----------------------------------------------
>>
>> Sad but true.
>
> I think the problem is in the fact that bzrlib is too big and has very
> big public API: a lot of classes and methods. Multiply it to the
> possible changes in algorithms or main ideas behind those classes.
> E.g. whether some code should use Inventory or it shouldn't.
>
> The main source of frustration is in the fact that changes in ideas
> behind classes are backward incompatible, so that breaks plugins.
>
> Changes in API itself are lesser evil for me, but it's often annoying,
> e.g. when some public method gets new non-optional argument.
<snip/>
> I remember sometimes in the past, when we had forced to update QBzr or
> some other plugins because of small but annoying changes in bzrlib, I
> wished to have an intermediate proxy on top of bzrlib API that could
> adapt to minor API changes in bzrlib and provide a consistent
> interface to bzrlib internals required by QBzr needs.
Why does it have to be a proxy? Is there any particular reason we
shouldn't just make the changes in bzrlib itself more robust by e.g. not
adding non-optional arguments?
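
A small sketch of what that means in practice. The function names and the
per_file_timestamps argument are purely hypothetical, not actual bzrlib
API: the point is only that a new argument with a default keeps existing
callers working.

```python
def export_tree_old(tree, dest):
    """Original signature that existing plugins already call."""
    return "exported %s to %s" % (tree, dest)


def export_tree_new(tree, dest, per_file_timestamps=False):
    """Same function after gaining a feature; the new argument is optional."""
    suffix = " (per-file timestamps)" if per_file_timestamps else ""
    return "exported %s to %s%s" % (tree, dest, suffix)
```

A plugin written against the old signature can call the new function
unchanged, while newer callers opt in by passing the keyword.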


> If you look at QBzr internals you could see that only small amount of
> q-commands works with bzrlib API directly, and all those commands are
> in fact "browsers/viewers" (browse log, working tree/revision tree
> browser, diff viewer, annotations viewer).
>
> Most of q-commands provided by QBzr are just front-ends to construct
> valid bzr command line and execute `bzr blah...` as subprocess. This
> approach has proved to be very successful - we cover most of the bzr
> CLI required by bzr-explorer without any effort from our side to keep
> compatibility with bzrlib internals.
Is the relatively low number of issues you've seen with qbzr due to the
fact that the command line interface is used, though? bzr-gtk uses
roughly the same functionality as qbzr but through the bzrlib API and it
also hasn't seen as many issues. I think for the majority of plugins
which use the well-known stable APIs we haven't seen many issues. It's
the somewhat lower level APIs that are more prone to changes that are
problematic.

>
> So, maybe reducing the amount of public bzrlib API will help here? Or
> provide some consistent set of methods as an adaptor between plugins
> needs and bzrlib internals? Of course plugins authors should
> understand what they need from such interface. Of course plugins like
> bzr-svn or bzr-git won't need such adaptor, they have to dive deep
> into bzrlib.
>
> Maybe such adaptor could be a core plugin? I'm not quite sure.
> But such adaptor should be very conservative to able to operate with
> old versions of plugins. Maybe such adaptor could provide new bzrlib
> features by new names, so plugins could get benefits of using newer
> features if they are aware of it.
Is there a particular reason this has to be a features dictionary,
rather than e.g. an object in a module the plugins could look for ? This
is how bzr-svn/bzr-git/bzr-hg do most of their compatibility checking.
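
A minimal sketch of that style of check, with stand-in modules instead of
real bzrlib ones. The attribute name ForeignRepository and the helper
names are chosen only for illustration; this is not copied from bzr-svn,
bzr-git or bzr-hg.

```python
import types


def has_api(module, name):
    """Return True if `module` exposes an attribute called `name`."""
    return hasattr(module, name)


# Two stand-ins for different library versions (hypothetical names).
old_lib = types.ModuleType("old_lib")
new_lib = types.ModuleType("new_lib")
new_lib.ForeignRepository = type("ForeignRepository", (object,), {})


def pick_base_class(lib):
    """Use the newer base class when present, otherwise fall back."""
    if has_api(lib, "ForeignRepository"):
        return lib.ForeignRepository
    return object
```

The check looks at what the library actually provides, so no separate
table of version numbers or feature flags has to be kept up to date.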

Cheers,

Jelmer




Re: Release testing and the relationship between 'bzr selftest' and plugins

by Jelmer Vernooij

On 15/03/12 16:11, Vincent Ladeuil wrote:

>>>>>> Jelmer Vernooij <jelmer@...> writes:
>
>
>     >  * plugins can have dependencies - would we start shipping the svn, apr
>     > and mercurial sourcecode with bzr?
>     >  * some plugins have a different landing mechanism than bzr.dev;
>     > requiring review, for example
>
> I'd say soft dependencies in bzr-core and build dependencies for
> packages, or should that be recommendations instead ?
We have to bundle the build dependencies, otherwise the plugins can't be
used and thus bundling them becomes pointless (since e.g. PQM will skip
all the tests).

If we require e.g. PQM to have the dependencies pre-installed then we
end up with another problem, which is ensuring that the right build
dependencies are installed.

>     >> 2 - push plugin authors to create series targeted at bzr releases: avoid
>     >> many maintenance issues :) This will also help installer
>     >> builders/packagers.
>     > For most plugins, this doesn't scale with the number of release series
>     > and the size of the plugins. It isn't worth the effort to maintain
>     > separate release series if it's trivial to be compatible with more
>     > versions of bzr.
>
> Balance to be found again, some plugins may just want to tag specific
> revisions for a given series if they don't evolve a lot between series.
>
>     > For plugins that are tightly coupled with particular bzr versions,
>     > like the foreign branch plugins, this is an option. But it still
>     > wouldn't have prevented the problems we had with the 2.5
>     > installers. Changes between beta 4 and beta 5 broke the foreign
>     > branch plugins, and the installers shipped with an outdated
>     > version of those plugins (from the correct release series).
>
> Sure, but at least the packagers can subscribe to the tip of a given
> branch and be done.
In practice neither of these seem to happen though.


>
>
>     > We don't have to bundle the plugins to make sure that various
>     > bits of our infrastructure run selftest with the plugins. Neither
>     > does bundling the plugins guarantee that developers won't start
>     > disabling some plugins that slow down their test runs.
>
> Hence we need a CI system but as mentioned, a CI system has high
> requirements: failing tests should be dealt with asap before the S/N
> ratio drops.
+1

>
>     >> <snip/>
>     >>
>     >> > Once we fix the previous issue, I'm sure more developers will
>     >> > start running more of the tests. Perhaps it would also be possible
>     >> > to have a babune slave run the tests for all plugin trunks against
>     >> > bzr.dev?
>     >>
>     >> It's on babune's TODO list for quite a long time but doesn't make sense
>     >> until we get back to a point where all core tests are passing.
>     >>
>     >> That's another vicious circle: a CI system is valuable only when 100% of
>     >> the tests are passing. As soon as you start having even a single
>     >> spurious failure, the S/N ratio goes down and there is no point adding
>     >> more tests (or rather expect much value out of the CI system, adding
>     >> tests in itself can't be bad, can it ? ;).
>     >>
>     >> One way to mitigate that would be to define and maintain different test
>     >> suites that we can mix and match differently to suit our needs:
>     >>
>     >> - a critical one for pqm, no exception accepted,
>     >>
>     >> - a less critical one for babune: excluding known spurious failures to
>     >> at least get to a point where babune can be relied upon
>
>     > Can't we perhaps just be more pro-active about spurious failures?
>
> As in tackling https://bugs.launchpad.net/bzr/+bugs?field.tag=babune and
> https://bugs.launchpad.net/bzr/+bugs?field.tag=selftest you mean ?
Hah, thanks! Didn't realize you had filed those. We should also address
the tests by fixing them or disabling them (and opening bugs).

>     > I think we should either fix or disable tests (and file bugs) with
>     > spurious failures rather than keeping them enabled and stumbling
>     > over them constantly.
>
>     > Tests that flap aren't useful for either PQM or CI, I don't think we
>     > should treat them differently.
>
> Right, we had enough of them to decorate them may be ? I did exclude
> tests on babune at one point but this is not a good solution as I forgot
> about them at one point so we need some in-core tracking to get a better
> visibility.
>
> Probably something along the lines of re-trying once and warns if it
> fail twice but don't let selftest itself fail and emit a final summary
> mentioning the number of such spurious failures.
Doing this just for tests that are known to be spurious you mean, or for
all tests ? I wouldn't be in favor of the latter.

>     >> > And we're less likely to find issues at install time if the full
>     >> > testsuite is already being run regularly.  Of course, it will slow
>     >> > down the release process somewhat, having to wait for the full
>     >> > testsuite for bzr core and all plugins to pass and all.
>     >>
>     >> Release time is not the right time to run heavy testing, this is
>     >> precisely what CI and time-based releases are targeting: cutting a
>     >> release should be just:
>     >>
>     >> - check that tests have been passing lately,
>     >> - check that no critical issues are pending,
>     >> - tidy up the news,
>     >> - cut the tarball.
>     >>
>     >> I.e. only administrative stuff, no last-minute rush for landing, no bug
>     >> fixes, no source changes :) The rationale is that any change requires
>     >> testing (which takes time) *and* can fail which delays the release. This
>     >> goes against time-based releases and as such should be avoided as much
>     >> as possible (common sense should be applied for exceptions as usual).
>     >>
>     >> I'd go as far as saying that if we need to change the release process it
>     >> should be by *removing* tasks, never adding new ones.
>     > I'm only saying there should be a final "bzr selftest" run to verify
>     > everything is ok, not that this is a point to find and fix all
>     > compatibility issues. If we have proper CI and run "bzr selftest" with
>     > plugins regularly, then this will almost certainly pass. But a last
>     > check like this will prevent brown paper bag releases of the installers,
>     > as we had for 2.5.0. And that costs even more RM time.
>
> Indeed.
>
> So, it that wasn't clear, let me re-iterate: I'm in full agreement
> about:
>
> - spending more time on ensuring that the full test suite is always
>   passing,
>
> - tweaking the 'full test suite' definition so it matches what we really
>   care about (this means tagging spurious failures in a way that ensure
>   that they are addressed, adding whatever plugins we think are worth
>   the maintenance effort and <other ideas>)
>
> I think we agree far more than we disagree on most of the topics so
> let's address the ones we agree on ;)
Works for me ! :-) So, let's:

 * Run "bzr selftest" and file bugs for issues we encounter
 * Fix said bugs
 * Run "bzr selftest" while we sleep
 * Run "bzr selftest" during lunch break
 * Run "bzr selftest" in the shower
 * ...
 * Can we run "bzr selftest" with a set of passing plugins installed on
babune? We can start with just one and add more as we verify they pass
the testsuite

Cheers,

Jelmer




Re: Release testing and the relationship between 'bzr selftest' and plugins

by Vincent Ladeuil-2

>>>>> Jelmer Vernooij <jelmer@...> writes:

<snip/>

    >> I'd say soft dependencies in bzr-core and build dependencies for
    >> packages, or should that be recommendations instead ?
    > We have to bundle the build dependencies, otherwise the plugins can't be
    > used and thus bundling them becomes pointless (since e.g. PQM will skip
    > all the tests).

    > If we require e.g. PQM to have the dependencies pre-installed then we
    > end up with another problem, which is ensuring that the right build
    > dependencies are installed.

Which we need to address one way or the other if we want to test the
plugins anyway.

Now, PQM will have a hard time as the dependencies need to be installed
and they may change from one bzr series to the other and PQM still needs
to run the tests for all supported series...

Which reminds me that we really want to run the tests for the supported
series instead of just trunk (outside of pqm that is).

    >> >> 2 - push plugin authors to create series targeted at bzr releases: avoid
    >> >> many maintenance issues :) This will also help installer
    >> >> builders/packagers.
    >> > For most plugins, this doesn't scale with the number of release series
    >> > and the size of the plugins. It isn't worth the effort to maintain
    >> > separate release series if it's trivial to be compatible with more
    >> > versions of bzr.
    >>
    >> Balance to be found again, some plugins may just want to tag specific
    >> revisions for a given series if they don't evolve a lot between series.
    >>
    >> > For plugins that are tightly coupled with particular bzr versions,
    >> > like the foreign branch plugins, this is an option. But it still
    >> > wouldn't have prevented the problems we had with the 2.5
    >> > installers. Changes between beta 4 and beta 5 broke the foreign
    >> > branch plugins, and the installers shipped with an outdated
    >> > version of those plugins (from the correct release series).
    >>
    >> Sure, but at least the packagers can subscribe to the tip of a given
    >> branch and be done.
    > In practice neither of these seem to happen though.

Huh ? Do you mean, the plugin authors don't participate enough or that
packagers don't want to use these tricks ?

qbzr maintains different series.

The osx installer (and the windows installer too AFAIK) has a script to
download from a branch (tip) or a tag or a revid, so it's just a matter
of providing either a branch (during betas) or a tag for stable
releases.
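
For illustration only, a sketch of how such a fetch step might choose
between a branch tip, a tag and a revid. This is not the installers'
actual script; the function name and the example branch location are
assumptions. It relies on bzr's `tag:` and `revid:` revision specs for
`bzr branch -r`.

```python
def fetch_command(branch_url, dest, tag=None, revid=None):
    """Build the bzr command that fetches a plugin at the requested point."""
    cmd = ["bzr", "branch"]
    if tag is not None:
        cmd += ["-r", "tag:" + tag]        # stable release: a named tag
    elif revid is not None:
        cmd += ["-r", "revid:" + revid]    # exact revision id
    # With neither tag nor revid, the tip of the branch is fetched.
    cmd += [branch_url, dest]
    return cmd
```

During betas the packager would pass only the series branch; for a stable
release the plugin author provides a tag.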

<snip/>

    >>
    >> As in tackling https://bugs.launchpad.net/bzr/+bugs?field.tag=babune and
    >> https://bugs.launchpad.net/bzr/+bugs?field.tag=selftest you mean ?

    > Hah, thanks! Didn't realize you had filed those. We should also
    > address the tests by fixing them or disabling them (and opening
    > bugs).

Right, 'and opening bugs' indeed; with a decorator like the one mentioned
below this could reduce the workload enough to turn it into a good
habit.

    >> > I think we should either fix or disable tests (and file bugs) with
    >> > spurious failures rather than keeping them enabled and stumbling
    >> > over them constantly.
    >>
    >> > Tests that flap aren't useful for either PQM or CI, I don't think we
    >> > should treat them differently.
    >>
    >> Right, we had enough of them to decorate them may be ? I did exclude
    >> tests on babune at one point but this is not a good solution as I forgot
    >> about them at one point so we need some in-core tracking to get a better
    >> visibility.
    >>
    >> Probably something along the lines of re-trying once and warns if it
    >> fail twice but don't let selftest itself fail and emit a final summary
    >> mentioning the number of such spurious failures.
    > Doing this just for tests that are known to be spurious you mean, or for
    > all tests ? I wouldn't be in favor of the latter.

Eeerk, only for the spurious yes :)

<snip/>

    >> I think we agree far more than we disagree on most of the topics so
    >> let's address the ones we agree on ;)
    > Works for me ! :-) So, let's:

    >  * Run "bzr selftest" and file bugs for issues we encounter
    >  * Fix said bugs
    >  * Run "bzr selftest" while we sleep
    >  * Run "bzr selftest" during lunch break
    >  * Run "bzr selftest" in the shower
    >  * ...
    >  * Can we run "bzr selftest" with a set of passing plugins installed on
    > babune? We can start with just one and add more as we verify they pass
    > the testsuite

Yeah, that's what I was working on the last time I worked on babune, but
I fell into a rabbit hole, encountering issues with subunit/testtools,
which are currently mounted from my home directory instead of being
properly deployed. The need is roughly the same: all slaves need some
deployment step before a job is run, and the hacks I had in place started
breaking when I upgraded to precise.

Once this is fixed, the freebsd will regain some color and we can
restart from there.

          Vincent


Re: Release testing and the relationship between 'bzr selftest' and plugins

by Vincent Ladeuil-2

Still on this subject but from a slightly different pov, you maintain a
set of recipes to build bzr and various plugins in the daily PPA.

Since they all run the tests can you give some feedback about:

- whether this is a good way to catch regressions,

- what kind of errors do you encounter and fix from them,

- how you proceed to filter out and extract the significant bits from
  the lp mails.

I confess I rarely drill down enough to understand why some build fails,
nor can I tell (from the mail subjects only) whether a failed build has
been fixed...

Thanks in advance for your feedback there,

       Vincent


Re: Release testing and the relationship between 'bzr selftest' and plugins

by Jelmer Vernooij

On 16/03/12 10:55, Vincent Ladeuil wrote:
> Still on this subject but from a slightly different pov, you maintain a
> set of recipes to build bzr and various plugins in the daily PPA.
>
> Since they all run the tests can you give some feedback about:
>
> - whether this is a good way to catch regressions,
I don't think it is. The plugins are only built when their tip changes,
so we usually don't notice breakage until somebody is also changing the
plugin (in which case they run the testsuite anyway).

The other problem with it is that you get a single plain text file for
each plugin, with all build and test output. That makes it a bit awkward
to navigate the failures and to figure out what they are.

We used to run with --one, but that no longer appears to be the case.

> - what kind of errors do you encounter and fix from them,
Most of the errors I encounter are due to:

 * Problems with the packaging (missing dependency, etc.)
 * Bugs related to the environment (the version of a dependency in
   precise has removed a particular attribute, something only works on
   a particular platform)
 * *Some* are due to breakage with bzr.dev

> - how you proceed to filter out and extract the significant bits from
>   the lp mails.
I don't :-) I regularly go to
http://people.canonical.com/~jelmer/recipe-status/bzr.html and fix
issues that show up there.

Cheers,

Jelmer



Re: Release testing and the relationship between 'bzr selftest' and plugins

by Gordon Tyler

My Mac took 38 minutes to run `bzr selftest --parallel=fork --no-plugins`.
There were 3 failures and 27 errors. I'll have a look later to see what's
up with these errors and I'll also do another run with plugins to see what
happens.

Ciao,
Gordon


Re: Release testing and the relationship between 'bzr selftest' and plugins

by Jelmer Vernooij

On 16/03/12 09:22, Vincent Ladeuil wrote:

>>>>>> Jelmer Vernooij <jelmer@...> writes:
> <snip/>
>
>     >> I'd say soft dependencies in bzr-core and build dependencies for
>     >> packages, or should that be recommendations instead ?
>     > We have to bundle the build dependencies, otherwise the plugins can't be
>     > used and thus bundling them becomes pointless (since e.g. PQM will skip
>     > all the tests).
>
>     > If we require e.g. PQM to have the dependencies pre-installed then we
>     > end up with another problem, which is ensuring that the right build
>     > dependencies are installed.
>
> Which we need to address one way or the other if we want to test the
> plugins anyway.
>
> Now, PQM will have a hard time as the dependencies need to be installed
> and they may change from one bzr series to the other and PQM still needs
> to run the tests for all supported series...
At that point, there isn't really that much of an advantage in bundled
plugins though. We have to have some mechanism of checking out external
code, building it and updating anyway. If we're doing that for libapr,
libsvn, subvertpy, dulwich, mercurial, qt, pyqt, etc, then we might as
well use the same mechanism to obtain a copy of the plugins.
>
> Which reminds me that we really want to run the tests for the supported
> series instead of just trunk (outside of pqm that is).
I thought pqm just ran on lucid?

>     >> >> 2 - push plugin authors to create series targeted at bzr releases: avoid
>     >> >> many maintenance issues :) This will also help installer
>     >> >> builders/packagers.
>     >> > For most plugins, this doesn't scale with the number of release series
>     >> > and the size of the plugins. It isn't worth the effort to maintain
>     >> > separate release series if it's trivial to be compatible with more
>     >> > versions of bzr.
>     >>
>     >> Balance to be found again, some plugins may just want to tag specific
>     >> revisions for a given series if they don't evolve a lot between series.
>     >>
>     >> > For plugins that are tightly coupled with particular bzr versions,
>     >> > like the foreign branch plugins, this is an option. But it still
>     >> > wouldn't have prevented the problems we had with the 2.5
>     >> > installers. Changes between beta 4 and beta 5 broke the foreign
>     >> > branch plugins, and the installers shipped with an outdated
>     >> > version of those plugins (from the correct release series).
>     >>
>     >> Sure, but at least the packagers can subscribe to the tip of a given
>     >> branch and be done.
>     > In practice neither of these seem to happen though.
>
> Huh ? Do you mean, the plugin authors don't participate enough or that
> packagers don't want to use these tricks ?
I'm not sure if it's "don't want". I think it's more of a time issue.
Both of these require significantly more time from plugin authors or
packagers.

E.g. bzr-rewrite, bzr-webdav or bzr-grep don't go and tag each time a
new bzr release is out, because they usually just work with each release.

Requiring packagers to subscribe to the trunk of all the branches and
notice incompatible changes costs time too.

> qbzr maintains different series.
It's one of the notable exceptions though (together with bzr-svn).
> The osx installer (and the windows installer too AFAIK) has a script to
> download from a branch (tip) or a tag or a revid, so it's just a matter
> of providing either a branch (during betas) or a tag for stable
> releases.
We still had out of date plugins in the installer, though.

>
>
>     >> I think we agree far more than we disagree on most of the topics so
>     >> let's address the ones we agree on ;)
>     > Works for me ! :-) So, let's:
>
>     >  * Run "bzr selftest" and file bugs for issues we encounter
>     >  * Fix said bugs
>     >  * Run "bzr selftest" while we sleep
>     >  * Run "bzr selftest" during lunch break
>     >  * Run "bzr selftest" in the shower
>     >  * ...
>     >  * Can we run "bzr selftest" with a set of passing plugins installed on
>     > babune? We can start with just one and add more as we verify they pass
>     > the testsuite
>
> Yeah, that's what I was working on the last time I worked on babune, but
> I fell into the rabbit hole after encountering issues with subunit/testtools,
> which are currently mounted from my home directory instead of being properly
> deployed. The need is roughly the same: all slaves need some deployment
> step before a job is run, and the hacks I had in place started breaking
> when I upgraded to precise.
>
> Once this is fixed, the freebsd will regain some color and we can
> restart from there.
Which subunit/testtools issues are those? Perhaps we can do something to
get those fixed.

Cheers,

Jelmer




Re: Release testing and the relationship between 'bzr selftest' and plugins

by Martin Packman

On 15/03/2012, Vincent Ladeuil <vila+bzr@...> wrote:
>
> On OSX, using a ram disk is a serious boost especially on laptops where
> hard disks are often of the 5400rpm species.
>
> I think I had a recipe for that, let me dig...
>
> #!/bin/sh
> # Size in MB
> SIZE=${1:-512}

I use a 32MB ramdisk on win32, which is enough for a full selftest run
now that it's cleaning up after itself well enough. I've got a change that
makes bt.test_setup use a tempdir rather than clobbering things in
place, though, and that wants much more space.

Is there some neat way of trying to create a big file with Python,
catching ENOSPC, and skipping the test? Nix does fancy sparse things if you
just seek forwards and write something, and actually writing megabytes
of junk would be daft. It also seems unreliable:

    >>> file("B:/big", "w").write("\0" * ((1<<20) * 48))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    IOError: [Errno 0] Error
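
One possible alternative to writing a big file and waiting for ENOSPC is to ask the filesystem how much room is left and skip up front. The following is only a sketch under assumptions: it uses `shutil.disk_usage`, which needs Python 3.3 or later (the session above is Python 2), and `require_free_space` is a hypothetical helper name, not something in bzrlib.

```python
import shutil
import tempfile
import unittest


def require_free_space(directory, needed_bytes):
    """Skip the calling test unless `directory` has `needed_bytes` free.

    Hypothetical helper for illustration; not part of bzrlib.
    Returns the number of free bytes when there is enough room.
    """
    free = shutil.disk_usage(directory).free
    if free < needed_bytes:
        raise unittest.SkipTest(
            "need %d bytes free in %s, only %d available"
            % (needed_bytes, directory, free))
    return free


class TestNeedsSpace(unittest.TestCase):

    def test_big_tempdir(self):
        # 48MB mirrors the size tried in the interactive session above.
        require_free_space(tempfile.gettempdir(), 48 * (1 << 20))
        # ... the space-hungry part of the test would go here ...
```

This avoids writing any junk at all, at the cost of a small race: free space can still change between the check and the test body, so it is a heuristic rather than a guarantee.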

Martin


Re: Release testing and the relationship between 'bzr selftest' and plugins

by Gordon Tyler

With the ramdisk, the same command took 31 minutes, which is a decent improvement.

BTW, Vincent, the script multiplies by 2048 because the unit is 512-byte sectors. :)
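
The arithmetic behind that factor, as a small sketch (the function name is made up for illustration): one megabyte is 1024 * 1024 bytes, so it holds 1024 * 1024 / 512 = 2048 sectors.

```python
SECTOR_BYTES = 512  # sector size stated in the message above


def megabytes_to_sectors(size_mb, sector_bytes=SECTOR_BYTES):
    """Convert a size in megabytes to a count of fixed-size sectors."""
    return size_mb * 1024 * 1024 // sector_bytes


print(megabytes_to_sectors(1))    # 2048
print(megabytes_to_sectors(512))  # 1048576, the script's default SIZE
```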

Ciao,
Gordon


Re: Release testing and the relationship between 'bzr selftest' and plugins

by Vincent Ladeuil-2

>>>>> Jelmer Vernooij <jelmer@...> writes:

    > On 16/03/12 10:55, Vincent Ladeuil wrote:
    >> Still on this subject but from a slightly different pov, you maintain a
    >> set of recipes to build bzr and various plugins in the daily PPA.
    >>
    >> Since they all run the tests can you give some feedback about:
    >>
    >> - whether this is a good way to catch regressions,
    > I don't think it is. The plugins are only built when their tip changes,
    > so we usually don't notice breakage until somebody is also changing the
    > plugin (in which case they run the testsuite anyway).

Ok.

    > The other problem with it is that you get a single plain text file for
    > each plugin, with all build and test output. That makes it a bit awkward
    > to navigate the failures and to figure out what they are.

Ha, too bad, I thought you had a magic bullet for that...

    >> - how you proceed to filter out and extract the significant bits from
    >> the lp mails.
    > I don't :-)

Hehe.

    > I regularly go to
    > http://people.canonical.com/~jelmer/recipe-status/bzr.html and
    > fix issues that show up there.

OMG ! The magic bullet !

Thanks, that was useful feedback,

        Vincent


Re: Release testing and the relationship between 'bzr selftest' and plugins

by Vincent Ladeuil-2

>>>>> Gordon Tyler <gordon@...> writes:

    > My Mac took 38 minutes to run `bzr selftest --parallel=fork
    > --no-plugins`.

Not bad, any better with a ramdisk ? You can also try to raise the
concurrency with export BZR_CONCURRENCY=8 (That's what the babune's osx
slave is doing even with a dual-core).

Also, be aware that --no-plugins also disables the core plugins. If that
leads to test failures, that's bad and should be fixed, though.
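
For anyone scripting these runs, here is a rough sketch of how the two variants (with and without plugins, with a raised concurrency) could be assembled. It only uses the command line and the BZR_CONCURRENCY variable quoted in this thread; `selftest_invocation` is a hypothetical helper, and nothing is executed until the result is handed to something like `subprocess.call(argv, env=env)`.

```python
import os


def selftest_invocation(concurrency=None, plugins=True, base_env=None):
    """Build the argv and environment for a parallel selftest run.

    Hypothetical helper for illustration; it does not run bzr itself.
    """
    argv = ["bzr", "selftest", "--parallel=fork"]
    if not plugins:
        # Note: this also disables the core plugins.
        argv.append("--no-plugins")
    # Copy so the caller's environment is never modified in place.
    env = dict(os.environ if base_env is None else base_env)
    if concurrency is not None:
        env["BZR_CONCURRENCY"] = str(concurrency)
    return argv, env


argv, env = selftest_invocation(concurrency=8, plugins=False)
print(" ".join(argv))
```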

    > There were 3 failures and 27 errors. I'll have a look later to see
    > what's up with these errors and I'll also do another run with
    > plugins to see what happens.

Thanks, waiting for more feedback ;)

        Vincent
