Test suite

by J.B.C.Engelen

Hi all,

Could someone have a look at the testsuite and reprogram it such that
(at least when perceptualdiff is used) the results that are marked 'new'
are marked 'fail' instead? This is much clearer, and cleans up the daily
checks at http://auriga.mine.nu/inkscape/.
Thank you very very much!

I love the testsuite, but I don't want to delve into the code and
change things myself.

Thanks a bunch,
  Johan

------------------------------------------------------------------------------
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
_______________________________________________
Inkscape-devel mailing list
Inkscape-devel@...
https://lists.sourceforge.net/lists/listinfo/inkscape-devel

Re: Test suite

by Jasper van de Gronde

NO! There is a definite and extremely important difference between new
and fail. Perceptualdiff only helps with images that really are almost
exactly the same. It does NOT help when:
  - A Bezier curve is slightly (but benignly) perturbed (which has
    happened), or other small changes occur that are insignificant to
    a human as far as the correctness of the render is concerned (for
    example changes in the resampling of bitmaps).
  - A complicated test case was judged as 'pass' incorrectly.
  - A 'new' result is actually a 'pass' (for example when there is no
    pass reference yet).
  - Something changes that is unrelated to the specific test. For
    example, there are a few filter tests that register as passes because
    Inkscape indeed implements them correctly but that still do not
    render correctly because Inkscape doesn't implement the
    color-interpolation properties. (It would be better to change the
    tests of course, but still, stuff like this happens.)

In short, perceptualdiff is nowhere near a true substitute for a human
judge. It is great for filtering out spurious results based on minute
numerical differences and/or differences in the binary encoding of the
pngs, but that's about it.

For this reason the system was set up specifically to allow for multiple
pass/fail references and flag anything it can't match as a new result.
In the past I could easily keep up with judging any new results because
they don't occur very frequently, but recently a lot of tests suddenly had
new results (probably because of changes in bitmap rendering) and since
I was/am way too busy I was unable to rejudge them myself. At the time I
sent a mail about this, but no one responded.
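
As a rough illustration of the matching described above, here is a minimal Python sketch. It is not the code in runtests.py: the function names, the use of SHA-256 digests for exact comparison, and the in-memory reference lists are all assumptions made for the example. A real setup would presumably also consult perceptualdiff before giving up on a match.

```python
import hashlib


def digest(png_bytes):
    """Return a hex digest used to test two renders for exact equality."""
    return hashlib.sha256(png_bytes).hexdigest()


def classify(output, pass_refs, fail_refs):
    """Classify one render as 'pass', 'fail' or 'new'.

    pass_refs and fail_refs are lists of PNG byte strings (hypothetical
    in-memory stand-ins for the reference files of a single test).
    """
    out = digest(output)
    if any(digest(ref) == out for ref in pass_refs):
        return "pass"
    if any(digest(ref) == out for ref in fail_refs):
        return "fail"
    # Matches no known reference, so a human has to judge it.
    return "new"


# Made-up byte strings standing in for rendered PNGs.
print(classify(b"render-A", [b"render-A"], [b"render-B"]))
print(classify(b"render-B", [b"render-A"], [b"render-B"]))
print(classify(b"render-C", [b"render-A"], [b"render-B"]))
```

The point of the third branch is that "new" is not a verdict at all, only a request for one.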

So: PLEASE just judge these new results once (probably won't take too
long) and then the results will be like normal again.

P.S. The system is set up so that if there are two (or more) results in
one day it only displays the last. That's why hardly any new results
show up in the history of the results (I'd run the tests, rejudge any
new results, if any, and then rerun the tests).
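
The "last result of the day wins" behaviour can be sketched as follows. This is only an illustration under assumptions: the (timestamp, status) tuple layout, the function name and the sample runs are invented for the example and are not taken from the actual result pages.

```python
from datetime import datetime


def latest_per_day(runs):
    """Keep only the last run of each calendar day.

    runs is a list of (timestamp, status) tuples; both the tuple layout
    and the function name are hypothetical.
    """
    latest = {}
    for stamp, status in sorted(runs, key=lambda run: run[0]):
        # Later runs on the same day overwrite earlier ones.
        latest[stamp.date()] = (stamp, status)
    return [latest[day] for day in sorted(latest)]


# Made-up sample: a "new" result that is rejudged and rerun the same day.
runs = [
    (datetime(2009, 10, 20, 9, 0), "new"),
    (datetime(2009, 10, 20, 11, 0), "pass"),
    (datetime(2009, 10, 21, 9, 0), "pass"),
]
for stamp, status in latest_per_day(runs):
    print(stamp.date(), status)
```

In this sample the morning "new" result never reaches the history, because the rerun after judging replaces it.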


Re: Test suite

by J.B.C.Engelen

** If there is someone out there who likes to make website stuff, please
help with our test suite! **

It's a usability issue and depends too much on one person. Right now,
we have many "new" results. A result that is obviously different from
the pass reference is marked "new". Is something only marked "fail" when
it is equal to a fail reference? Perhaps there is a program that gives a
measure of how equal images are, instead of simply "equal"/"not equal"?
It would be very nice if the system didn't need much user intervention,
or if that intervention were very easy (e.g. website based).
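
To make the idea of a graded measure concrete, here is a minimal sketch of one very simple candidate, the mean absolute pixel difference. It is not what perceptualdiff computes, and the function name and the flat list-of-grey-values input format are assumptions chosen to keep the example self-contained.

```python
def mean_abs_diff(pixels_a, pixels_b):
    """Mean absolute difference of two equally sized sequences of
    8-bit pixel values, scaled to the range 0.0 (identical) to 1.0."""
    if len(pixels_a) != len(pixels_b):
        raise ValueError("images must have the same number of pixels")
    if not pixels_a:
        return 0.0
    total = sum(abs(a - b) for a, b in zip(pixels_a, pixels_b))
    return total / (255 * len(pixels_a))


# Made-up 2x2 greyscale images, flattened to lists.
reference = [0, 0, 0, 0]
render = [255, 0, 0, 0]
print(mean_abs_diff(reference, render))
```

A score like this would still need a threshold, and choosing that threshold is itself a judgment call.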

> So: PLEASE just judge these new results once (probably won't take too
> long) and then the results will be like normal again.

What action does it take to judge? Should one commit a new fail or
pass-reference? The website's system is different so the new fail/pass
ref should be generated by the website too, otherwise the result will
still be flagged "new". This means that for example I cannot do it.

You know I am a big fan of the testsuite, but for some reason, there are
very few people using it. IMHO it would be much better if people would
add testfiles to the testsuite instead of adding them to the bugtracker,
but it doesn't happen. Perhaps clarifying the result listing can help.
I don't know...

Ciao,
  Johan

> -----Original Message-----
> From: Jasper van de Gronde [mailto:th.v.d.gronde@...]
> Sent: Tuesday, October 20, 2009 10:25
>
> NO! There is a definite and extremely important difference
> between new and fail. Perceptualdiff only helps with images
> that really are almost exactly the same. It does NOT help when:
>   - A Bezier curve is slightly (but benignly) perturbed (which has
>     happened) or other small but for a human insignificant (as far as
>     the correctness of the render is concerned) changes occur (for
>     example changes in resampling of bitmaps).
>   - A complicated test case was judged as 'pass' incorrectly.
>   - A 'new' result is actually a 'pass' (for example when there is no
>     pass reference yet).
>   - Something changes that is unrelated to the specific test. For
>     example, there are a few filter tests that register as passes
>     because Inkscape indeed implements them correctly but that still
>     do not render correctly because Inkscape doesn't implement the
>     color-interpolation properties. (It would be better to change the
>     tests of course, but still, stuff like this happens.)
>
> In short, perceptualdiff is nowhere near a true substitute
> for a human judge. It is great for filtering out spurious
> results based on minute numerical differences and/or
> differences in the binary encoding of the pngs, but that's about it.
>
> For this reason the system was set up specifically to allow
> for multiple pass/fail references and flag anything it can't
> match as a new result.
> In the past I could easily keep up with judging any new
> results because don't occur very frequently, but recently a
> lot of tests suddenly had new results (probably because of
> changes in bitmap rendering) and since I was/am way too busy
> I was unable to rejudge them myself. At the time I sent a
> mail about this, but no one responded.
>
> So: PLEASE just judge these new results once (probably won't take too
> long) and then the results will be like normal again.
>
> P.S. The system is set up so that if there are two (or more)
> results in one day it only displays the last, that's why
> hardly any new results show up in the history of the results
> (I'd run the tests, rejudge any new results, if any, and then
> rerun the tests).
 


Re: Test suite

by Jasper van de Gronde

J.B.C.Engelen@... wrote:
> ** If there is someone out there who likes to make website stuff, please
> help with our test suite ! **
> ...
> Perhaps there is a program that gives a
> measure of how equal images are, instead of simply "equal"/"not equal"?
> It would be very nice if the system didn't need much user intervention,
> or if that intervention would be very easy. (e.g. website based)

While I was doing it I only had to rejudge images occasionally. All you
have to do is look at the output image, move it to either the pass or
the fail references, and then commit. But yes, it would be great to
have a web interface for this, especially for users who are unfamiliar
with SVN.
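
The judging step described above could be scripted roughly like this. It is a sketch under assumptions: the "pass" and "fail" subdirectory names and the function name are hypothetical and do not reflect the actual layout of the test repository, and the commit itself is left to whoever runs it.

```python
import shutil
from pathlib import Path


def judge(output_png, test_dir, verdict):
    """Move a freshly rendered PNG into the pass or fail references of a test.

    The 'pass' and 'fail' subdirectory names are assumptions, not the
    actual layout of the test repository.
    """
    if verdict not in ("pass", "fail"):
        raise ValueError("verdict must be 'pass' or 'fail'")
    target_dir = Path(test_dir) / verdict
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / Path(output_png).name
    # The moved file becomes a reference; committing it is a separate step.
    shutil.move(str(output_png), str(target))
    return target
```

After looking at an output image, a call such as judge("out.png", "some-test", "fail") would file it as a fail reference. Both paths in that call are placeholders.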

> ...
> You know I am a big fan of the testsuite, but for some reason, there are
> very few people using it. IMHO it would be much better if people would
> add testfiles to the testsuite instead of adding them to the bugtracker,
> but it doesn't happen.

Indeed, I greatly appreciate the effort you've put into making new

> Perhaps clarifying the result listing can help.
> I don't know...

The Wiki is currently down so I can't check the exact URL (probably just
http://www.inkscape.org/wiki/index.php/TestingInkscape as linked to from
the testsuite result page), but there is quite a bit of documentation on
the Wiki on testing Inkscape (both unit tests and rendering tests). In
short:

Inkscape has unit tests that (mostly) test low-level functionality. They
are run using make check (or the Windows equivalent) and are implemented
using CxxTest.

In addition Inkscape has rendering tests to test high-level
functionality. These tests are run using runtests.py in the test
repository (in Inkscape's SVN) and consist of a test SVG and reference
PNGs. For each test there can be any number of fail and/or pass
references to which the test program tries to match the output. As
Inkscape's output doesn't change too often, this makes it relatively easy
to keep up manually with any changes that do occur, especially when
combined with perceptualdiff to filter out really trivial changes.

Also, instead of a pass reference another SVG file (called a patch file)
can be given which Inkscape should render in exactly the same way but
which IS rendered correctly. This can make comparisons slightly more
robust and enables a pass reference to be made before Inkscape actually
passes a test.
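
A minimal sketch of how a patch file might stand in for a pass reference follows. The function name and the byte-string inputs are assumptions, and the rendering of both SVG files is deliberately left out, so this shows only the comparison logic and not how runtests.py actually does it.

```python
def classify_with_patch(test_render, patch_render, fail_refs):
    """Judge a test by comparing it with the render of its patch file.

    All arguments are PNG byte strings (fail_refs is a list of them);
    rendering the two SVG files is outside the scope of this sketch.
    """
    if test_render == patch_render:
        # The test SVG renders exactly like the known-correct patch SVG.
        return "pass"
    if test_render in fail_refs:
        return "fail"
    # Differs from the patch render and from every known failure.
    return "new"


# Made-up byte strings standing in for rendered PNGs.
print(classify_with_patch(b"good", b"good", []))
print(classify_with_patch(b"odd", b"good", [b"bad"]))
```

Under these assumptions a mismatch only becomes "fail" once someone has filed that particular render as a fail reference, so a first-time mismatch is reported as "new".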



Re: Test suite

by J.B.C.Engelen

> -----Original Message-----
> From: Jasper van de Gronde [mailto:th.v.d.gronde@...]
> Sent: Wednesday, October 21, 2009 15:44
> To: Engelen, J.B.C. (Johan)
> Cc: inkscape-devel@...
> Subject: Re: Test suite
>
> Also, instead of a pass reference another SVG file (called a
> patch file) can be given which Inkscape should render in
> exactly the same way but which IS rendered correctly. This
> can make comparisons slightly more robust and enables a pass
> reference to be made before Inkscape actually passes a test.

Yeah, I did that. It works well. However, when the test failed, it was
marked "new" for me (which is why I proposed to change "new" to
"fail").

I think the testing itself is very well documented, but it needs to get
more exposure and it needs someone to work on the result output.
(cxxtests should be online as well for example!)

- Johan


Re: Test suite

by jf barraud

Hi,
Thanks, now I understand the meaning of "new" and what "fail references" are for!
I'm not involved in software development nor CS at large, and to tell you the truth, this was really unclear to me.
This should be explained somewhere online (maybe it is already and I missed it), or made clear from reading the results themselves.

So I support Johan's suggestion; what about replacing "new" by "fail", and "fail" by "fail (known)", with a little caption explaining that known failures are those matching a "fail reference"?

(and needless to say, I definitely agree the test suite is vital for Inkscape development and maintenance and should be better known so that more people contribute test cases)

Cheers.

Re: Test suite

by Jasper van de Gronde

jf barraud wrote:
> Hi,
> Thanks, now I understand the meaning of "new" and what "fail
> references" are for!
> I'm not involved in software development nor CS at large, and to
> tell you the truth, this was really unclear to me.
> This should be explained somewhere online (maybe it is already and I
> missed it), or made clear from reading the results themselves.

Perhaps a simple legend? (Can't think why I didn't include one in the
first place.)

> So I support Johan's suggestion; what about replacing "new" by "fail", and
> "fail" by "fail (known)", with a little caption explaining that known
> failures are those matching a "fail reference"?

As such I think it might be a bit misleading, but Johan does have some
ideas to perhaps reduce the number of new results, so who knows.

And if the term "new" is misleading, perhaps something else, like
"unknown" might be better?

BTW, the project I was feverishly working on over the past few months
(http://2009.igem.org/Team:Groningen) is finally (almost) over! So I
might actually be contributing to Inkscape (for example to the test
suite) again in the near future. (A lot of the graphics on our Wiki were
made with Inkscape btw.)
