time stamps of talos performance results & finding regressions

Since the beginning of the talos project, results gathered during a test run have been collected and sent off to the graph server (be it graphs.mozilla.org or graphs-stage.mozilla.org) and stored in a database keyed on a time stamp. When this system was first put in place it was mostly viewed in isolation, without much investigation as to how much data we were going to be collecting and how it was going to be used. Merely by default I ended up using talos testrun time as the time stamp for sets of results, as I could guarantee that it would be unique and always advancing.

As it turns out, time stamping with testrun time is less than ideal. Tests are only run once a build is completed and a talos machine becomes free for testing, meaning that testrun time ends up being 2-3 hours (or more) after build start time. Determining what build is being used at a given timestamp on graph server is non-trivial. You have to backtrack from testrun time to build start time to bonsai checkins. Adding a little wiggle room on each of these makes the regression range for a result end up being 2-3 hours. This means that there are a lot of check-ins that have to be investigated, and possibly backed out, simply because it is so hard to correlate tinderbox waterfall information with graph server information. To further complicate matters, there are a bunch of infrastructure workarounds in place trying to make the tinderbox waterfall page display build results and talos results in a way that lines up, even though it's not strictly accurate.

Bug#419487 (change buildbot & talos to use buildtime, not testrun time) changes talos so that the time stamp used for talos results for a given build would no longer be the testrun time, but the build start time as reported by the waterfall for that build. If you see a regression you could look at the specific build that caused it - instead of doing the mental gymnastics of adding hours to the regression range before and after the reported talos result.

So, what's the downside? This would be going forward only. For existing data we do not have anything in place to re-time stamp results that are already in a graph server database (and there is some question as to whether it is possible to do so at all). For now, the plan is to draw a line in the sand and say "For historic results, we need to continue requiring large regression ranges, but from now onwards, results can be pinpointed".

Overall, I think this is a great improvement on our infrastructure. It removes some daily complexity in how we debug regressions and determine what patches need to be backed out. It also brings our build and talos systems into sync with each other.

I want to get as much feedback as I can on this from people who frequently use talos results to find regressions. Please respond to Bug#419487 so that all the discussion ends up in the same place.

Thanks,
alice.

_______________________________________________
dev-performance mailing list
dev-performance@...
https://lists.mozilla.org/listinfo/dev-performance
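
To make the regression-range arithmetic from the message above concrete, here is a minimal Python sketch. It is not talos or graph server code: the function names are hypothetical, and the 2 and 3 hour delay bounds are just the rough figures quoted in the message, treated here as assumptions. It compares the check-in window you have to search when results are keyed on testrun time with the window you get when they are keyed on build start time.

```python
from datetime import datetime, timedelta


def checkin_window_from_testrun(last_good_testrun, first_bad_testrun,
                                min_delay=timedelta(hours=2),
                                max_delay=timedelta(hours=3)):
    """Window of build start times to search when only testrun times are known.

    Assumption: each test ran somewhere between min_delay and max_delay
    after its build started. The last good build may have started as early
    as (testrun - max_delay); the first bad build may have started as late
    as (testrun - min_delay).
    """
    earliest = last_good_testrun - max_delay
    latest = first_bad_testrun - min_delay
    return earliest, latest


def checkin_window_from_buildtime(last_good_build_start, first_bad_build_start):
    """Window to search when results are keyed on build start time.

    No backtracking is needed: the two time stamps are the window.
    """
    return last_good_build_start, first_bad_build_start


if __name__ == "__main__":
    # Hypothetical time stamps for two consecutive results on a graph.
    good = datetime(2008, 5, 14, 12, 0)
    bad = datetime(2008, 5, 14, 13, 0)
    start, end = checkin_window_from_testrun(good, bad)
    print("search check-ins between", start, "and", end)
```

With testrun time stamps, the uncertainty in the build-to-test delay is added on top of the gap between the two results; with build start time stamps, the window is only as wide as the gap between the two builds.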

Re: time stamps of talos performance results & finding regressions

On Wed, May 14, 2008 at 8:48 PM, alice nodelman <anodelman@...> wrote:
> Overall, I think this is a great improvement on our infrastructure. It
> removes some daily complexity in how we debug regressions and determine
> what patches need to be backed out. It also brings our build and talos
> systems into sync with each other.

I agree, and think that the gains are very much worth the discontinuity. Excellent!

Mike