Linux Tp numbers on mozilla-central varying too much to be useful

by L. David Baron

So, as today's sheriff, I started looking at performance numbers on
tinderbox.  I looked at the results of the pageload test on Linux.

We're running this test on 5 machines.  These machines each have a
number of spikes in their graphs, but the spikes don't seem
correlated with each other:
http://graphs.mozilla.org/graph.html#show=911694,395125,395135,395166,1431032

This makes it seem like these numbers aren't useful at all, since
it's hard to know which ones are trustworthy and which are not.  Do
we have any idea what's going on here?

I'm particularly worried about two things:

(1) The three original machines have been much less correlated with
each other since around the beginning of July.

(2) Machines have been spiking independently of each other since
around the 21st of August.
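
One way to put a number on "correlated" is to compute the pairwise Pearson correlation of the Tp series reported by each machine. The following is only a minimal sketch: the machine names and Tp values are made up for illustration and are not taken from graphs.mozilla.org, and it assumes the series have already been aligned so that each position refers to the same build.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


def pairwise_correlation(series_by_machine):
    """Map each pair of machine names to the correlation of their series."""
    names = sorted(series_by_machine)
    return {
        (a, b): pearson(series_by_machine[a], series_by_machine[b])
        for a, b in combinations(names, 2)
    }


# Hypothetical Tp values (ms), one entry per build, aligned across machines.
example = {
    "machine_a": [400, 402, 401, 450, 403, 402],
    "machine_b": [405, 406, 404, 407, 405, 455],
    "machine_c": [398, 400, 399, 448, 401, 400],
}

if __name__ == "__main__":
    for pair, r in pairwise_correlation(example).items():
        print(pair, round(r, 3))
```

In this made-up data, machine_a and machine_c spike on the same build and correlate strongly, while machine_b spikes on a different build and does not. Spikes that track each other across machines suggest a real code change; spikes that do not suggest machine-local noise.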

-David

--
L. David Baron                                 http://dbaron.org/
Mozilla Corporation                       http://www.mozilla.com/
_______________________________________________
dev-performance mailing list
dev-performance@...
https://lists.mozilla.org/listinfo/dev-performance

Re: Linux Tp numbers on mozilla-central varying too much to be useful

by alice nodelman

1) qm-plinux-trunk01/02/03 should be in agreement, but it looks to me like
qm-plinux-trunk02 is reporting somewhat high.  I've investigated this
before and haven't been able to find any reason for its high results -
but I'll take another look and see if I can figure it out.

2) I think that most of the spikes that we are seeing are associated
with throttling being flipped on/off - this shouldn't be an issue
anymore due to work done on the talos images to keep them correctly
configured.

3) We've obviously had some trouble maintaining the results of these
machines - there's a big zone between 7/1 and 8/19 where all the numbers
were constantly increasing.  This was unfortunate and took a long time
to get on anyone's radar.  I'm hoping that with more vigilant monitoring
of the numbers (and various Q4 goals surrounding auto-monitoring for
regressions) we won't end up with such large gaps in our knowledge.  I
think that we'll get better value by looking at the most current results
and ensuring that they make sense.
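
As an illustration of what simple automated monitoring of these numbers could look like, here is a minimal sketch that flags a result when it exceeds the median of the preceding few results by more than a set fraction. It is not how Talos or the graph server actually work; the function name, the window size, the 5% threshold, and the sample values are all assumptions chosen for the example.

```python
from statistics import median


def flag_spikes(values, window=5, threshold=0.05):
    """Return indices of values that exceed the median of the preceding
    `window` values by more than `threshold` (a fraction, e.g. 0.05 = 5%)."""
    flagged = []
    for i in range(window, len(values)):
        baseline = median(values[i - window:i])
        if baseline > 0 and (values[i] - baseline) / baseline > threshold:
            flagged.append(i)
    return flagged


# Hypothetical Tp values (ms) for one machine, oldest first.
tp = [400, 402, 401, 399, 403, 450, 402, 401, 400, 404]

if __name__ == "__main__":
    print(flag_spikes(tp))  # prints [5]
```

Using the median rather than the mean as the baseline keeps a single spike from inflating the baseline for the builds that follow it. A slow, steady increase like the one between 7/1 and 8/19 would not trip this check, since each result stays close to its recent neighbours; catching that would need a comparison against a longer-term baseline.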

alice.


Re: Linux Tp numbers on mozilla-central varying too much to be useful

by Ray Kiddy-2


Just to put a cap on this thread, so it doesn't look as if the issue was
just left hanging, I assume that this question led to the recent
decision to reboot the test machines after every build.

http://bugzilla.mozilla.org/show_bug.cgi?id=463020

cheers - ray

