|
View:
New views
10 Messages
—
Rating Filter:
Alert me
|
| < Prev | 1 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9 - 10 | Next > |
|
|
Re: [xpressive] Performance Tuning?Simonson, Lucanus J wrote:
> Personally I perform several runs of the same benchmark and then take > the minimum time as the time I report. This excludes all outliers > due to OS and such. If a car company reported the average (mean or > median) 0 to 60 mph for a given car when their test driver was trying > to see how fast it was everyone would think they were crazy. Why > should the fact that he pushed the gas too hard on some of the trials > and spun out and not hard enough on others count against the car? I > also typically don't have the luxury of benchmarking on an unloaded > machine, so I have to make sure to do things fairly, for instance, > instead of runing many trials of A then many trials of B and taking > the minimum for each I run many trials of A then B back to back and > take the minimum. That way A and B on average se e the same > environement over the course of the experiement. If I run A and B in > the same process I run A first then B and then B first and then A as > a separate run to ensure that the order doesn't impact their > performance. > > Usually benchmarking implies comparision, so the key concern is that > the results are fair. This makes a lot of sense. Regards, -- Joel de Guzman http://www.boostpro.com http://spirit.sf.net _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost |
|
|
Re: [xpressive] Performance Tuning?Giovanni Piero Deretta wrote:
> *scratches head* > > I must be missing something very obvious, but I do not see how your > and OvermindDL benchmarks can support this conclusion: as far as I can > tell, you two never compared the two compilers directly, what you are > seeing is that on GCC the speed up of qi_parse over atoi/strtol is > less than that of MSVC, but this tells nothing about the absolute > performance of the two compilers on this test (i.e. you never showed > any absolute times). > > For what we know the gcc atoi might just be faster than msvc one. And > in fact a quick google search brings these two pages: > > http://tinyurl.com/mqa5yl [msvc8 atoi performance is 58% of that of msvc6] > http://tinyurl.com/mzyw66 [thread containing a comparison of atoi > functions of different languages and compilers, in particular it seems > that MSVC atoi is really 2 times slower than gcc atoi] > > A slow atoi on MSVC would explain such a difference in the tests, > assuming that the ability of both compilers to optimize qi_parse is > about the same. > > Finally, I see from the error messages that OvermindDL is using > gcc.3.4, which is now fairly old, maybe he should try with a more > recent one. Also -march=native or whatever architecture he is using > might help. Here are my tests with absolute numbers on one machine: MSVC 9: atoi_test: 6.8413257003 [s] strtol_test: 6.5168116888 [s] spirit_int_test: 2.0843406163 [s] gcc-4.3.2: atoi_test: 6.6410000000 [s] strtol_test: 6.2970000000 [s] spirit_int_test: 1.8280000000 [s] I'm starting to get impressed with g++ optimizations. If you want to verify the results, you can try it out from the boost SVN trunk at BOOST_ROOT/libs/spirit/benchmarks/qi/int_parser.cpp Here are my floating point tests: MSVC 9: atof_test: 8.4475158037 [s] strtod_test: 8.9236700525 [s] spirit_double_test: 2.9671239036 [s] gcc-4.3.2: atof_test: 12.4380000000 [s] strtod_test: 12.5940000000 [s] spirit_double_test: 2.7030000000 [s] Now that's rockin! In any case, this goes to show that Spirit numeric parsers are way faster than the corresponding low-level C routines. It shows that you can write extremely tight generic C++ code that rivals, if not surpasses C. Regards, -- Joel de Guzman http://www.boostpro.com http://spirit.sf.net _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost |
|
|
Re: [xpressive] Performance Tuning?>>> I ran the Visual Studio build again, still got about the same as
>>> I got >>> in my last email, so yea, Visual Studio is a lot better on templates >>> then GCC is. I wonder why GCC performs so much worse... >> >> Indeed - an open issue. Perhaps I will play with some other >> options (march >> etc.) to see if there's something I've overlooked. Well, I have >> to say I'm >> gaining more respect for the MS compiler - I last used it a very >> long time >> ago, back then it was awful! >> > > *scratches head* > > I must be missing something very obvious, but I do not see how your > and OvermindDL benchmarks can support this conclusion: as far as I can > tell, you two never compared the two compilers directly, what you are > seeing is that on GCC the speed up of qi_parse over atoi/strtol is > less than that of MSVC, but this tells nothing about the absolute > performance of the two compilers on this test (i.e. you never showed > any absolute times). As you can probably guess I don't really like absolute times. It is simple enough to try and do a cross comparison in a similar manner to the way relative values are currently calculated - it will just need a bit of tweaking, I'll give that some thought. > For what we know the gcc atoi might just be faster than msvc one. And > in fact a quick google search brings these two pages: > > http://tinyurl.com/mqa5yl [msvc8 atoi performance is 58% of that > of msvc6] > http://tinyurl.com/mzyw66 [thread containing a comparison of atoi > functions of different languages and compilers, in particular it seems > that MSVC atoi is really 2 times slower than gcc atoi] > > A slow atoi on MSVC would explain such a difference in the tests, > assuming that the ability of both compilers to optimize qi_parse is > about the same. Ah, good point - mea culpa! This is where my (diminishing) ignorance of computers gets highlighted - I was under the impression that the standard library calls would be calling the same underlying things for both compilers - clearly I'm mistaken. Thanks for the comment. My code was not originally intended for benchmarking compilers / platforms - it's just getting that way! -ed ------------------------------------------------ "No more boom and bust." -- Dr. J. G. Brown, 1997 _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost |
|
|
Re: Boost.Plot? - Scalable Vector Graphics (was [expressive] performance tuning[ Warning to all - the following is a long, complex response -- don't
bother reading unless you are *really* interested in the pedantic nuances of timing stuff! It may cause your eyes to glaze over and loss of consciousness. Apologies also for clogging up inboxes with pngs - be thankful for the mailing sentry bouncing this thing out until it got short enough! It just didn't read well with hyperlinked pictures. ] On 30 Jul 2009, at 23:21, Simonson, Lucanus J wrote: > Edward Grace wrote: >>> Personally I perform several runs of the same benchmark and then >>> take the minimum time as the time I report. >> >> Be very cautious about that for a number of reasons. >> >> a) By reporting the minimum, without qualifying your statement, you >> are lying. For example if you say "My code takes 10.112 seconds to >> run." people are unlikely to ever observe that time - it is by >> construction atypical. > > I try to be explicit about explaining what metric I'm reporting. > If I'm reporting the minimum of ten trials I report it as the > minimum of ten trials. Re-reading my comments, it looks a bit adversarial - it's not intended to be! I'm not accusing you of falsehood or dodgy dealings! Apologies if that was the apparent tone! > Taking the minimum of ten trials and comparing it to the minimum > of twenty trials and not qaulifying it would be even more dishonest > because it isn't fair. I compare same number of trails, report > that it is the minimum and the number of trails. > >> b) If ever you do this (use the minimum or maximum) then you are >> unwittingly sampling an extreme value distribution: >> >> http://en.wikipedia.org/wiki/Extreme_value_distribution >> >> For example - say you 'know' that the measured times are normally >> distributed with mean (and median) zero and a variance of 10. If you >> scenario. Notice that the histogram of the observed minimum is *far* >> broader than the observed median. > > In this particular case of runtime the distribution is usually not > normal. Of course, hence the quotes around 'know'. My demonstration of the effect of taking the minimum is based on the normal distribution since everyone knows it and it's assumed to be 'nice'. If you repeat what I did with another distribution (e.g. uniform, or Pareto) both of which have a well defined minimum you will still see the same effect! In fact, you are likely to observe even more extreme behaviour! I did not discuss more realistic distributions since it would cause people's eyes to glaze over even faster than this will! > There is some theoritical fastest the program could go if nothing > like the OS or execution of other programs on the same machine > cause it to run slower. This is the number we actually want. We > can't measure it, but we observe that there is a "floor" under our > data and the more times we run the more clear it becomes where that > floor is. Sure. This is partially a philosophical point regarding what one is trying to measure. In your case if you are trying to determine the fastest speed that the code could possibly go (and you don't have a real-time OS) then the minimum observed time looks like a good idea. I don't object to the use of the minimum (or any other estimator) per se, at least not yet. What one wishes to measure should determine the choice of estimator. For example, I do not want to know the minimum possible time as that is not indicative of the typical time. I want to compare the typical execution times of two codes *with* the OS taken in to account since, in the real world, this is what a user will notice and therefore what matters to me. Paraphrasing "Hamlet", 'To median or not to median - that is not quite the question.' My point is far more subtle than I think you have appreciated. The crux is that whatever your estimator you need to understand how it's distributed if you want to infer anything from it. So, if you *are* going to use the minimum (which is not robust) then you need to be aware that you may be inadvertently bitten by extreme value theory - consequently you should consider the form of the dispersion - it sure as heck won't be normal! This may seem bizarre - these things often are! To give you a concrete example, consider the following code (this also contains PDFs of the plots): http://tinyurl.com/n6kfe7 Here, we crudely time a simple function using a given number of iterations in a fairly typical manner. We repeat this timing 4001 times and then report the minimum and median of these 4001 acquisitions. This constitutes 'a measurement'. Each experimental 'measurement' is carried out for a randomly chosen number of iterations of the main timing loop and is hopefully independent. This is to try and satisfy the first i in iid (independent and identically distributed). For those of you bash oriented I do: ./ejg-make-input-for-empirical-ev > inputs.dat rm test.dat; while read line; do echo $line; echo $line | ./ejg- empirical-ev >> test.dat; done < inputs.dat No attempt is made to 'sanitise' the computer, nor do I deliberately load it. The raw results are shown below, The most obvious observation is that as the number of iterations increases the times both appear to converge in an asymptotic-like manner towards some 'true' value. This is clearly the effect of some overhead (e.g. the time taken to call the clock) having progressively less impact on the overall time. Secondly, the minimum observed time is lower than the median (as expected). What you may find a little surprising is that the *variation* of the minimum observed time is far greater than the variation in the median observed time. This is the key to my warning! If we crudely account for the overhead of the clock (which of course will also be stochastic - but let's ignore that) and subtract it then this phenomenon becomes even more stark. The variation of the observed times is now far clearer. Also, notice that rogue-looking red point in the bottom left of the main plot - for small numbers of iterations the dispersion of the minimum is massive compared to the dispersion of the median. You will also notice that there's now little difference between the predominant value of the median and minimum in this case. Now consider the inset, the second obvious characteristic is that the times do not appear to fluctuate entirely randomly with iteration number - instead there is obvious structure - indeed a sudden jump in behaviour around n=60. This is likely to be a strong interference effect between the clock and the function that's being measured and other architecture dependent issues (e.g. cache) that I won't pretend to fully understand. It's also indicative of how to interpret the observed variation - it's not necessarily purely random, but random + pseudorandom. Like std::rand(), if you keep the same seed (number of iterations) you may well observe the same result. That does not however mean that your true variance is zero - it's just hidden! Looking at the zoom in the inset it's clear to see that the median time has a generally far lower dispersion than the minimum time (by around an order of magnitude). There are even situations where the minimum time is clearly greater than the median time - directly counter to your expectation. This is not universally the case, but the general behaviour is perfectly clear. This can be considered to be a direct result of extreme value theory. If you use the minimum (or similar quantile) to estimate something about a random variable the result may have a large dispersion (variance) compared to the median. Just because you haven't seen it (changed the seed in your random number generator) doesn't mean it's not there! The practical amongst you may well say "Hey - who cares, we're talking about tiny differences!" I agree! So, is the moral of this tale to avoid using the minimum and instead to use the median? Not at all. I'm agnostic on this issue for reasons I mentioned above, however when / if one uses the minimum one needs to be careful and be aware of the (hidden) assumptions! For example, as various aspects of the timing change one may observe a sudden change in the cdf, 'magic' in the cache could perhaps cause the median measure to change far more than the minimum measure - so even though median may have better nominal precision, the minimum may well have better nominal accuracy. If you look in the code you'll see evidence that I've mucked around with different quantiles - I have not closed my mind to using the minimum! > It just won't run faster than that no matter how often we run. Have you tried varying the number of iterations? You'll find that for some iteration lengths it may go significantly faster - doubtless this is a machine architecture issue, I don't know why - but it happens. > The idea is that using the minimum as the metric eliminates most of > the "noise" introduced into the data by stuff unrelated to what you > are trying to measure. Taking the median is just taking the median > of the noise. Not necessarily - as you can see from the above plots. Just because your estimator is repeatable for a given number of iterations doesn't mean it's 'ok' or eliminated the 'noise' - it's far more subtle than that. > In an ideal compute environment your runtimes would be so similar > from run to run you wouldn't bother to run more than once. If > there is noise in your data set your statistic should be designed > to eliminate the noise, not characterize it. If you want to try and eliminate it you need to characterise it. This is why relative measurements are so important if you desire precision - the noise is always there, no matter what you do, so having 'well behaved' noise (e.g. asymptotically normal) means you can infer things in a meaningful manner when you take differences. If you have no idea what the form of your noise is - or think you've eliminated it you can get repeatable results but they may be a subtle distortion of the truth. By taking the minimum you are not necessarily avoiding the 'noise', you may think you are - indeed it may look like you are, but you could have just done a good job of hiding it! > In my recent boostcon presentation my benchmarks were all for a > single trial of each algorithm for each benchmark because noise > would be eliminated during the curve fitting of the data for > analysis of how well the algorithms scaled (if not by the > conversion to logarithmic axis....) As an aside you may find least absolute deviation (LAD) regression more appropriate than ordinary least squares (I'm assuming you used OLS of course). For practical application what you've done is almost certainly fine, 'good enough for government work' as the saying goes - I'm not suggesting it isn't. After all, as I have already commented, when human experimenters are involved we can eyeball the results and see when things look fishy or don't behave as expected. Ultimately - the take away message is apparently axiomatic 'truths' are not always as true as one might think. A dark art indeed! ;-) -ed ------------------------------------------------ "No more boom and bust." -- Dr. J. G. Brown, 1997 _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost |
|
|
Re: boost] Boost.Plot? - Scalable Vector Graphics (was [expressive] performance tuningEdward Grace wrote:
> -------------- next part -------------- > A non-text attachment was scrubbed... > Name: empirical-times.png > Type: image/png > Size: 8765 bytes > Desc: not available > URL: <http://lists.boost.org/MailArchives/boost/attachments/20090731/05a18e45/attachment.png> > -------------- next part -------------- This is just another of my once-or-twice-per-year messages to point out that these links don't work... Phil. _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost |
|
|
Re: boost] Boost.Plot? - Scalable Vector Graphics (was [expressive] performance tuningOn 31 Jul 2009, at 20:01, Phil Endecott wrote: > Edward Grace wrote: > >> -------------- next part -------------- >> A non-text attachment was scrubbed... >> Name: empirical-times.png >> Type: image/png >> Size: 8765 bytes >> Desc: not available >> URL: <http://lists.boost.org/MailArchives/boost/attachments/ >> 20090731/05a18e45/attachment.png> >> -------------- next part -------------- > > This is just another of my once-or-twice-per-year messages to point > out > that these links don't work... I'm not quite sure I understand. Once I pared down the attachments my message should have had a couple of .png attachments. Did this get replaced by a non-functional URL? The graphs are quite important, without them nothing makes sense [*]. I don't get an echo of what I post so I can't tell. -ed [*] That doesn't imply that it will make sense with them of course! _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost |
|
|
Re: boost] Boost.Plot? - Scalable Vector Graphics (was [expressive] performance tuningOn Fri, Jul 31, 2009 at 2:37 PM, Edward Grace<ej.grace@...> wrote:
> > On 31 Jul 2009, at 20:01, Phil Endecott wrote: > >> Edward Grace wrote: >> >>> -------------- next part -------------- >>> A non-text attachment was scrubbed... >>> Name: empirical-times.png >>> Type: image/png >>> Size: 8765 bytes >>> Desc: not available >>> URL: >>> <http://lists.boost.org/MailArchives/boost/attachments/20090731/05a18e45/attachment.png> >>> -------------- next part -------------- >> >> This is just another of my once-or-twice-per-year messages to point out >> that these links don't work... > > I'm not quite sure I understand. Once I pared down the attachments my > message should have had a couple of .png attachments. Did this get replaced > by a non-functional URL? The graphs are quite important, without them > nothing makes sense [*]. > > I don't get an echo of what I post so I can't tell. You pictures came through fine, he is looking on the website archive. It has nothing to do with you, he is just complaining that the archives are broken since it does not include attachments, but us using email distribution got your images fine, so no worry. _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost |
|
|
Re: boost] Boost.Plot? - Scalable Vector Graphics (was [expressive] performance tuningOvermindDL1 wrote:
> On Fri, Jul 31, 2009 at 2:37 PM, Edward Grace<ej.grace@...> wrote: >> On 31 Jul 2009, at 20:01, Phil Endecott wrote: >>> Edward Grace wrote: >>> >>>> -------------- next part -------------- >>>> A non-text attachment was scrubbed... >>>> Name: empirical-times.png >>>> Type: image/png >>>> Size: 8765 bytes >>>> Desc: not available >>>> URL: >>>> <http://lists.boost.org/MailArchives/boost/attachments/20090731/05a18e45/attachment.png> >>>> -------------- next part -------------- >>> >>> This is just another of my once-or-twice-per-year messages to point out >>> that these links don't work... >> >> I'm not quite sure I understand. ?Once I pared down the attachments my >> message should have had a couple of .png attachments. ?Did this get replaced >> by a non-functional URL? ?The graphs are quite important, without them >> nothing makes sense [*]. >> >> I don't get an echo of what I post so I can't tell. > > You pictures came through fine, he is looking on the website archive. > It has nothing to do with you, he is just complaining that the > archives are broken since it does not include attachments, but us > using email distribution got your images fine, so no worry. No, I'm not looking at the website archive, I'm looking at the list digest email. Phil. _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost |
|
|
Re: boost] Boost.Plot? - Scalable Vector Graphics (was [expressive] performance tuning> -----Original Message-----
> From: boost-bounces@... [mailto:boost-bounces@...] On > Behalf Of Phil Endecott > Sent: 01 August 2009 16:34 > To: boost@... > Subject: Re: [boost] boost] Boost.Plot? - Scalable Vector Graphics (was [expressive] > performance tuning > > No, I'm not looking at the website archive, I'm looking at the list digest email. FWIW, the email and attachments are all fine for me - using Outlook 2007 and Firefox 3.5 and Adobe Reader. Paul PS But I note that my original post on SVG has been hijacked - not that it matters much. --- Paul A. Bristow Prizet Farmhouse Kendal, UK LA8 8AB +44 1539 561830, mobile +44 7714330204 pbristow@... _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost |
|
|
Re: [xpressive] Performance Tuning?On 28 Jul 2009, at 11:46, Edward Grace wrote: > > On 28 Jul 2009, at 18:17, Joel de Guzman wrote: > >> Edward Grace wrote: >> >>>> That is a *lot* more reasonable, although Spirit is still most >>>> definitely faster then the built-in functions. :) >>> >>> That's good though - one up for Boost! >> >> My latest benchmarks for integers and floating points reveal >> a 3x speed over atoi/strtol and friend C functions. You >> mentioned a need to parse small numbers very quickly? Spirit >> does. The tests I have take that into consideration too. If you >> guys want to take a peek, it's in the Boost trunk in libs/benchmarks. > > Sure. Can you give me an exact url? The last time I went on a code > hunt in SVN I found the wrong thing. > >> Some numbers: >> >> /////////////////////////////////////////////////////////////////////////// >> atoi_test: 0.9265067422 [s] {checksum: d5b76d60} >> strtol_test: 1.0766213977 [s] {checksum: d5b76d60} >> spirit_int_test: 0.3097019879 [s] {checksum: d5b76d60} >> >> /////////////////////////////////////////////////////////////////////////// >> atof_test: 7.3012049917 [s] {checksum: 3b7d82b0} >> strtod_test: 8.0042894122 [s] {checksum: 3b7d82b0} >> spirit_double_test: 2.6729373333 [s] {checksum: 3b7d82b0} >> >> This time, I am using the benchmarking harness by David Abrahams, >> Matthias Troyer, Michael Gauckler. > > This? > > http://tinyurl.com/kk858o > > There's some interesting trickery in there by the looks of things > for eliminating the optimiser nastiness - that's not something I've > thought about much I'll take a look. > > In the comments, > > 42 // operation to at least update the L1 cache. *** Note: This > 43 // concern is specific to the particular application at which > 44 // we're targeting the test. *** > > that seems quite important but a little opaque out of context. We did a loop over multiple accumulators in this test since this was the application scenario. We wanted to measure the abstraction penalty of using the Boost.Parameter library, but wanted it in a scenario where the functions called at least access the L1 cache so that we are not influenced by non-realistic too much simplified code that might be optimized too much. > One thing I take exception to is the (effective) use of the mean as > a measurement of central tendency - perhaps their trickery has > eliminated the heavy tail. I'll have to take a look and see how it > compares to my approach. Yes, we wanted to eliminate irrelevant heavy tails mainly by running the test multiple times. Why should the cost of e.g. swapping Word and Excel out of main memory to get space to run your test be measured as part of your test? This is a cost that *any* program might have to pay at times, no matter which algorithm you are comparing. I do not want such extreme events that are outside of my program's control mess up the comparisons. Another comment about heavy tails: if they are there then I want to analyze them and understand them. If they are absent then the mean is fine. We typically run the benchmark multiple times to eliminate heavy tails caused by swapping, etc.. > How do your relative timings compare if you repeat them while (say) > watching a DVD? [*] Again, if you carte about performance of codes while you wach a DVD then that is just the benchmark you should run. My codes typically run while I am not watching a DVD and hence I do not repeat them while watching a DVD. > [*] This may seem a perverse question. I'm interested in robust > performance measurement, in other words accurately working out which > function is fastest while the machine is under choppy loading -- not > a sanitised testing environment - so the fastest function can be > selected by the code itself. Again, my codes, and most performance-sensitive codes that I know of run in a rather "sanitized" environment without users watching DVDs or playing games on the machine while they run. If you want tests under choppy load then you have to provide a "sanitized" and reproducible choppy load environment. Matthias _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost |
| < Prev | 1 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9 - 10 | Next > |
| Free embeddable forum powered by Nabble | Forum Help |