Faster bilinear scaling

Hi,

This branch:

    http://cgit.freedesktop.org/~sandmann/pixman/log/?h=bilinear

contains a fast path for fetching of bilinearly filtered, scaled
images. It is basically Andre's work, described here:

    http://lists.cairographics.org/archives/cairo/2008-December/016170.html

What I did was

- Update scaling-test to also test bilinear scaling

- Remove bilinear_interpolation_left/right() functions in
  favor of just calling bilinear_interpolation().

- Fix coding style.

The performance improvement for the swfdec-youtube benchmark on a
3.8GHz P4 is around 17%:

Before:

[ # ]  backend            test   min(s) median(s) stddev. count
[  0]    image  swfdec-youtube    8.375     8.431   0.44%   6/6

After:

[ # ]  backend            test   min(s) median(s) stddev. count
[  0]    image  swfdec-youtube    6.942     7.019   0.61%   6/6

Much of the profile of this benchmark is in radial gradients, so other
users of bilinear scaling may see more improvement.

Also, if anyone is interested in adding support for SIMD acceleration
of fetchers, the bilinear_interpolation() function is begging to
be written with SSE2 or NEON.

Comments welcome.

Thanks,
Soren
_______________________________________________
cairo mailing list
cairo@...
http://lists.cairographics.org/mailman/listinfo/cairo
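For readers who have not looked at the branch, the kind of per-pixel work that a bilinear_interpolation() function does can be sketched in scalar C as below. This is only an illustration, not the code from the branch: the name bilinear_sketch, the packed 8-bit-per-channel pixel layout, and the 8-bit weights distx/disty are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch (not pixman's actual code): blend four packed
 * 8-bit-per-channel pixels. distx and disty are assumed to be 8-bit
 * fractional positions in [0, 255] between the left/right and
 * top/bottom neighbours.
 */
static uint32_t
bilinear_sketch (uint32_t tl, uint32_t tr, uint32_t bl, uint32_t br,
                 uint32_t distx, uint32_t disty)
{
    assert (distx < 256 && disty < 256);

    /* The four weights always sum to 256 * 256 = 65536. */
    uint32_t wtl = (256 - distx) * (256 - disty);
    uint32_t wtr = distx * (256 - disty);
    uint32_t wbl = (256 - distx) * disty;
    uint32_t wbr = distx * disty;
    uint32_t result = 0;

    /* Process the four 8-bit channels one at a time. */
    for (int shift = 0; shift < 32; shift += 8)
    {
        uint32_t c = ((tl >> shift) & 0xff) * wtl
                   + ((tr >> shift) & 0xff) * wtr
                   + ((bl >> shift) & 0xff) * wbl
                   + ((br >> shift) & 0xff) * wbr;

        /* Divide by 65536 (truncating) and put the channel back. */
        result |= (c >> 16) << shift;
    }

    return result;
}
```

With distx = disty = 0 the result is simply the top-left pixel. The four multiply-accumulates per channel are independent of each other, which is why a function of this shape is a natural candidate for SSE2 or NEON.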
Re: Faster bilinear scaling

On Tuesday 06 October 2009, Soeren Sandmann wrote:
> Hi,
>
> This branch:
>
>     http://cgit.freedesktop.org/~sandmann/pixman/log/?h=bilinear
>
> contains a fast path for fetching of bilinearly filtered, scaled
> images. It is basically Andre's work, described here:
>
>     http://lists.cairographics.org/archives/cairo/2008-December/016170.html
>
> What I did was
>
> - Update scaling-test to also test bilinear scaling
>
> - Remove bilinear_interpolation_left/right() functions in
>   favor of just calling bilinear_interpolation().
>
> - Fix coding style.

Nice, any improvements in this area are very much welcome.

> The performance improvement for the swfdec-youtube benchmark on a
> 3.8GHz P4 is around 17%:
>
> Before:
>
> [ # ]  backend            test   min(s) median(s) stddev. count
> [  0]    image  swfdec-youtube    8.375     8.431   0.44%   6/6
>
> After:
>
> [ # ]  backend            test   min(s) median(s) stddev. count
> [  0]    image  swfdec-youtube    6.942     7.019   0.61%   6/6
>
> Much of the profile of this benchmark is in radial gradients, so other
> users of bilinear scaling may see more improvement.

More specialized benchmarks would be nice to see too. For example,
benchmark scaling a 99x99 image to 101x101 and compare it to a simple
copy of a 100x100 image. That would give an estimate of how much this
operation is limited by memory throughput and how much it could
potentially be improved.

> Also, if anyone is interested in adding support for SIMD acceleration
> of fetchers, the bilinear_interpolation() function is begging to
> be written with SSE2 or NEON.

This can be tried indeed. An alternative option for the bilinear filter
is to have two temporary fetch buffers: don't do any kind of
interpolation in the fetcher, but just put pairs of pixels into these
buffers, then do the interpolation in bulk. The full width of SIMD
registers may be utilized better in this case. Interpolation can also be
combined with some compositing operation; OVER is the primary candidate.
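One possible reading of the two-buffer idea is sketched below in scalar C: a gather pass that only copies pixel pairs (one buffer for the upper source row, one for the lower), followed by a separate bulk pass that does all the arithmetic. Everything here is an assumption made for illustration: the names pixel_pair_t, fetch_pairs, blend4 and interpolate_bulk, the 16.16 fixed-point coordinates, and the clamping at the right edge. None of it is taken from pixman.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: the two horizontally adjacent source pixels
 * needed for one destination pixel. */
typedef struct
{
    uint32_t left;
    uint32_t right;
} pixel_pair_t;

/* Pass 1: gather only, no arithmetic on pixel values.
 * x and step are assumed to be 16.16 fixed point. */
static void
fetch_pairs (const uint32_t *row, int width, uint32_t x, uint32_t step,
             pixel_pair_t *pairs, uint8_t *distx, int n)
{
    assert (width > 0 && n >= 0);

    for (int i = 0; i < n; i++, x += step)
    {
        int x0 = (int) (x >> 16);
        int x1 = x0 + 1;

        /* Clamp to the last pixel of the row (an assumed edge policy). */
        if (x0 > width - 1) x0 = width - 1;
        if (x1 > width - 1) x1 = width - 1;

        pairs[i].left = row[x0];
        pairs[i].right = row[x1];
        distx[i] = (uint8_t) ((x >> 8) & 0xff);   /* top 8 fraction bits */
    }
}

/* Blend four pixels channel by channel; the weights sum to 65536. */
static uint32_t
blend4 (uint32_t tl, uint32_t tr, uint32_t bl, uint32_t br,
        uint32_t distx, uint32_t disty)
{
    uint32_t wtl = (256 - distx) * (256 - disty);
    uint32_t wtr = distx * (256 - disty);
    uint32_t wbl = (256 - distx) * disty;
    uint32_t wbr = distx * disty;
    uint32_t result = 0;

    for (int shift = 0; shift < 32; shift += 8)
    {
        uint32_t c = ((tl >> shift) & 0xff) * wtl
                   + ((tr >> shift) & 0xff) * wtr
                   + ((bl >> shift) & 0xff) * wbl
                   + ((br >> shift) & 0xff) * wbr;
        result |= (c >> 16) << shift;
    }
    return result;
}

/* Pass 2: interpolate the whole scanline in one tight loop. This is
 * the loop a SIMD version would replace. */
static void
interpolate_bulk (const pixel_pair_t *top, const pixel_pair_t *bottom,
                  const uint8_t *distx, uint32_t disty,
                  uint32_t *dst, int n)
{
    assert (disty < 256);

    for (int i = 0; i < n; i++)
        dst[i] = blend4 (top[i].left, top[i].right,
                         bottom[i].left, bottom[i].right,
                         distx[i], disty);
}
```

A caller would run fetch_pairs once for the upper source row and once for the lower one, then call interpolate_bulk on the two buffers. Since the second pass reads only contiguous memory, a compositing step such as OVER could in principle be folded into the same loop.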

Another variation of this is to do the horizontal interpolation first
and put the partly processed data into two temporary buffers. A possible
advantage of this approach is that horizontally interpolated data can
often be reused multiple times, especially when upscaling.

There are many things to try. It may also turn out that the optimal
implementation differs between platforms. But as long as the code is
well covered by regression tests, having more than one implementation
should not be a problem.

--
Best regards,
Siarhei Siamashka
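The horizontal-first variation could look roughly like the scalar sketch below. It keeps two buffers of horizontally interpolated source rows and recomputes one only when the vertical position moves on to a source row that is not already cached. The function names, the 16.16 fixed-point steps, the edge clamping, and the choice of buffer by row parity are all assumptions for the example. It also rounds the intermediate data to 8 bits per channel for brevity, whereas a real implementation would probably keep more precision in the "partly processed" buffers.

```c
#include <assert.h>
#include <stdint.h>

/* Blend two packed pixels channel by channel; w is in [0, 255]. */
static uint32_t
lerp_pixel (uint32_t a, uint32_t b, uint32_t w)
{
    uint32_t result = 0;

    for (int shift = 0; shift < 32; shift += 8)
    {
        uint32_t ca = (a >> shift) & 0xff;
        uint32_t cb = (b >> shift) & 0xff;
        result |= ((ca * (256 - w) + cb * w) >> 8) << shift;
    }
    return result;
}

/* Horizontal pass for one source row; step is 16.16 fixed point. */
static void
scale_row_horizontal (const uint32_t *src, int sw,
                      uint32_t *out, int dw, uint32_t step)
{
    uint32_t x = 0;

    for (int i = 0; i < dw; i++, x += step)
    {
        int x0 = (int) (x >> 16);
        int x1 = x0 + 1;
        if (x0 > sw - 1) x0 = sw - 1;
        if (x1 > sw - 1) x1 = sw - 1;
        out[i] = lerp_pixel (src[x0], src[x1], (x >> 8) & 0xff);
    }
}

/* Scale a whole image; buf0 and buf1 each hold dw pixels.
 * Returns how many horizontal passes were actually performed. */
static int
scale_image (const uint32_t *src, int sw, int sh,
             uint32_t *dst, int dw, int dh,
             uint32_t xstep, uint32_t ystep,
             uint32_t *buf0, uint32_t *buf1)
{
    uint32_t *rows[2] = { buf0, buf1 };
    int cached[2] = { -1, -1 };   /* source row held by each buffer */
    int passes = 0;
    uint32_t y = 0;

    assert (sw > 0 && sh > 0);

    for (int j = 0; j < dh; j++, y += ystep)
    {
        int need[2];
        need[0] = (int) (y >> 16);
        need[1] = need[0] + 1;

        for (int k = 0; k < 2; k++)
        {
            if (need[k] > sh - 1) need[k] = sh - 1;

            /* Adjacent rows differ in parity, so they never share a slot. */
            int slot = need[k] & 1;
            if (cached[slot] != need[k])
            {
                scale_row_horizontal (src + need[k] * sw, sw,
                                      rows[slot], dw, xstep);
                cached[slot] = need[k];
                passes++;
            }
        }

        /* Vertical pass: blend the two cached rows. */
        for (int i = 0; i < dw; i++)
            dst[j * dw + i] = lerp_pixel (rows[need[0] & 1][i],
                                          rows[need[1] & 1][i],
                                          (y >> 8) & 0xff);
    }
    return passes;
}
```

When upscaling, ystep is below 0x10000, so consecutive destination rows usually need the same pair of source rows and the horizontal pass is skipped. The return value makes that reuse visible: it counts horizontal passes, which without caching would be two per destination row.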
Re: Faster bilinear scaling

Siarhei Siamashka <siarhei.siamashka@...> writes:

> More specialized benchmarks would be nice to see too. For example benchmark
> scaling 99x99 to 101x101 and compare it to a simple copy of 100x100 image.
> That would give an estimate about how much this operation is memory throughput
> limited and how much it can be potentially improved.

Indeed, bilinear scaling may be a case where we would not actually be
memory bound, even with an SSE or NEON implementation.

> > Also, if anyone is interested in adding support for SIMD acceleration
> > of fetchers, the bilinear_interpolation() function is begging to
> > be written with SSE2 or NEON.
>
> This can be tried indeed.

There are a number of cases where support for implementation-defined
fetchers would be useful, including gradients and the two SSE2
fetchers in bugzilla that Steve Snyder wrote.

> But as long as the code is well covered by regression tests, having
> more than one implementation should not be a problem.

Well, sometimes regression tests cause people to get sloppy and assume
that their code works just because no tests failed. Generally, I think
careful code review is desirable along with regression tests.

I went ahead and merged the bilinear optimizations.

Soren