[v3] parallel mode

I'm delighted to announce a parallel implementation of many C++ Standard library algorithms, which has been integrated into libstdc++ as an experimental mode that will henceforth be known as the libstdc++ parallel mode. These parallel algorithms use OpenMP as a thread layer, and should thus be portable to hardware/OS combinations that support libgomp.

This work was graciously donated to the FSF by Johannes Singler and Leonor Frias, with the support of the University of Karlsruhe. Assisting were Felix Putze, Marius Elvert, Felix Bondarenko, Robert Geisberger, Robin Dapp, and myself. In addition, Johannes has agreed to continue hacking on this functionality.

This is designed as another specialized mode for libstdc++, similar to the debug mode, only for code making use of parallel algorithms.

Usage is pretty simple. C++ code that uses the STL can be recompiled with -D_GLIBCXX_PARALLEL -fopenmp to get parallelized versions of the usual STL bits. Or, algorithms can be injected explicitly by qualifying them as parallel, like so: __gnu_parallel::transform.

More details are here:

http://people.redhat.com/bkoz/parallel_mode/parallel_mode.html

This has also been added to the libstdc++ documentation.

There are some areas that require more thought and tuning, in particular the efficiency of parallel execution, which depends on the subtle interplay between threading overhead, input size, behavior of user-defined parts, hardware limitations (memory bandwidth), and so on. Much tuning needs to be done. Documentation needs to be improved, etc. This effort and research will be on-going: suggestions and help are welcome.

The current testing status is excellent, passing both the usual checks (make check) and a new make rule (make check-parallel) that runs the libstdc++ conformance tests with -D_GLIBCXX_PARALLEL -fopenmp. For the usual checks (make check), results are identical, and for parallel mode there are no unexpected fails. Performance testing is on-going, but all the usual performance tests build and run.

Because of these testing results, the recent go status from the FSF on assignments, and the stage 3 deadline, I'd like to put this in now. Certainly, having a central SVN repository will help development.

Jakub has reviewed recent versions of this code (and provided some feedback on testing approaches for earlier versions), and has some suggestions about OpenMP usage. We certainly intend to address these concerns, but would prefer to do this after the code is checked in. Ulrich has also reviewed earlier versions of this code: I believe all his usage suggestions have already been incorporated.

tested x86/linux
tested x86_64/linux

-benjamin

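To illustrate the usage described in the announcement, here is a minimal sketch of ordinary STL code that should need no source changes to pick up the parallel mode. The helper names (square, squares) are hypothetical and not part of libstdc++. The build line in the comment assumes a GCC with libgomp available. The explicit form mentioned above would instead call __gnu_parallel::transform, which as far as I know is declared in <parallel/algorithm>.

```cpp
// Sequential build:            g++ example.cc
// Parallel-mode build (assumed GCC with libgomp):
//   g++ -D_GLIBCXX_PARALLEL -fopenmp example.cc
// The source is identical in both cases.
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical user-defined functor: squares its argument.
struct square
{
  int operator()(int x) const { return x * x; }
};

// Hypothetical helper: returns a vector holding the square of each input.
std::vector<int> squares(const std::vector<int>& in)
{
  std::vector<int> out(in.size());
  // With -D_GLIBCXX_PARALLEL this call may be dispatched to the
  // parallel implementation; without it, the usual sequential one runs.
  std::transform(in.begin(), in.end(), out.begin(), square());
  assert(out.size() == in.size());
  return out;
}
```

Whether the parallel version actually pays off for a given call depends on the factors listed in the announcement (input size, threading overhead, memory bandwidth), so small inputs like those used for quick tests are unlikely to show a speedup.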
Re: [v3] parallel mode

This is now in. The final patch is as attached.

Differences from what was previously posted:

1) --disable-gomp now works
2) __gnu_sequential doxygen markup fix
3) added check-performance-parallel
4) moved mismatch into parallel/algobase.h, from parallel/algo.h.

There is one regression in parallel mode, in 25_algo/lexicographical_compare. Everything else is fine.

tested x86/linux
tested x86_64/linux
tested parallel x86/linux
tested parallel x86_64/linux
tested performance x86/linux
tested performance-parallel x86/linux
tested x86/darwin

I'm still sorting out the darwin parallel testing.

best,
benjamin

Re: [v3] parallel mode

Quick question:

-pedantic (well, -pedantic-errors actually) gives:

include/c++/4.3.0/parallel/partial_sum.h:107: error: ISO C++ forbids variable length array ‘borders’

the offending line being:

    difference_type borders[num_threads + 2];

Is this necessary? What side effects would a vector cause? If, as a hack, I changed it to vector<difference_type> borders(num_threads+2) and used &(*borders.begin()) or something similar wherever the array is expected, would it cause big issues?

Cheers,
Chris

Re: [v3] parallel mode

> Quick question:
>
> -pedantic (well, -pedantic-errors actually) gives:
> include/c++/4.3.0/parallel/partial_sum.h:107: error: ISO C++ forbids
> variable length array ‘borders’
>
> the offending line being:
> difference_type borders[num_threads + 2];

Yeah. I saw a VLA warning with pedantic, and marked it with XXX VLA. Fixing this would be great!

> Is this necessary? What side effects would a vector cause? If, as a
> hack, I changed it to vector<difference_type> borders(num_threads+2)
> and used &(*borders.begin()) or something wherever the array is
> expected would it cause big issues?

I think it would probably work. However, we're trying to keep intra-dependencies minimal, so maybe using vector is not such a great idea. Hmmm.

The other approach would be to use dynamic memory allocation.

-benjamin

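The two alternatives to the variable length array discussed above can be sketched as follows. This is an illustration only, not the actual code in parallel/partial_sum.h: fill_borders, borders_with_vector, and last_border_with_new are hypothetical names, difference_type is assumed to be std::ptrdiff_t, and the even split computed here merely stands in for whatever the real code stores in borders.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::ptrdiff_t difference_type;  // assumption for this sketch

// Hypothetical stand-in for code that expects a raw array: writes
// `count` evenly spaced split points of the range [0, n] into borders.
void fill_borders(difference_type* borders, std::size_t count,
                  difference_type n)
{
  assert(count >= 2);
  for (std::size_t i = 0; i < count; ++i)
    borders[i] = n * static_cast<difference_type>(i)
                   / static_cast<difference_type>(count - 1);
}

// Option 1: std::vector instead of the VLA. Vector storage is
// contiguous, so &borders[0] can be passed where an array is expected.
std::vector<difference_type> borders_with_vector(int num_threads,
                                                 difference_type n)
{
  std::vector<difference_type> borders(num_threads + 2);
  fill_borders(&borders[0], borders.size(), n);
  return borders;
}

// Option 2: plain dynamic allocation, avoiding the <vector> dependency.
// Returns the last border only, so the buffer can be freed here.
difference_type last_border_with_new(int num_threads, difference_type n)
{
  const std::size_t count = static_cast<std::size_t>(num_threads) + 2;
  difference_type* borders = new difference_type[count];
  fill_borders(borders, count, n);
  const difference_type last = borders[count - 1];
  delete[] borders;  // must be released on every exit path
  return last;
}
```

The vector version releases its memory automatically, including when an exception is thrown, while the new[]/delete[] version keeps header dependencies minimal at the cost of manual cleanup.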
Re: [v3] parallel mode

On Mon, 10 Sep 2007, Benjamin Kosnik wrote:
> I'm delighted to announce a parallel implementation of many C++
> Standard library algorithms, which has been integrated into libstdc++ as
> an experimental mode that will henceforth be known as the libstdc++
> parallel mode. These parallel algorithms use OpenMP as a thread layer,
> and should thus be portable to hardware/os combinations that support
> libgomp.

Cool stuff! And definitely something for the News section on our main page and gcc-4.3/changes.html on the same. ;-)

Would you mind giving both a try?

Gerald