Bug#554689: /usr/bin/dpkg-source: is slow

Package: dpkg-dev
Version: 1.15.4.1
Severity: normal
File: /usr/bin/dpkg-source
Tags: patch

As seen in (but unrelated to) bug #554612, running dpkg-source -b is slow on big trees, where the main contender is Dpkg::Source::Patch::add_diff_file.

%Time     Sec.   #calls  sec/call  F  name
32.52  69.8110    44702  0.001562     Dpkg::Source::Patch::add_diff_file
29.95  64.2837    44829  0.001434     Dpkg::IPC::fork_and_exec
10.22  21.9370    44829  0.000489     Dpkg::IPC::wait_child
 7.14  15.3309    97591  0.000157     File::Spec::Unix::abs2rel
 4.25   9.1206   585681  0.000016     File::Spec::Unix::canonpath

Here, the main problem is obviously forking diff -u 44829 times, while the vast majority of files in the orig tarball haven't been touched (which is mostly true for all packages).

The attached patch (which may have style and correctness issues) implements a very simple check in perl (so, without a fork) to see if files differ before running diff. The result is stunning:

From 3 minutes and 30 seconds on iceape in format 3.0 (quilt), dpkg-source -b goes down to 35 seconds (where 15 are spent bunzipping).

This is where the time is spent, now:

%Time     Sec.   #calls  sec/call  F  name
24.41  14.1649      128  0.110663     Dpkg::IPC::wait_child
19.46  11.2948    97594  0.000116     File::Spec::Unix::abs2rel
13.77   7.9901    44703  0.000179     Dpkg::Source::Patch::add_diff_file
 9.37   5.4382   585699  0.000009     File::Spec::Unix::canonpath

It looks much better.

My gut feeling is that it should improve run time on most if not all packages (and, more importantly, will help maintainers of big packages).

I'm pretty sure reading by blocks of a multiple of 4k instead of reading line by line could be faster, too.
-- System Information:
Debian Release: squeeze/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.31-1-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages dpkg-dev depends on:
ii  binutils          2.20-2    The GNU assembler, linker and bina
ii  bzip2             1.0.5-3   high-quality block-sorting file co
ii  dpkg              1.15.4.1  Debian package management system
ii  libtimedate-perl  1.1900-1  Time and date functions for Perl
ii  lzma              4.43-14   Compression method of 7z format in
ii  make              3.81-7    An utility for Directing compilati
ii  patch             2.5.9-5   Apply a diff file to an original
ii  perl [perl5]      5.10.1-6  Larry Wall's Practical Extraction
ii  perl-modules      5.10.1-6  Core Perl modules

Versions of packages dpkg-dev recommends:
ii  build-essential   11.4            Informational list of build-essent
ii  fakeroot          1.14.3          Gives a fake root environment
ii  gcc [c-compiler]  4:4.3.3-9+nmu1  The GNU C compiler
ii  gcc-4.1 [c-compiler]  4.1.2-27    The GNU C compiler
ii  gcc-4.2 [c-compiler]  4.2.4-6     The GNU C compiler
ii  gcc-4.3 [c-compiler]  4.3.4-6     The GNU C compiler
ii  gcc-4.4 [c-compiler]  4.4.2-2     The GNU C compiler
ii  gnupg             1.4.10-2        GNU privacy guard - a free PGP rep
ii  gpgv              1.4.10-2        GNU privacy guard - signature veri

Versions of packages dpkg-dev suggests:
ii  debian-keyring [debian-mainta  2009.08.27  GnuPG (and obsolete PGP) keys of D

-- no debconf information

-- debsums errors found:
debsums: changed file /usr/share/perl5/Dpkg/Source/Patch.pm (from dpkg-dev package)
debsums: changed file /usr/share/perl5/Dpkg/Source/Package/V2.pm (from dpkg-dev package)

--- /usr/share/perl5/Dpkg/Source/Patch.pm
+++ /usr/share/perl5/Dpkg/Source/Patch.pm
@@ -58,6 +58,19 @@
 sub add_diff_file {
     my ($self, $old, $new, %opts) = @_;
+    open(OLD, "<", $old);
+    open(NEW, "<", $new);
+    my $match = 1;
+    while (<OLD>) {
+        if ($_ ne <NEW>) {
+            $match = 0;
+            last;
+        }
+    }
+    close OLD;
+    close NEW;
+    return 1 if ($match);
+
     $opts{"include_timestamp"} = 0 unless exists $opts{"include_timestamp"};
     my $handle_binary = $opts{"handle_binary_func"} || sub {
         my ($self, $old, $new) = @_;
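
The closing remark about reading in blocks of a multiple of 4k can be sketched roughly as follows. This is only an illustration, not code from dpkg: the helper name files_identical and its default block size are assumptions made for the example. It also covers one case the line-by-line loop in the patch above misses: when the old file is a strict prefix of the new one, that loop runs out of lines in OLD without ever seeing a difference and reports a match.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of dpkg): returns 1 if both files have
# identical contents, 0 if they differ, and undef if a file cannot be
# opened or read.
sub files_identical {
    my ($old, $new, $blocksize) = @_;
    $blocksize ||= 4096;    # assumed default; any multiple of 4k would do

    open(my $old_fh, '<:raw', $old) or return;
    open(my $new_fh, '<:raw', $new) or return;

    # Files of different sizes cannot be identical, so nothing is read.
    return 0 if (stat $old_fh)[7] != (stat $new_fh)[7];

    while (1) {
        my $old_len = read($old_fh, my $old_buf, $blocksize);
        my $new_len = read($new_fh, my $new_buf, $blocksize);

        # read() returns undef on error.
        return unless defined $old_len && defined $new_len;

        return 0 if $old_len != $new_len || $old_buf ne $new_buf;
        return 1 if $old_len == 0;    # both files hit EOF together
    }
}
```

A caller would skip the diff only on a true result, for example `return 1 if files_identical($old, $new);`, so that an unreadable file still falls through to the normal diff path and its error handling.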
Bug#554689: /usr/bin/dpkg-source: is slow

On Fri, 06 Nov 2009, Mike Hommey wrote:
> The attached patch (which may have style and correctness issues) implements
> a very simple check in perl (so, without a fork) to see if files differ
> before running diff. The result is stunning:

Thanks for the good idea. The change I made is the following:

--- a/scripts/Dpkg/Source/Patch.pm
+++ b/scripts/Dpkg/Source/Patch.pm
@@ -32,6 +32,7 @@ use File::Find;
 use File::Basename;
 use File::Spec;
 use File::Path;
+use File::Compare;
 use Fcntl ':mode';
 #XXX: Needed for sub-second timestamps, require recent perl
 #use Time::HiRes qw(stat);
@@ -63,6 +64,8 @@ sub add_diff_file {
         my ($self, $old, $new) = @_;
         $self->_fail_with_msg($new, _g("binary file contents changed"));
     };
+    # Optimization to avoid forking diff if unnecessary
+    return 1 if compare($old, $new, 4096) == 0;
     # Default diff options
     my @options;
     if ($opts{"options"}) {

Building iceape, before:
real 1m11.078s
user 0m22.501s
sys  0m36.946s

After:
real 0m14.949s
user 0m9.377s
sys  0m2.948s

Cheers,
-- 
Raphaël Hertzog

-- 
To UNSUBSCRIBE, email to debian-dpkg-bugs-REQUEST@...
with a subject of "unsubscribe". Trouble? Contact listmaster@...
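
For readers unfamiliar with File::Compare, the short script below shows the three values compare() can return, which is why the change above tests for == 0 rather than for truth. It is a standalone sketch, not dpkg code: the write_temp helper and the temporary files are made up for the illustration.

```perl
use strict;
use warnings;
use File::Compare;
use File::Temp qw(tempfile);

# Hypothetical helper for this example: write a string to a temporary
# file (removed at exit) and return its path.
sub write_temp {
    my ($content) = @_;
    my ($fh, $name) = tempfile(UNLINK => 1);
    print {$fh} $content;
    close $fh or die "cannot close $name: $!";
    return $name;
}

my $orig = write_temp("unchanged\n");
my $same = write_temp("unchanged\n");
my $edit = write_temp("modified\n");

# The third argument is the read buffer size, 4096 as in the patch.
print compare($orig, $same, 4096), "\n";             # 0: contents are equal
print compare($orig, $edit, 4096), "\n";             # 1: contents differ
print compare($orig, "$orig.missing", 4096), "\n";   # -1: a file could not be read
```

Because only a result of 0 takes the early return, both a real difference (1) and an error (-1) still reach the existing diff invocation.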
Bug#554689: /usr/bin/dpkg-source: is slow

On Fri, Nov 06, 2009 at 08:45:57PM +0100, Raphael Hertzog wrote:
> Building iceape, before:
> real 1m11.078s
> user 0m22.501s
> sys  0m36.946s
>
> After:
> real 0m14.949s
> user 0m9.377s
> sys  0m2.948s

We can go even further, with another optimization.

As appeared in the profiles, abs2rel and canonpath take a lot of time. It looks like abs2rel is bloated for what it is used for.

Replacing:
    my $fn = File::Spec->abs2rel($_, $new);
with
    my $fn = length($_) > length($new) + 1 ? substr($_, length($new) + 1) : '.';
and
    my $fn = File::Spec->abs2rel($_, $old);
with
    my $fn = length($_) > length($old) + 1 ? substr($_, length($old) + 1) : '.';
in add_diff_directory, makes another 25% improvement.

I'm now down to 25s on iceape-2.0-1 in 3.0 (quilt) format with a bzip2 tarball, which is much closer to bunzip2 (15s) + diff (2s), and down to 7s on the iceape-1.1.17-2 source (where it took 11s before, compared to your 15s).

The canonpath calls disappear from the profile once the abs2rel calls are modified as stated above.

Cheers,

Mike
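
The substr replacement above relies on every path starting with the base directory followed by a single slash. That presumably holds because add_diff_directory walks the tree starting from that directory, though the thread does not say so explicitly. The sketch below wraps the expression in a hypothetical strip_prefix function (the name and the sample paths are invented for the example) and prints its result next to abs2rel for comparison.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical wrapper around the expression proposed above. It assumes
# $path is either $base itself or $base followed by '/' and more path.
sub strip_prefix {
    my ($path, $base) = @_;
    return length($path) > length($base) + 1
        ? substr($path, length($base) + 1)
        : '.';
}

my $base = '/tmp/pkg-1.0';    # made-up directory, never touched on disk
for my $path ($base, "$base/debian/rules", "$base/src/main.c") {
    printf "%-28s %-14s %s\n",
        $path,
        strip_prefix($path, $base),
        File::Spec->abs2rel($path, $base);
}
```

For these inputs both columns agree. The shortcut does no normalization, however: a base directory written with a trailing slash, or a path containing doubled slashes, would make it cut at the wrong offset, whereas abs2rel canonicalizes both arguments first.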
Bug#554689: /usr/bin/dpkg-source: is slow

On Sat, 07 Nov 2009, Mike Hommey wrote:
> We can go even further, with another optimization.
>
> As appeared in the profiles, abs2rel and canonpath take a lot of time.
> It looks like abs2rel is bloated for what it is used for.

I applied the patch as well. Thanks!

Cheers,
-- 
Raphaël Hertzog
Bug#554689: marked as done (/usr/bin/dpkg-source: is slow)

Your message dated Tue, 17 Nov 2009 12:47:51 +0000
with message-id <E1NANTL-0005Ne-9Z@...>
and subject line Bug#554689: fixed in dpkg 1.15.5
has caused the Debian Bug report #554689,
regarding /usr/bin/dpkg-source: is slow
to be marked as done.

This means that you claim that the problem has been dealt with. If this is not the case it is now your responsibility to reopen the Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this message is talking about, this may indicate a serious mail system misconfiguration somewhere. Please contact owner@... immediately.)

-- 
554689: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=554689
Debian Bug Tracking System
Contact owner@... with problems

Source: dpkg
Source-Version: 1.15.5

We believe that the bug you reported is fixed in the latest version of dpkg, which is due to be installed in the Debian FTP archive:

dpkg-dev_1.15.5_all.deb
  to main/d/dpkg/dpkg-dev_1.15.5_all.deb
dpkg_1.15.5.dsc
  to main/d/dpkg/dpkg_1.15.5.dsc
dpkg_1.15.5.tar.bz2
  to main/d/dpkg/dpkg_1.15.5.tar.bz2
dpkg_1.15.5_amd64.deb
  to main/d/dpkg/dpkg_1.15.5_amd64.deb
dselect_1.15.5_amd64.deb
  to main/d/dpkg/dselect_1.15.5_amd64.deb

A summary of the changes between this version and the previous one is attached.

Thank you for reporting the bug, which will now be closed. If you have further comments please address them to 554689@..., and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Guillem Jover <guillem@...> (supplier of updated dpkg package)

(This message was generated automatically at their request; if you believe that there is a problem with it please contact the archive administrators by mailing ftpmaster@...)
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Format: 1.8
Date: Tue, 17 Nov 2009 10:17:57 +0100
Source: dpkg
Binary: dpkg dpkg-dev dselect
Architecture: source amd64 all
Version: 1.15.5
Distribution: unstable
Urgency: low
Maintainer: Dpkg Developers <debian-dpkg@...>
Changed-By: Guillem Jover <guillem@...>
Description:
 dpkg       - Debian package management system
 dpkg-dev   - Debian package development tools
 dselect    - Debian package management front-end
Closes: 131633 402527 421367 453005 454628 482166 494136 516631 530070 536066 537338 543581 545274 545446 548541 548615 551829 553328 553580 554612 554689 555806
Changes:
 dpkg (1.15.5) unstable; urgency=low
 .
 [ Guillem Jover ]
 * Remove obsolete conffiles on purge. Closes: #421367, #453005, #454628
 * Update list of binaries dpkg checks on the PATH.
   - Remove install-info, now a wrapper that will disappear soonish.
   - Add programs used by dpkg itself: sh, rm, find, tar and dpkg-deb.
 * Check and warn on duplicate conffiles in dpkg-deb. Closes: #131633
 * Make the upstream build system silent by default with automake 1.11 or
   newer, and always verbose when building the Debian packages.
 * Fix small leak when parsing ‘--ignore-depends’ option values.
 * Define compatibility WCOREDUMP only if the system does not have it.
 * When start-stop-daemon fails to set the io scheduling warn instead of
   finishing fatally. Closes: #553580
 * Update md5 file paths in debian/copyright.
   Thanks to Jonathan Nieder <jrnieder@...>.
 * On ‘dpkg-trigger --help’ print the default admindir instead of the one
   passed on the command line.
 * Abort on configure if the required C99 extensions are not supported.
 * Add C coding style document.
 * Make dpkg as strict as dpkg-statoverride on input when validating the
   parsed data from the statdb.
 * Rewrite dpkg-statoverride in C.
 * Use C99 snprintf function family semantics to avoid having to call them
   in a loop to grow the varbuf buffer. This should reduce memory usage and
   be slightly faster on varbufprintf calls.
 * Use the size from stat to allocate the buffers for readlink, instead of
   indefinitely calling readlink and growing the buffer. This should reduce
   memory usage when handling lots of symlinks, and be slightly faster.
 * Rework varbuf api to avoid increasing buffers indefinitely when adding
   content to them, regardless of space being already available.
 * Fix build macros to allow start-stop-daemon to use TIOCNOTTY.
 * Generate the autoconf version from git to make it easier to see when a
   snapshot version is being used.
 * Add infrastructure for doxygen, for now not installed anywhere.
 * Allow overriding the pkg-config path to ease cross-compilation.
   Suggested by Tollef Fog Heen <tfheen@...>.
 * Fix spelling errors in the Catalan translation. Closes: #553328
   Thanks to Robert Millan.
 * Update the FSF postal address in the source code license headers by
   replacing it with a URL to the gnu.org page.
 * Fix a file descriptor leak in dpkg-deb.
   Reported by Raphael Geissert <atomo64@...>.
 * Fix resource leaks on error conditions in compat scandir.
 * Add a new status-fd action when disappearing a package. Closes: #537338
 .
 [ Raphaël Hertzog ]
 * Add versioned dependency on base-files (>= 5.0.0) to dpkg-dev to ensure
   that /etc/dpkg/origins/default exists. Closes: #545274
 * Update Standards-Version to 3.8.3 (no changes needed).
 * Major changes to the perl API:
   - Dpkg::Control is now Dpkg::Control::Info
   - Dpkg::Cdata is gone and is replaced by a new Dpkg::Control
   - Dpkg::Control::Fields contains authoritative information about fields
     allowed in various types of control information (and can be
     customized by each vendor). It also integrates information that was
     previously available through Dpkg::Deps.
   - Dpkg::Changelog has been split in multiple modules and largely
     modified to offer an interface that is now more in line with the
     other modules.
 * All dpkg-* perl programs that work with control information have been
   updated to use the new Dpkg::Control interface. In this process,
   dpkg-scanpackages has been fixed to not skip non-standard fields.
   Closes: #494136
 * Create Launchpad-Bugs-Fixed directly in the changelog parsing code
   thanks to a new vendor hook post-process-changelog-entry.
   Closes: #536066
 * Integrate dpkg-ftp into dselect. Add the required Replaces and
   Conflicts.
 * dpkg-scanpackages/dpkg-scansources now supports compressed override
   files.
 * dpkg-scanpackages now supports a new --medium option as needed to
   generate Packages.cd file for consumption by the multicd dselect access
   method. Closes: #402527
 * Integrate dpkg-multicd into dselect. Add the required Replaces and
   Conflicts. The dpkg-scanpackages fork is dropped. Closes: #516631
 * Fix bashisms in dselect multicd access method. Closes: #530070
 * Add support of "xz" compression method for source packages. Add
   dependency dpkg-dev → xz-utils to ensure xz and unxz are available.
 * Fix dpkg-source --include-binaries to correctly compute the path name of
   the discovered binary files. Closes: #554612
 * Remove extra quoting that should not be there while passing an exclude
   file to git ls-files during build of 3.0 (git) source package. Thanks to
   Courtney Bane for the patch. Closes: #551829
 * Optimize dpkg-source -b by avoiding many diff calls when not required.
   Thanks to Mike Hommey for the idea. Closes: #554689
 * Add new option --print-format to dpkg-source to be able to know by
   advance the source format that would be used during a build.
 * Modify dpkg-source -b to use default build options from
   debian/source/options. Thus it's now possible to have sticky options,
   for example for the choice of a compression method
   (--compression=<comp>).
 * dpkg-source outputs the list of upstream files modified by the diff.gz
   (applies only to source packages using format 1.0). Closes: #482166
   It also recommends usage of 3.0 (quilt) format during dpkg-source -b if
   it detects changes to upstream files that are stored in the .diff.gz.
 * Add DEP-3 compliant headers to automatic patches created by dpkg-source
   in 3.0 (quilt) source format. Closes: #543581
 * Switch dpkg to source format "3.0 (native)" with bzip2 compression.
 .
 [ Updated dpkg translations ]
 * Czech (Miroslav Kure).
 * French (Christian Perrier).
 * German (Sven Joachim).
 * Italian (Milo Casagrande). Closes: #548615, #555806
 * Polish (Wiktor Wandachowicz). Closes: #548541
 * Swedish (Peter Krefting).
 .
 [ Updated dselect translations ]
 * Czech (Miroslav Kure).
 * French (Christian Perrier).
 * German (Sven Joachim).
 * Polish (Wiktor Wandachowicz). Closes: #548541
 * Swedish (Peter Krefting).
 .
 [ Updated man page translations ]
 * French (Christian Perrier).
 * French translation error fixed (Christian Perrier)
   Thanks to Pietro Battiston for spotting it. Closes: #545446
 * German (Helge Kreutzmann).
 * Polish (Wiktor Wandachowicz). Closes: #548541
 * Swedish (Peter Krefting).
 .
 [ Updated scripts translations ]
 * German (Helge Kreutzmann).
 * Polish (Wiktor Wandachowicz). Closes: #548541
 * Swedish (Peter Krefting).
Checksums-Sha1:
 8a4d00e01d47b47fb22feebfecd6ebb45d33e4a0 1185 dpkg_1.15.5.dsc
 f80768b09bad7c79d57ea247fa30d9e2cf8c8418 4656922 dpkg_1.15.5.tar.bz2
 4185bb1f2cefa0f04406c20ce68eccb971e0a828 2218876 dpkg_1.15.5_amd64.deb
 18c93cc9f6ad5e59e86b3dea27f23478792f7b77 770950 dselect_1.15.5_amd64.deb
 5a34568c79d6030aa85608821d2c9dd2905d878a 753660 dpkg-dev_1.15.5_all.deb
Checksums-Sha256:
 a79a51695a4d270998e8f4c1db3f217c3526bf089d8c6956a704a52f521f6a6c 1185 dpkg_1.15.5.dsc
 58936b5ce1e0155ecbb67845f7e4de24a1bebcd8bedd714020fc4bc13e726c43 4656922 dpkg_1.15.5.tar.bz2
 0e6d9d71df3ba84eb991e6209925f165237bdf961cd10007d9701f26a7e42cb4 2218876 dpkg_1.15.5_amd64.deb
 ed22ab5726956f16c36fa99597b3e2466cec76a035dc59121980c76d70898181 770950 dselect_1.15.5_amd64.deb
 b766320fbfa14ae0282ecda0fe5fd46ed180098559825afea741a5c507ae83d4 753660 dpkg-dev_1.15.5_all.deb
Files:
 b571ce1bb1754a0eecc942edf60808a3 1185 admin required dpkg_1.15.5.dsc
 4ae70d3ffdddf2d1eaf685a96b95396f 4656922 admin required dpkg_1.15.5.tar.bz2
 ef44e1582d655b71d39f8ce707df41c0 2218876 admin required dpkg_1.15.5_amd64.deb
 77a68e68c7ae8e0f65c867befed29a26 770950 admin optional dselect_1.15.5_amd64.deb
 a57832d125bb04f219e0337ab2ef20f6 753660 utils optional dpkg-dev_1.15.5_all.deb

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)

iEYEARECAAYFAksCbOYACgkQuW9ciZ2SjJv6lwCcCc+3GAFly26wrApq1ZhaOuaQ
RPYAoPhsQIpDdTn5vnOewPrCBj7SfgpQ
=fZaU
-----END PGP SIGNATURE-----