|
View:
New views
4 Messages
—
Rating Filter:
Alert me
|
|
|
Bug#554689: /usr/bin/dpkg-source: is slowPackage: dpkg-dev
Version: 1.15.4.1 Severity: normal File: /usr/bin/dpkg-source Tags: patch As seen in (but unrelated to) bug #554612, running dpkg-source -b is slow on big trees, where the main contender is Dpkg::Source::Patch::add_diff_file. %Time Sec. #calls sec/call F name 32.52 69.8110 44702 0.001562 Dpkg::Source::Patch::add_diff_file 29.95 64.2837 44829 0.001434 Dpkg::IPC::fork_and_exec 10.22 21.9370 44829 0.000489 Dpkg::IPC::wait_child 7.14 15.3309 97591 0.000157 File::Spec::Unix::abs2rel 4.25 9.1206 585681 0.000016 File::Spec::Unix::canonpath Here, the main problem is obviously forking 44829 times diff -u, while the vast majority of files in the orig tarball haven't been touched (which is mostly true on all packages). The attached patch (which may have style and correctness issue) implements a very simple check in perl (so, without a fork) to see if files differ before running diff. The result is stunning: >From 3 minutes and 30 seconds on iceape in format 3.0 (quilt), dpkg-source -b goes down to 35 seconds (where 15 are spent bunzipping). This is where the time is spent, now: %Time Sec. #calls sec/call F name 24.41 14.1649 128 0.110663 Dpkg::IPC::wait_child 19.46 11.2948 97594 0.000116 File::Spec::Unix::abs2rel 13.77 7.9901 44703 0.000179 Dpkg::Source::Patch::add_diff_file 9.37 5.4382 585699 0.000009 File::Spec::Unix::canonpath It looks much better. My gut feeling is that it should improve run time on most if not all packages. (and more importantly, will help big packages maintainers) I'm pretty sure reading by blocks of a multiple of 4k instead of reading line by line could be faster, too. -- System Information: Debian Release: squeeze/sid APT prefers unstable APT policy: (500, 'unstable'), (1, 'experimental') Architecture: amd64 (x86_64) Kernel: Linux 2.6.31-1-amd64 (SMP w/2 CPU cores) Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8) Shell: /bin/sh linked to /bin/bash Versions of packages dpkg-dev depends on: ii binutils 2.20-2 The GNU assembler, linker and bina ii bzip2 1.0.5-3 high-quality block-sorting file co ii dpkg 1.15.4.1 Debian package management system ii libtimedate-perl 1.1900-1 Time and date functions for Perl ii lzma 4.43-14 Compression method of 7z format in ii make 3.81-7 An utility for Directing compilati ii patch 2.5.9-5 Apply a diff file to an original ii perl [perl5] 5.10.1-6 Larry Wall's Practical Extraction ii perl-modules 5.10.1-6 Core Perl modules Versions of packages dpkg-dev recommends: ii build-essential 11.4 Informational list of build-essent ii fakeroot 1.14.3 Gives a fake root environment ii gcc [c-compiler] 4:4.3.3-9+nmu1 The GNU C compiler ii gcc-4.1 [c-compiler] 4.1.2-27 The GNU C compiler ii gcc-4.2 [c-compiler] 4.2.4-6 The GNU C compiler ii gcc-4.3 [c-compiler] 4.3.4-6 The GNU C compiler ii gcc-4.4 [c-compiler] 4.4.2-2 The GNU C compiler ii gnupg 1.4.10-2 GNU privacy guard - a free PGP rep ii gpgv 1.4.10-2 GNU privacy guard - signature veri Versions of packages dpkg-dev suggests: ii debian-keyring [debian-mainta 2009.08.27 GnuPG (and obsolete PGP) keys of D -- no debconf information -- debsums errors found: debsums: changed file /usr/share/perl5/Dpkg/Source/Patch.pm (from dpkg-dev package) debsums: changed file /usr/share/perl5/Dpkg/Source/Package/V2.pm (from dpkg-dev package) --- /usr/share/perl5/Dpkg/Source/Patch.pm +++ /usr/share/perl5/Dpkg/Source/Patch.pm @@ -58,6 +58,19 @@ sub add_diff_file { my ($self, $old, $new, %opts) = @_; + open(OLD, "<", $old); + open(NEW, "<", $new); + my $match = 1; + while (<OLD>) { + if ($_ ne <NEW>) { + $match = 0; + last; + } + } + close OLD; + close NEW; + return 1 if ($match); + $opts{"include_timestamp"} = 0 unless exists $opts{"include_timestamp"}; my $handle_binary = $opts{"handle_binary_func"} || sub { my ($self, $old, $new) = @_; |
|
|
Bug#554689: /usr/bin/dpkg-source: is slowOn Fri, 06 Nov 2009, Mike Hommey wrote:
> The attached patch (which may have style and correctness issue) implements > a very simple check in perl (so, without a fork) to see if files differ > before running diff. The result is stunning: Thanks for the good idea. The change I made is the following: --- a/scripts/Dpkg/Source/Patch.pm +++ b/scripts/Dpkg/Source/Patch.pm @@ -32,6 +32,7 @@ use File::Find; use File::Basename; use File::Spec; use File::Path; +use File::Compare; use Fcntl ':mode'; #XXX: Needed for sub-second timestamps, require recent perl #use Time::HiRes qw(stat); @@ -63,6 +64,8 @@ sub add_diff_file { my ($self, $old, $new) = @_; $self->_fail_with_msg($new, _g("binary file contents changed")); }; + # Optimization to avoid forking diff if unnecessary + return 1 if compare($old, $new, 4096) == 0; # Default diff options my @options; if ($opts{"options"}) { Building iceape, before: real 1m11.078s user 0m22.501s sys 0m36.946s After: real 0m14.949s user 0m9.377s sys 0m2.948s Cheers, -- Raphaël Hertzog -- To UNSUBSCRIBE, email to debian-bugs-dist-REQUEST@... with a subject of "unsubscribe". Trouble? Contact listmaster@... |
|
|
Bug#554689: /usr/bin/dpkg-source: is slowOn Fri, Nov 06, 2009 at 08:45:57PM +0100, Raphael Hertzog wrote:
> Building iceape, before: > real 1m11.078s > user 0m22.501s > sys 0m36.946s > > After: > real 0m14.949s > user 0m9.377s > sys 0m2.948s We can go even further, with another optimization. As appeared in the profiles, abs2rel and canonpath take a lot of time. It look like abs2rel is bloated for what it is used for. Replacing: my $fn = File::Spec->abs2rel($_, $new); with my $fn = length($_) > length($new) + 1 ? substr($_, length($new) + 1) : '.'; and my $fn = File::Spec->abs2rel($_, $old); with my $fn = length($_) > length($old) + 1 ? substr($_, length($old) + 1) : '.'; in add_diff_directory, make another 25% improvement. I'm now down to 25s on iceape-2.0-1 in 3.0 (quilt) format with a bzip2 tarball, which is much closer to bunzip2 (15s) + diff (2s), and down to 7s on the iceape-1.1.17-2 source (where it took 11s before, compared to your 15s). The canonpath calls disappear from the profile once the abs2rel calls are modified as stated above. Cheers, Mike -- To UNSUBSCRIBE, email to debian-bugs-dist-REQUEST@... with a subject of "unsubscribe". Trouble? Contact listmaster@... |
|
|
Bug#554689: /usr/bin/dpkg-source: is slowOn Sat, 07 Nov 2009, Mike Hommey wrote:
> We can go even further, with another optimization. > > As appeared in the profiles, abs2rel and canonpath take a lot of time. > It look like abs2rel is bloated for what it is used for. I applied the patch as well. Thanks! Cheers, -- Raphaël Hertzog -- To UNSUBSCRIBE, email to debian-bugs-dist-REQUEST@... with a subject of "unsubscribe". Trouble? Contact listmaster@... |
| Free embeddable forum powered by Nabble | Forum Help |