request: speedup mercurial

request: speedup mercurial

by Christoph Egger

Hi!

I have a 700 MB Mercurial repository and I am using
Mercurial 1.3.1.

Mercurial is very slow on it. Every command that touches
the working directory takes many seconds to complete.
The number of files actually modified doesn't seem to
affect the overall speed.


A switch to a branch:

hg --time --profile update driver_acpidisplay
6 files updated, 0 files merged, 2 files removed, 0 files unresolved
   CallCount    Recursive    Total(ms)   Inline(ms) module:lineno(function)
           1            0     10.6863     10.3674   mercurial.dirstate:411(walk)
     +121957            0      0.0421      0.0421   +mercurial.dirstate:460(<lambda>)
     +101912            0      0.0394      0.0394   +mercurial.match:73(<lambda>)
      +20056            0      0.1811      0.0340   +mercurial.match:78(__call__)
      +10023            0      0.0185      0.0185   +mercurial.dirstate:117(_join)
      +10022            0      0.0140      0.0140   +<method 'pop' of 'list' objects>
           1            0      2.6645      0.8087   mercurial.merge:120(manifestmerge)
     +101897            0      0.7982      0.3399   +mercurial.merge:128(fmerge)
     +101897            0      0.1609      0.1251   +mercurial.manifest:18(flags)
     +101897            0      0.0439      0.0439   +<method 'get' of 'dict' objects>
          +8            0      0.0003      0.0002   +mercurial.merge:145(act)
          +3            0      0.0001      0.0000   +mercurial.i18n:25(gettext)
           1            0     11.4156      0.5063   mercurial.dirstate:554(status)
          +1            0     10.6863     10.3674   +mercurial.dirstate:411(walk)
          +1            0      0.2231      0.0000   +mercurial.util:149(__get__)
          +1            0      0.0000      0.0000   +<method 'iteritems' of 'dict' objects>
      407595            0      0.6193      0.4766   mercurial.manifest:18(flags)
     +407595            0      0.1427      0.1427   +<method 'get' of 'dict' objects>
           1            0      1.0023      0.4396   mercurial.dirstate:371(write)
     +101900            0      0.2022      0.2022   +struct:54(pack)
     +203801            0      0.1396      0.1396   +<method 'write' of 'cStringIO.StringO' objects>
          +1            0      0.1378      0.1378   +<method 'write' of 'file' objects>
     +101900            0      0.0312      0.0312   +<len>
          +1            0      0.0149      0.0149   +<method 'getvalue' of 'cStringIO.StringO' objects>
          31            0      0.4127      0.4127   <zlib.decompress>
          56            0      0.3802      0.3802   <method 'update' of '_hashlib.HASH' objects>
           2            0      0.3706      0.3706   <mercurial.parsers.parse_manifest>
      101897            0      0.7982      0.3399   mercurial.merge:128(fmerge)
     +305691            0      0.4584      0.3514   +mercurial.manifest:18(flags)
         170            0      0.2403      0.2403   <method 'split' of 'str' objects>
Time: real 16.870 secs (user 6.870+0.000 sys 8.110+0.000)
_______________________________________________
Mercurial mailing list
Mercurial@...
http://selenic.com/mailman/listinfo/mercurial

Re: request: speedup mercurial

by Christoph Egger

Christoph Egger wrote:

> Hi!
>
> I have a 700MB large mercurial repository and I am using
> mercurial 1.3.1.
>
> mercurial is very slow on this. Every command touching
> the working directory takes many seconds to complete.
> The number of actually modified files doesn't seem to have
> an impact on the overall mercurial speed.
>
>
> A switch to a branch:
>
> hg --time --profile update driver_acpidisplay
> 6 files updated, 0 files merged, 2 files removed, 0 files unresolved
> ...
> Time: real 16.870 secs (user 6.870+0.000 sys 8.110+0.000)
>


Here is the profiling and timing output of a commit of two files:

   CallCount    Recursive    Total(ms)   Inline(ms) module:lineno(function)
      101934            0     12.0914     12.0914   <posix.lstat>
           1            0      7.2575      6.2756   mercurial.store:245(_load)
     +102148            0      0.8374      0.4094   +mercurial.store:24(decodedir)
     +102148            0      0.1020      0.1020   +<method 'add' of 'set' objects>
     +102148            0      0.0420      0.0420   +<len>
          +1            0      0.0004      0.0003   +mercurial.util:856(__call__)
          +1            0      0.0000      0.0000   +<method 'close' of 'file' objects>
           1            0     14.9042      0.7933   mercurial.dirstate:554(status)
          +1            0     13.7192      0.5139   +mercurial.dirstate:411(walk)
          +1            0      0.3917      0.0000   +mercurial.util:149(__get__)
          +1            0      0.0000      0.0000   +<method 'iteritems' of 'dict' objects>
          +2            0      0.0000      0.0000   +<method 'append' of 'list' objects>
          15            0      0.6774      0.6774   <method 'read' of 'file' objects>
           1            0     13.7192      0.5139   mercurial.dirstate:411(walk)
          +1            0     12.7891      0.4947   +<zip>
          +2            0      0.2321      0.2321   +<sorted>
     +101901            0      0.0956      0.0956   +mercurial.dirstate:117(_join)
     +101901            0      0.0451      0.0451   +stat:29(S_IFMT)
     +101900            0      0.0316      0.0316   +mercurial.match:73(<lambda>)
           1            0     12.7891      0.4947   <zip>
     +101900            0     12.2944      0.2046   +mercurial.posix:172(statfiles)
           7            0      0.4906      0.4906   <sorted>
           1            0      1.3772      0.4315   mercurial.manifest:106(add)
          +1            0      0.2585      0.2585   +<sorted>
     +101900            0      0.1667      0.1269   +mercurial.manifest:18(flags)
     +101900            0      0.0927      0.0927   +<binascii.hexlify>
          +1            0      0.0459      0.0459   +mercurial.manifest:125(checkforbidden)
          +1            0      0.0296      0.0296   +<method 'join' of 'str' objects>
           1            0      0.9280      0.4196   mercurial.dirstate:371(write)
     +101900            0      0.1912      0.1912   +struct:54(pack)
     +203801            0      0.1368      0.1368   +<method 'write' of 'cStringIO.StringO' objects>
          +1            0      0.0958      0.0958   +<method 'write' of 'file' objects>
     +101900            0      0.0302      0.0302   +<len>
          +1            0      0.0185      0.0185   +<method 'getvalue' of 'cStringIO.StringO' objects>
      102148            0      0.8374      0.4094   mercurial.store:24(decodedir)
     +306444            0      0.3287      0.3287   +<method 'replace' of 'str' objects>
     +102148            0      0.0993      0.0993   +<method 'startswith' of 'str' objects>
Time: real 27.070 secs (user 7.850+0.000 sys 12.140+0.000)


Christoph

Re: request: speedup mercurial

by Matt Mackall

On Sat, 2009-11-07 at 10:35 +0100, Christoph Egger wrote:

> Christoph Egger wrote:
> > Hi!
> >
> > I have a 700MB large mercurial repository and I am using
> > mercurial 1.3.1.
> >
> > mercurial is very slow on this. Every command touching
> > the working directory takes many seconds to complete.
> > The number of actually modified files doesn't seem to have
> > an impact on the overall mercurial speed.
> >
> >
> > A switch to a branch:
> >
> > hg --time --profile update driver_acpidisplay
> > 6 files updated, 0 files merged, 2 files removed, 0 files unresolved
> >    CallCount    Recursive    Total(ms)   Inline(ms) module:lineno(function)
> >            1            0     10.6863     10.3674   mercurial.dirstate:411(walk)
...
> > Time: real 16.870 secs (user 6.870+0.000 sys 8.110+0.000)
> >

Almost all of that time is spent in the operating system checking the
modification time of files in the working directory, which is a heavily
optimized path already. Here I've got a repo with over 40000 files in
the working directory and checking status takes .86s [1]. I've got
another with 95000 on a remote machine where it takes 2.0s.

How many files are in your working directory? (hg st -marduic | wc)
How many of those files are tracked by hg? (hg st -mardc | wc)
What operating system, file system, CPU, and storage hardware are you
using?
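
To see how much of the delay comes from stat calls alone, independent of Mercurial, something like the following Python sketch could be run against the working directory. It is an illustration only, not Mercurial code: `time_lstat` is a hypothetical helper name, and skipping `.hg` is an assumption about what should count as the working directory.

```python
import os
import time


def time_lstat(root):
    """Walk `root`, lstat every file found, and return (file_count, seconds)."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Assumption: Mercurial's own metadata directory is not of interest.
        if ".hg" in dirnames:
            dirnames.remove(".hg")
        paths.extend(os.path.join(dirpath, name) for name in filenames)

    # Time only the lstat calls, not the directory walk that collected paths.
    start = time.perf_counter()
    for path in paths:
        os.lstat(path)
    elapsed = time.perf_counter() - start
    return len(paths), elapsed
```

Calling `time_lstat(".")` from the repository root gives a file count and a rough figure for raw lstat cost. A second run right after the first will usually be faster because the OS has cached the metadata, so both numbers are worth noting.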

> Here the profiling and timing output of a commit of two files:
>
>    CallCount    Recursive    Total(ms)   Inline(ms) module:lineno(function)
>       101934            0     12.0914     12.0914   <posix.lstat>
>            1            0      7.2575      6.2756   mercurial.store:245(_load)
> Time: real 27.070 secs (user 7.850+0.000 sys 12.140+0.000)

Again, most of the time is spent in the operating system checking file
status, with another 6 seconds spent simply reading the file that lists
the names in your repo.
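
As a rough cross-check of that reading, the two quoted entries can be compared with the wall-clock time. The sketch below assumes the Inline column is effectively in seconds despite the "(ms)" header, which is how the "6 seconds" figure above reads it; `share_of_real` is a hypothetical helper name.

```python
REAL_SECONDS = 27.070  # "Time: real" from the commit run

# Inline column for the two entries quoted above (assumed to be seconds).
INLINE_SECONDS = {
    "<posix.lstat>": 12.0914,
    "mercurial.store:245(_load)": 6.2756,
}


def share_of_real(inline_seconds, real_seconds=REAL_SECONDS):
    """Return each entry's inline time as a percentage of wall-clock time."""
    return {
        name: 100.0 * secs / real_seconds
        for name, secs in inline_seconds.items()
    }


shares = share_of_real(INLINE_SECONDS)
for name, percent in shares.items():
    print("%-28s %5.1f%%" % (name, percent))
```

Under that assumption, lstat accounts for roughly 45% of the elapsed time and loading the store's file list for roughly 23%.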

--
http://selenic.com : development and support for Mercurial and Linux


Re: request: speedup mercurial

by Christoph Egger

Matt Mackall wrote:

> On Sat, 2009-11-07 at 10:35 +0100, Christoph Egger wrote:
>> Christoph Egger wrote:
>>> Hi!
>>>
>>> I have a 700MB large mercurial repository and I am using
>>> mercurial 1.3.1.
>>>
>>> mercurial is very slow on this. Every command touching
>>> the working directory takes many seconds to complete.
>>> The number of actually modified files doesn't seem to have
>>> an impact on the overall mercurial speed.
>>>
>>>
>>> A switch to a branch:
>>>
>>> hg --time --profile update driver_acpidisplay
>>> 6 files updated, 0 files merged, 2 files removed, 0 files unresolved
>>>    CallCount    Recursive    Total(ms)   Inline(ms) module:lineno(function)
>>>            1            0     10.6863     10.3674   mercurial.dirstate:411(walk)
> ...
>>> Time: real 16.870 secs (user 6.870+0.000 sys 8.110+0.000)
>>>
>
> Almost all of that time is spent in the operating system checking the
> modification time of files in the working directory, which is a heavily
> optimized path already. Here I've got a repo with over 40000 files in
> the working directory and checking status takes .86s [1]. I've got
> another with 95000 on a remote machine where it takes 2.0s.
>
> How many files are in your working directory? (hg st -marduic | wc)

  301670  603340 14315588

> How many of those files are tracked by hg? (hg st -mardc | wc)

  101900  203800 4669033

> What operating system,

MacOSX 10.5

> file system,

hfs+

> CPU,

Powermac G5 2,3GHz Dual core

> and storage hardware are you using?

SATA-I disk.


>
>> Here the profiling and timing output of a commit of two files:
>>
>>    CallCount    Recursive    Total(ms)   Inline(ms) module:lineno(function)
>>       101934            0     12.0914     12.0914   <posix.lstat>
>>            1            0      7.2575      6.2756   mercurial.store:245(_load)
>> Time: real 27.070 secs (user 7.850+0.000 sys 12.140+0.000)
>
> Again, most of the time spent in the operating system checking file
> status, with another 6 seconds simply reading a file with the list of
> the names in your repo.
>

Re: request: speedup mercurial

by chadrik

Is there a wiki page on best practices to maintain optimal repo  
performance? I would be very interested in this.

-chad

Sent from my iPhone


Re: request: speedup mercurial

by Matt Mackall

On Sat, 2009-11-07 at 17:58 +0100, Christoph Egger wrote:

> Matt Mackall wrote:
> > On Sat, 2009-11-07 at 10:35 +0100, Christoph Egger wrote:
> >> Christoph Egger wrote:
> >>> Hi!
> >>>
> >>> I have a 700MB large mercurial repository and I am using
> >>> mercurial 1.3.1.
> >>>
> >>> mercurial is very slow on this. Every command touching
> >>> the working directory takes many seconds to complete.
> >>> The number of actually modified files doesn't seem to have
> >>> an impact on the overall mercurial speed.
> >>>
> >>>
> >>> A switch to a branch:
> >>>
> >>> hg --time --profile update driver_acpidisplay
> >>> 6 files updated, 0 files merged, 2 files removed, 0 files unresolved
> >>>    CallCount    Recursive    Total(ms)   Inline(ms) module:lineno(function)
> >>>            1            0     10.6863     10.3674   mercurial.dirstate:411(walk)
> > ...
> >>> Time: real 16.870 secs (user 6.870+0.000 sys 8.110+0.000)
> >>>
> >
> > Almost all of that time is spent in the operating system checking the
> > modification time of files in the working directory, which is a heavily
> > optimized path already. Here I've got a repo with over 40000 files in
> > the working directory and checking status takes .86s [1]. I've got
> > another with 95000 on a remote machine where it takes 2.0s.
> >
> > How many files are in your working directory? (hg st -marduic | wc)
>
>   301670  603340 14315588
>
> > How many of those files are tracked by hg? (hg st -mardc | wc)
>
>   101900  203800 4669033
>
> > What operating system,
>
> MacOSX 10.5
>
> > file system,
>
> hfs+
>
> > CPU,
>
> Powermac G5 2,3GHz Dual core
>
> > and storage hardware are you using?
>
> SATA-I disk.

Hmm. Looks like you've got a bunch of things working against you there:
lots of files, older machine, slow OS/FS. MacOS routinely gets poor
results in benchmarks like our test suite, apparently largely due to
slow stat() performance on HFS+. I don't know if anyone's dug very
deeply into this though.

Checking all file timestamps/sizes for modifications is an unavoidable
consequence of Mercurial's "no explicit file check-out" approach, but
for most projects the convenience is well worth the performance hit.
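
The kind of check described here can be sketched in a few lines of Python. This is a simplified illustration, not Mercurial's dirstate code: `snapshot` and `maybe_modified` are hypothetical names, and the sketch ignores cases a real tool must handle, such as a same-size edit within the timestamp granularity.

```python
import os


def snapshot(paths):
    """Record (size, mtime) for each path, like a very small dirstate."""
    state = {}
    for path in paths:
        st = os.lstat(path)
        state[path] = (st.st_size, int(st.st_mtime))
    return state


def maybe_modified(state):
    """Return the sorted paths whose size or mtime changed, or which are gone.

    Every tracked path costs one lstat call, whether or not it changed,
    which is why the cost follows the number of tracked files.
    """
    changed = []
    for path, (size, mtime) in state.items():
        try:
            st = os.lstat(path)
        except FileNotFoundError:
            changed.append(path)
            continue
        if (st.st_size, int(st.st_mtime)) != (size, mtime):
            changed.append(path)
    return sorted(changed)
```

With about 100,000 tracked files, `maybe_modified` issues about 100,000 lstat calls even when only two files were edited, which matches the observation that the number of modified files has little effect on the timings.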

One possible way forward would be to implement something like hg's
Linux-only inotify for MacOS, which starts a daemon to watch for file
status changes. On the 95k file repo mentioned above, time for the
status command drops from 2.0s to 0.050s. A 40x improvement would
probably make things much more comfortable.
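
The idea behind such a daemon can be shown with a toy model. The class below is purely hypothetical and is not how the inotify extension is implemented; it only illustrates why answering from a set of reported changes scales with the number of changes rather than with the size of the repository. In a real setup `on_event` would be driven by the operating system's file notification facility.

```python
class StatusCache:
    """Toy stand-in for the in-memory state of a file-watching daemon."""

    def __init__(self, tracked):
        self.tracked = set(tracked)
        self.dirty = set()

    def on_event(self, path):
        # Hypothetical hook: called when the OS reports a change to `path`.
        if path in self.tracked:
            self.dirty.add(path)

    def mark_clean(self, path):
        # Called once a change has been committed or reverted.
        self.dirty.discard(path)

    def status(self):
        # Answered from memory, with no stat call per tracked file.
        return sorted(self.dirty)
```

For example, after `cache = StatusCache(["a.c", "b.c"])` and `cache.on_event("a.c")`, `cache.status()` returns `["a.c"]` without touching the disk.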

--
http://selenic.com : development and support for Mercurial and Linux


Re: request: speedup mercurial

by Martin Geisler-4

Chad Dombrova <chadrik@...> writes:

> Is there a wiki page on best practices to maintain optimal repo
> performance? I would be very interested in this.

Well, Mercurial is designed to run fast by default :-) I don't think
there are any "performance tricks" available except for using inotify if
you're on Linux. The inotify extension used to be buggy, but it was
cleaned up in this year's Google Summer of Code by Nicolas.

--
Martin Geisler

VIFF (Virtual Ideal Functionality Framework) brings easy and efficient
SMPC (Secure Multiparty Computation) to Python. See: http://viff.dk/.


Re: request: speedup mercurial

by chadrik

> Chad Dombrova <chadrik@...> writes:
>
>> Is there a wiki page on best practices to maintain optimal repo
>> performance? I would be very interested in this.
>
> Well, Mercurial is designed to run fast by default :-)

Certainly there are some pitfalls to avoid. For example, Matt
mentioned HFS+ as a poor performer. We also know that large binary
files are best avoided because of delta compression and memory
consumption. I'm just curious whether there are other issues I should
take into consideration.

> I don't think
> there are any "performance tricks" available except for using  
> inotify if
> you're on Linux. The inotify extension used to be buggy, but it was
> cleaned up in this years Google Summer of Code by Nicolas.

I've read that there is an inotify alpha for OS X Leopard. Has anyone
had any luck with it?


Re: request: speedup mercurial

by Dirkjan Ochtman

On Sat, Nov 7, 2009 at 18:31, Matt Mackall <mpm@...> wrote:
> One possible way forward would be to implement something like hg's
> Linux-only inotify for MacOS, which starts a daemon to watch for file
> status changes. On the 95k file repo mentioned above, time for the
> status command drops from 2.0s to 0.050s. A 40x improvement would
> probably make things much more comfortable.

I'm pretty sure Nicolas implemented at least part of that for the
GSoC. Maybe have a look at bitbucket.org/nicdumz?

Cheers,

Dirkjan

Re: request: speedup mercurial

by Martin Geisler-4

Chad Dombrova <chadrik@...> writes:

>> Chad Dombrova <chadrik@...> writes:
>>
>>> Is there a wiki page on best practices to maintain optimal repo
>>> performance? I would be very interested in this.
>>
>> Well, Mercurial is designed to run fast by default :-)
>
> Certainly there are some pitfalls to avoid. For example, matt
> mentioned hfs+ as a poor performer. Also we know that large binary
> files are best to avoid due to delta compression and memory
> consumption.  I'm just curious if there are other issues I should take
> into consideration.

Right, sorry about the lame answer. I was not thinking of those kinds of
issues, since they are more or less external to Mercurial. I was
thinking more of options you could set to turn on a "turbo mode" :-)

--
Martin Geisler

VIFF (Virtual Ideal Functionality Framework) brings easy and efficient
SMPC (Secure Multiparty Computation) to Python. See: http://viff.dk/.


Re: request: speedup mercurial

by James Walker-12

Matt Mackall wrote:

> Hmm. Looks like you've got a bunch of things working against you there:
> lots of files, older machine, slow OS/FS. MacOS routinely gets poor
> results in benchmarks like our test suite, apparently largely due to
> slow stat() performance on HFS+. I don't know if anyone's dug very
> deeply into this though.
>
> Checking all file timestamps/sizes for modifications is an unavoidable
> consequence of Mercurial's "no explicit file check-out" approach, but
> for most projects the convenience is well worth the performance hit.

I expect you prefer to stick with cross-platform code.  But in Mac
native code, to get times and sizes for all files in a directory, I'd
use FSGetCatalogInfoBulk.

Re: request: speedup mercurial

by Matt Mackall

On Sat, 2009-11-07 at 15:09 -0800, James Walker wrote:

> Matt Mackall wrote:
>
> > Hmm. Looks like you've got a bunch of things working against you there:
> > lots of files, older machine, slow OS/FS. MacOS routinely gets poor
> > results in benchmarks like our test suite, apparently largely due to
> > slow stat() performance on HFS+. I don't know if anyone's dug very
> > deeply into this though.
> >
> > Checking all file timestamps/sizes for modifications is an unavoidable
> > consequence of Mercurial's "no explicit file check-out" approach, but
> > for most projects the convenience is well worth the performance hit.
>
> I expect you prefer to stick with cross-platform code.  But in Mac
> native code, to get times and sizes for all files in a directory, I'd
> use FSGetCatalogInfoBulk.

I'd be surprised if it made a significant difference, but I wouldn't
mind seeing benchmarks.

--
http://selenic.com : development and support for Mercurial and Linux


Re: request: speedup mercurial

by James Walker-12

Matt Mackall wrote:

> On Sat, 2009-11-07 at 15:09 -0800, James Walker wrote:
>> Matt Mackall wrote:
>>
>>> Hmm. Looks like you've got a bunch of things working against you there:
>>> lots of files, older machine, slow OS/FS. MacOS routinely gets poor
>>> results in benchmarks like our test suite, apparently largely due to
>>> slow stat() performance on HFS+. I don't know if anyone's dug very
>>> deeply into this though.
>>>
>>> Checking all file timestamps/sizes for modifications is an unavoidable
>>> consequence of Mercurial's "no explicit file check-out" approach, but
>>> for most projects the convenience is well worth the performance hit.
>> I expect you prefer to stick with cross-platform code.  But in Mac
>> native code, to get times and sizes for all files in a directory, I'd
>> use FSGetCatalogInfoBulk.
>
> I'd be surprised if it made a significant difference, but I wouldn't
> mind seeing benchmarks.

Looks like I was wrong.  I wrote test programs that add up the sizes of
the files in a big directory in two ways: one uses scandir and stat, the
other uses FSGetCatalogInfoBulk.  There was a fair amount of
variation, but the scandir/stat method was faster on average.
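
For readers who want to repeat the portable half of this experiment, a Python version of the "add up the sizes" test might look like the sketch below. It is not the program used above: Python's `os.scandir` is a different interface from the C `scandir` function, `total_size` is a hypothetical name, and the FSGetCatalogInfoBulk variant is not reproduced.

```python
import os


def total_size(directory):
    """Sum the sizes of the regular files directly inside `directory`."""
    total = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            # Do not follow symlinks, so each entry is counted as itself.
            if entry.is_file(follow_symlinks=False):
                total += entry.stat(follow_symlinks=False).st_size
    return total
```

Wrapping repeated calls to `total_size` in a timer gives numbers that can be compared across file systems, keeping in mind the run-to-run variation mentioned above.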

Re: request: speedup mercurial

by Nicolas Dumazet

2009/11/8 Dirkjan Ochtman <dirkjan@...>:
> On Sat, Nov 7, 2009 at 18:31, Matt Mackall <mpm@...> wrote:
>> One possible way forward would be to implement something like hg's
>> Linux-only inotify for MacOS, which starts a daemon to watch for file
>> status changes. On the 95k file repo mentioned above, time for the
>> status command drops from 2.0s to 0.050s. A 40x improvement would
>> probably make things much more comfortable.
>
> I'm pretty sure Nicholas implemented at least part of that for the
> GSoC. Maybe have a look at bitbucket.org/nicdumz?

I somehow managed to miss this whole thread until now. I need to
improve my mail client configuration :(

Yes, support has been implemented as part of my GSoC. I'm pretty sure
it still has bugs, given the nature of the problem, but there is a
working inotify implementation in the wild, yes.

Anyway, I will try to spend some time in the coming days updating
the patches sitting in my MQ, particularly the one that reorganizes
the inotify extension to allow different implementations depending on
the OS ( http://bitbucket.org/nicdumz/mercurial-crew-mq/src/tip/separate ).
It's the last step necessary before actually including OS X-related
code in inotify.

--
Nicolas Dumazet — NicDumZ
