pcvs2git.pike

pcvs2git.pike

by Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum

I just tried converting Roxen 2.4:

Stage      Memory use   Commits
---------------------------------
Import     339 MB       ~17000
Raking     359 MB       ~17000
Verify     361 MB       ~17000
Merging    662 MB       11886
Graphing   803 MB       10897
Generate   803 MB       10897

I believe that the memory use can be reduced by using more custom
datatypes (currently there are a lot of mappings generated in the
merging and graphing stages). Another way to reduce the memory use
is to partition the graphs in the time axis.

pcvs2git.pike

by Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum

>Graphing 803 MB 10897

I just tried switching from mappings to bitmasks implemented with
bignums, and got:

Graphing 130 MB 10897

Which I think is acceptable.
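
To illustrate the idea of replacing mappings with bignum bitmasks, here
is a minimal Python sketch (not the Pike code from pcvs2git.pike). It
assumes each commit has been given a small integer index, so that a set
of commits can be stored as one arbitrary-precision integer with bit i
set when commit i is a member. The helper names are hypothetical.

```python
# Sketch: a set of commit indices stored as the bits of one big integer,
# instead of a mapping (dict) holding one entry per member.

def add(mask: int, i: int) -> int:
    """Return mask with commit index i added to the set."""
    return mask | (1 << i)

def contains(mask: int, i: int) -> bool:
    """True if commit index i is in the set."""
    return (mask >> i) & 1 == 1

def members(mask: int) -> list[int]:
    """List the commit indices present in the set, in ascending order."""
    out = []
    i = 0
    while mask:
        if mask & 1:
            out.append(i)
        mask >>= 1
        i += 1
    return out

# Two example sets; the indices are made up for illustration.
ancestors_a = add(add(0, 3), 10000)
ancestors_b = add(add(0, 3), 7)

union = ancestors_a | ancestors_b    # set union is a single bitwise OR
common = ancestors_a & ancestors_b   # set intersection is a single AND
```

Union and intersection become single | and & operations, and a set over
N commits needs at most about N/8 bytes however many members it has.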

pcvs2git.pike

by Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum

Sounds very reasonable.

pcvs2git.pike

by Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum

>>Graphing 803 MB 10897
>
>I just tried switching from mappings to bitmasks implemented with
>bignums, and got:
>
>Graphing 130 MB 10897

I've now tested with Roxen 5.0 (~24250 revisions):

Graphing 256 MB 15331

pcvs2git.pike

by Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum

I've now rewritten the last stage of the importer so that it is ~2
times faster than before.

pcvs2git.pike

by Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum

FYI: The commit merging code in the above script was broken prior to
today, and could sometimes reintroduce old file revisions.

pcvs2git.pike

by Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum

And now I've tried it with the entirety of Pike & ulpc in one go.
Number of revisions:   61660
Number of commits:     34847
Memory before graphing: ~900 MB
Memory max:             2657 MB

I believe that the memory use can be reduced a bit further by
detecting some common cases.

pcvs2git.pike

by Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum

Hm, the number of revisions sounds a bit high; I have 36620 revisions
in the SVN repository of Pike and ulpc up to end of August.  Do you
create one revision for each file when there are commits touching
multiple files?

pcvs2git.pike

by Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum

Yes, revision = one revision in one RCS-file. Commit = a set of
revisions that were made at the same time (within 5min) belonging to
different RCS-files done by the same user and having the same log
message.
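
The definition above can be sketched in Python as follows. This is an
illustration only, not the grouping code in pcvs2git.pike: the class and
function names are hypothetical, and measuring the 5 minute window from
the first revision of a commit is an assumption, since the thread does
not say how the window is anchored.

```python
from dataclasses import dataclass, field

WINDOW = 5 * 60  # seconds; "within 5min" from the definition above

@dataclass
class Revision:
    rcs_file: str
    author: str
    log: str
    timestamp: int  # seconds since the epoch

@dataclass
class Commit:
    author: str
    log: str
    start: int                                     # time of first revision
    revisions: dict = field(default_factory=dict)  # rcs_file -> Revision

def group_revisions(revisions):
    """Group per-file revisions into commits.

    A revision joins the latest commit with the same author and log
    message if it falls within WINDOW of that commit's first revision
    (an assumption) and the commit does not already touch the same file.
    """
    commits = []
    latest = {}  # (author, log) -> most recent Commit with that key
    for rev in sorted(revisions, key=lambda r: r.timestamp):
        key = (rev.author, rev.log)
        c = latest.get(key)
        if (c is None
                or rev.timestamp - c.start > WINDOW
                or rev.rcs_file in c.revisions):
            c = Commit(rev.author, rev.log, rev.timestamp)
            commits.append(c)
            latest[key] = c
        c.revisions[rev.rcs_file] = rev
    return commits
```

With this rule, two revisions of the same RCS file never end up in one
commit, which matches the "belonging to different RCS-files" condition.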

pcvs2git.pike

by Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum

>And now I've tried it with the entirety of Pike & ulpc in one go.
>Number of revisions:   61660
>Number of commits:     34847
>Memory before graphing: ~900 MB
>Memory max:             2657 MB
>
>I believe that the memory use can be reduced a bit further by
>detecting some common cases.

With some simple garbage collection, I've now gotten it to keep
memory use steady at under 900 MB. Which has the additional benefit
of speeding up the actual committing phase (from ~1.6 commits/s to
~3.9 commits/s on my machine). Committing the entirety of Pike thus
goes from ~10.7 hours to ~4.4 hours (much better, but still quite a
bit too long though...).
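
A quick check of the arithmetic, as a Python sketch. The rates are
quoted in commits/s, but the stated durations come out when the 61660
revisions reported earlier in the thread are divided by those rates;
using the 34847 commits would give about 6.0 and 2.5 hours instead.
Which count was actually used is not stated, so treat this as an
observation rather than a correction.

```python
REVISIONS = 61660  # "Number of revisions" reported earlier in the thread

def hours(items: int, rate_per_second: float) -> float:
    """Time in hours to process `items` at `rate_per_second`."""
    return items / rate_per_second / 3600

before = hours(REVISIONS, 1.6)  # before the garbage collection change
after = hours(REVISIONS, 3.9)   # after the garbage collection change
print(f"{before:.1f} h -> {after:.1f} h")
```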

pcvs2git.pike

by Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum

>With some simple garbage collection, I've now gotten it to keep
>memory use steady at under 900 MB. Which has the additional benefit
>of speeding up the actual committing phase (from ~1.6 commits/s to
>~3.9 commits/s on my machine). Committing the entirety of Pike thus
>goes from ~10.7 hours to ~4.4 hours (much better, but still quite a
>bit too long though...).

By rewriting the script to use git fast-import instead, it's now
chugging along nicely at ~12.5 commits/s (ie ~47 minutes).
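
For readers unfamiliar with git fast-import: it reads a stream of
commands on standard input, so an importer only has to write text
records. Below is a minimal Python sketch of producing such records. It
is not taken from pcvs2git.pike; the function names, the sample
committer and the file contents are made up, file data is sent inline,
and paths are assumed not to need quoting.

```python
def data(payload: bytes) -> bytes:
    """Encode a fast-import 'data' command with an exact byte count."""
    return b"data %d\n" % len(payload) + payload + b"\n"

def commit_record(mark, ref, name, email, when, message, files,
                  parent_mark=None):
    """Build one fast-import 'commit' command as bytes.

    files maps path -> file content (bytes). `when` is seconds since
    the epoch; the timezone is fixed to +0000 for simplicity.
    """
    out = [
        b"commit %s\n" % ref.encode(),
        b"mark :%d\n" % mark,
        b"committer %s <%s> %d +0000\n"
        % (name.encode(), email.encode(), when),
        data(message.encode()),
    ]
    if parent_mark is not None:
        # Refer to the parent through the mark it was given earlier.
        out.append(b"from :%d\n" % parent_mark)
    for path, content in sorted(files.items()):
        # Regular file (mode 100644) with its content supplied inline.
        out.append(b"M 100644 inline %s\n" % path.encode())
        out.append(data(content))
    out.append(b"\n")
    return b"".join(out)

stream = (
    commit_record(1, "refs/heads/master", "A U Thor",
                  "author@example.com", 1000000000,
                  "Initial import\n", {"README": b"hello"})
    + commit_record(2, "refs/heads/master", "A U Thor",
                    "author@example.com", 1000000060,
                    "Second commit\n", {"README": b"hello again"},
                    parent_mark=1)
)
```

The resulting bytes could be piped into a single "git fast-import"
process run inside an empty repository. The stream format also accepts
an optional "author" line before the "committer" line.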

Re: pcvs2git.pike

by Stephen R. van den Berg

Henrik Grubbström (Lysator) @ Pike (-) developers forum wrote:
>>With some simple garbage collection, I've now gotten it to keep
>>memory use steady at under 900 MB. Which has the additional benefit
>>of speeding up the actual committing phase (from ~1.6 commits/s to
>>~3.9 commits/s on my machine). Committing the entirety of Pike thus
>>goes from ~10.7 hours to ~4.4 hours (much better, but still quite a
>>bit too long though...).

>By rewriting the script to use git fast-import instead, it's now
>chugging along nicely at ~12.5 commits/s (ie ~47 minutes).

Now we're getting somewhere.  Nice work!
--
Sincerely,
           Stephen R. van den Berg.
"Science is like sex: sometimes something useful comes out,
 but that is not the reason we are doing it."  --  Richard Feynman

pcvs2git.pike

by Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum

>By rewriting the script to use git fast-import instead, it's now
>chugging along nicely at ~12.5 commits/s (ie ~47 minutes).

An almost usable export of Pike is now available from
git://pike-git.lysator.liu.se/pike-new-alpha2

One known issue is commit 4435a84964353881596269079577725c1c9951e3,
which is due to src/test/.cvsignore and src/test/create_testsuite
not having been killed properly in the transition to Pike 7.2.

Another issue is that all commits are currently credited to the
committer (ie not necessarily the author). I'll try to extract
the information from git://pike-git.lysator.liu.se/pike.git.

Feedback appreciated.

        /grubba