Splice/graft configuration


Splice/graft configuration

by Max O Bowsher

Even after (finally) landing yesterday a bunch of changes that I've had
sitting locally, still have more.

Namely, support for splicing extra merges into the history. The chief
sticking point here is a format for configuring the selection of
revisions. At the moment I'm identifying revisions by their timestamp
(post fixups) - which works, but is a bit unsatisfying, and I keep
wondering if I'm going to run into an edge case with 2 revisions with
the same timestamp.

Does anyone have any thoughts on how best to write "graft configuration"
for cvs2*?

Max.

------------------------------------------------------
http://cvs2svn.tigris.org/ds/viewMessage.do?dsForumId=1667&dsMessageId=2415380

To unsubscribe from this discussion, e-mail: [dev-unsubscribe@...].


Re: Splice/graft configuration

by mhagger

Max Bowsher wrote:

> Even after (finally) landing yesterday a bunch of changes that I've had
> sitting locally, still have more.
>
> Namely, support for splicing extra merges into the history. The chief
> sticking point here is a format for configuring the selection of
> revisions. At the moment I'm identifying revisions by their timestamp
> (post fixups) - which works, but is a bit unsatisfying, and I keep
> wondering if I'm going to run into an edge case with 2 revisions with
> the same timestamp.
>
> Does anyone have any thoughts on how best to write "graft configuration"
> for cvs2*?

I was happy a while back to discover that git has support for a "grafts"
file that allows the apparent parentage of historical revisions to be
faked.  It is also very easy to use "git filter-branch" to rewrite the
git repository to incorporate the parentage changes permanently (though
of course this changes the SHA1s of all subsequent commits).  So for
cvs2git, the simplest approach is probably to use git's facility
post-conversion rather than trying to invent a new "graft configuration"
for cvs2svn.
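
For reference, each record in git's grafts file (.git/info/grafts) is one line: the commit's full SHA1 followed by the SHA1s of the parents it should appear to have, separated by spaces. A minimal sketch of producing such a line follows; the function name and the placeholder IDs are hypothetical and only illustrate the format:

```python
def format_graft_line(commit, parents):
    """Return one grafts-file line: the commit ID, then its apparent parents."""
    ids = [commit] + list(parents)
    for sha in ids:
        # Grafts records use full 40-character hexadecimal object names.
        if len(sha) != 40 or any(c not in "0123456789abcdef" for c in sha):
            raise ValueError("not a full SHA1: %r" % (sha,))
    return " ".join(ids)


# Placeholder IDs for illustration only; real ones come from the converted
# repository and are not known until the conversion has run.
merge = "a" * 40
first_parent = "b" * 40
second_parent = "c" * 40
graft_line = format_graft_line(merge, [first_parent, second_parent])
```

Listing two parents, as here, is what makes an ordinary commit appear as a merge.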

On the other hand, the grafts file cannot be written until the final
cvs2git conversion is done, because before then the SHA1s are not
predictable (even those from a test conversion might not be consistent
with those from the final conversion).  And yet the repository cannot be
published until the post-grafts rewrite is done because only then are
the *final* SHA1s determined.

Presumably bzr and hg do not have such a facility.  I would almost
suggest adding the "grafts" facility to the target VCS rather than
adding it to cvs2svn.  This would allow the user to use the VCS's
standard visualization tools to help figure out which commits should be
treated as merges, and the feature might be useful in other contexts
unrelated to cvs2xxx.

But if you want to implement this in cvs2xxx, then another possibility
would be to denote the changeset that should be a merge by simply
listing a single file revision that you know will be in the changeset,
like "src/path/file.c:1.5.6.4".  cvs2svn could figure out which
changeset this corresponds to.  Even though this would not be completely
general, it would probably be easier for the user to deal with than
timestamps that are not even generated until late in the cvs2svn
conversion (and not guaranteed to be consistent from one run to the next).
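
A rough sketch of how such a "path:revision" selector might be parsed and resolved is shown here. Everything in it is an assumption made for illustration: the function names are invented, and the dictionary standing in for cvs2svn's changeset data is hypothetical rather than an actual cvs2svn structure.

```python
import re

# Dotted sequence of numbers with at least two components, e.g. 1.5 or 1.5.6.4.
_DOTTED_RE = re.compile(r"^\d+(\.\d+)+$")


def parse_file_revision(spec):
    """Split 'PATH:REVISION' into (path, revision), validating the revision."""
    path, sep, rev = spec.rpartition(":")
    if not sep or not path or not _DOTTED_RE.match(rev):
        raise ValueError("expected PATH:REVISION, got %r" % (spec,))
    # CVS revision numbers have an even number of components;
    # an odd count (such as 1.5.6) names a branch, not a revision.
    if len(rev.split(".")) % 2 != 0:
        raise ValueError("not a revision number: %r" % (rev,))
    return path, rev


def find_changeset(spec, changeset_by_file_rev):
    """Look up the changeset that contains the given file revision.

    changeset_by_file_rev is a hypothetical {(path, revision): changeset_id}
    mapping, assumed to be built once changesets have been determined.
    """
    key = parse_file_revision(spec)
    if key not in changeset_by_file_rev:
        raise LookupError("no changeset contains %s:%s" % key)
    return changeset_by_file_rev[key]


# Illustrative data only.
index = {("src/path/file.c", "1.5.6.4"): 1234}
changeset_id = find_changeset("src/path/file.c:1.5.6.4", index)
```

Because a file revision belongs to exactly one changeset, a single entry is enough to pick the changeset without listing its other files.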

If this is added to cvs2svn, it could eventually be implemented for all
target VCSs, including even recent versions of SVN.

Michael

------------------------------------------------------
http://cvs2svn.tigris.org/ds/viewMessage.do?dsForumId=1667&dsMessageId=2415572


Re: Splice/graft configuration

by Max O Bowsher

Michael Haggerty wrote:

> I was happy a while back to discover that git has support for a "grafts"
> file that allows the apparent parentage of historical revisions to be
> faked.  It is also very easy to use "git filter-branch" to rewrite the
> git repository to incorporate the parentage changes permanently (though
> of course this changes the SHA1s of all subsequent commits).  So for
> cvs2git, the simplest approach is probably to use git's facility
> post-conversion rather than trying to invent a new "graft configuration"
> for cvs2svn.
>
> On the other hand, the grafts file cannot be written until the final
> cvs2git conversion is done, because before then the SHA1s are not
> predictable (even those from a test conversion might not be consistent
> with those from the final conversion).  And yet the repository cannot be
> published until the post-grafts rewrite is done because only then are
> the *final* SHA1s determined.
>
> Presumably bzr and hg do not have such a facility.  I would almost
> suggest adding the "grafts" facility to the target VCS rather than
> adding it to cvs2svn.  This would allow the user to use the VCS's
> standard visualization tools to help figure out which commits should be
> treated as merges, and the feature might be useful in other contexts
> unrelated to cvs2xxx.

You'd still need to rewrite all of history to "fixate" any changes, at
which point, rerunning the cvs2xxx conversion to "fixate" your synthetic
merges isn't implausible.

> But if you want to implement this in cvs2xxx, then another possibility
> would be to denote the changeset that should be a merge by simply
> listing a single file revision that you know will be in the changeset,
> like "src/path/file.c:1.5.6.4".  cvs2svn could figure out which
> changeset this corresponds to.  Even though this would not be completely
> general, it would probably be easier for the user to deal with than
> timestamps that are not even generated until late in the cvs2svn
> conversion (and not guaranteed to be consistent from one run to the next).

I have to admit that I hacked things up using timestamps because I could
easily copy/paste them out of bzr's visual log viewer :-)

That said, I have to agree that your idea is a very neat one. With some
additional syntax to support selecting symbol creation commits, it
should work out nicely.


Max.

------------------------------------------------------
http://cvs2svn.tigris.org/ds/viewMessage.do?dsForumId=1667&dsMessageId=2415971


Re: Splice/graft configuration

by Greg Ward-19

[Max]
> Namely, support for splicing extra merges into the history. The chief
> sticking point here is a format for configuring the selection of
> revisions. At the moment I'm identifying revisions by their timestamp
> (post fixups) - which works, but is a bit unsatisfying, and I keep
> wondering if I'm going to run into an edge case with 2 revisions with
> the same timestamp.
>
> Does anyone have any thoughts on how best to write "graft configuration"
> for cvs2*?

I've cobbled something together for cvs2hg, but it requires active
participation by an application-specific subclass of HgOutputOption.
It starts with this template method in HgOutputOption:

  def detect_merge(self, svn_commit, lod, parent1):
    """
    Determine if the CVS commit represented by SVN_COMMIT was a merge in CVS.
    If so, return the Mercurial changeset ID (in binary) of the changeset that
    should be its second parent; otherwise, return None.  The default
    implementation always returns None.

    SVN_COMMIT is a cvs2svn_lib.svn_commit.SVNPrimaryCommit object.

    LOD is an instance of either cvs2svn_lib.symbol.Trunk or
    cvs2svn_lib.symbol.Branch representing the CVS line of development where
    this commit happened.

    PARENT1 is the Mercurial changeset ID (binary) of the first parent.
    """
    return None

(I'm not sure this is the right design: I might want to allow
subclasses to specify both parents.  This sounds weird, but we have
some weird history in our CVS repo that I would like to accurately
reflect in Hg.)

My implementation of detect_merge() is simple:

    def detect_merge(self, svn_commit, lod, parent1):
        return (self._try_automerge_from(svn_commit, lod) or
                self._try_manual_merge(svn_commit, lod) or
                None)

where the _try_automerge_from() and _try_manual_merge() methods are
much more interesting.  The former parses the CVS commit message,
looking for the systematic "MERGE from X: " comments that we have been
using for the last couple of years.  The latter, _try_manual_merge(),
handles legacy cases -- mainly the merging of old CVS development
branches.
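
The commit-message check can be sketched roughly as below. This is not the actual _try_automerge_from() code: the helper name is invented, and it assumes that X is a single token without whitespace or colons and that the marker sits at the very start of the message.

```python
import re

# Assumed convention: the log message begins with "MERGE from <source>: ".
_MERGE_RE = re.compile(r"^MERGE from (?P<source>[^\s:]+): ")


def merge_source(log_message):
    """Return the name following 'MERGE from', or None if there is no marker."""
    match = _MERGE_RE.match(log_message)
    return match.group("source") if match else None


# Illustrative messages only.
source = merge_source("MERGE from release-1_2: fix crash in parser")
no_source = merge_source("Fix crash in parser")
```

A real implementation would still have to map the returned name to the changeset that should become the second parent.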

In this case, I chose to identify commits by the tuple (branch,
timestamp, comment_prefix).  I thought about throwing username in
there just to be sure; it wouldn't hurt, but I never got around to it.
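
A small sketch of selecting a commit by such a tuple follows. The Commit record and the find_commit() name are hypothetical stand-ins, not cvs2svn or cvs2hg classes. Failing loudly when the key matches zero or several commits covers the "two revisions with the same timestamp" worry raised earlier in the thread.

```python
from collections import namedtuple

# Hypothetical record type; cvs2svn's real commit objects are not modeled here.
Commit = namedtuple("Commit", "branch timestamp comment")


def find_commit(commits, branch, timestamp, comment_prefix):
    """Return the single commit matching (branch, timestamp, comment_prefix)."""
    matches = [
        c for c in commits
        if c.branch == branch
        and c.timestamp == timestamp
        and c.comment.startswith(comment_prefix)
    ]
    if len(matches) != 1:
        # Zero matches means a stale key; several means the key is ambiguous.
        key = (branch, timestamp, comment_prefix)
        raise LookupError("%d commits match %r" % (len(matches), key))
    return matches[0]


# Illustrative data: three commits sharing one timestamp.
history = [
    Commit("trunk", 1100000000, "Fix parser crash"),
    Commit("trunk", 1100000000, "Update docs"),
    Commit("dev-branch", 1100000000, "Fix parser crash"),
]
found = find_commit(history, "trunk", 1100000000, "Fix parser")
```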

> Presumably bzr and hg do not have such a facility.  I would almost
> suggest adding the "grafts" facility to the target VCS rather than
> adding it to cvs2svn.  This would allow the user to use the VCS's
> standard visualization tools to help figure out which commits should be
> treated as merges, and the feature might be useful in other contexts
> unrelated to cvs2xxx.

hg convert's cvs module has a "look for merge messages" feature.

And hg convert in general supports a splicemap, which I gather is
vaguely similar to git's grafts.  It would probably work well
converting from a VC system with sensible revision IDs (svn, git,
...), but it's not too useful converting from CVS.

Greg

------------------------------------------------------
http://cvs2svn.tigris.org/ds/viewMessage.do?dsForumId=1667&dsMessageId=2419260
