I think your refactoring looks like an improvement. The only question I
have is why we would need to have a fully parsed DatabaseChangeLog
object returned? I am afraid that keeping an entire changeset in memory
will be a scalability problem for us. I think it would be best to
implement the ChangeLogParser like the ResultSet object where it only
holds on to one change set a time since the visitors that are passed to
the iterator only need to deal with the current changeSet.
A couple other thoughts:
- Besides ChangeSetVisitior, I think we should have a
- Besides the ContextFilter we would need a DBMSFilter so I think we
should create a ChangeSetFilter interface which ChangeLogIterator can
take an array of. All Filters in the array need to return true for the
iterator to pass the changeSet off to the visitors
- There will be more visitors over time. One I know of off hand is a
Do you think it would best done in a branch, or in one big commit? It
seems like a large enough change that it would be best to create a
branch with it so the refactoring can be done with a couple
commit-points while not breaking the code for others.
I'm also thinking I would like to push off releasing the changes until
the 1.3 release. I would like to release 1.2 the last week of August,
and I will be gone for a week in the middle of August, so I don't think
the changes will be able to be made and tested enough. If we make the
changes in a branch and things go well, we could merge it in before the
release, but if it takes longer than we think it won't hold up the
I have given the Migrator refactor some thought and there is a
reasonable amount to do to help factor out the key elements of the core.
My guiding principle is that we want to be able to add functionality
without needing to modify this core and to do so major actions on a
ChangeSet need to be in their own class. The Migrator class should be a
Facade and contain no real code other than using the internal system
The Migrator currently contains code to do the following:
1) Setting up the parser, capturing errors and running the parser.
2) Creation of the Database connection.
3) Answers the question of "has this changeset been ran?"
4) Lock waiting and acquisition
5) Determines the ran status of change logs
6) Creates a Swing GUI
7) Determines if a context matches for any given changeset
8) A lot of code and if statements around the update, rollback,
outputting sql etc
9) The most problematic issue is the unclear interface that requires the
right setters to be called before "migrate" is run to do whatever has
I have attached a UML (ish) diagram which shows the following
1) Changes the interface of Migrator so that all user actions have an
explicit method which take in any parameters required.
2) The parser is now self contained and returns a DatabaseChangeLog
object (completely parsers and holds in memory the entire changelog).
3) A LockManager which will remove some of the current code to do with
waiting and acquiring locks in the database.
4) A ChangeLogIterator which takes a DatabaseChangeLog and goes through
and finds all of the ChangeSets (filtering as appropaite, at the moment
we have a context filter but there may be others we need).
5) An entirely new concept for ChangeSetVistor's which are passed
ChangeSet's and execute in the way needed - the Update one obviously
determines if the ChangeSet needs running on the DB and then runs it.
The rollback one determines which ones to remove etc. There is likely
more of them than I've listed and a more complex hierarchy (since we
need output of SQL I think an engine for receiving the SQL might be a
good idea also).
6) The Migrator takes some DatabaseSettings in the constructor and I
believe we should pass those on to the DatabaseFactory directly for
Attached is a PNG with the diagram and also a DIA diagram file incase
you want to modify and resend (dia is an open source light weight