Refactoring the Migrator

View: New views
4 Messages — Rating Filter:   Alert me  

Refactoring the Migrator

by Paul Keeble :: Rate this Message:

| View Threaded | Show Only this Message

I have given the Migrator refactor some thought and there is a reasonable amount to do to help factor out the key elements of the core. My guiding principle is that we want to be able to add functionality without needing to modify this core and to do so major actions on a ChangeSet need to be in their own class. The Migrator class should be a Facade and contain no real code other than using the internal system components.

The Migrator currently contains code to do the following:
1) Setting up the parser, capturing errors and running the parser.
2) Creation of the Database connection.
3) Answers the question of "has this changeset been ran?"
4) Lock waiting and acquisition
5) Determines the ran status of change logs
6) Creates a Swing GUI
7) Determines if a context matches for any given changeset
8) A lot of code and if statements around the update, rollback, outputting sql etc
9) The most problematic issue is the unclear interface that requires the right setters to be called before "migrate" is run to do whatever has been setup.

I have attached a UML (ish) diagram which shows the following refactoring changes:
1) Changes the interface of Migrator so that all user actions have an explicit method which take in any parameters required.
2) The parser is now self contained and returns a DatabaseChangeLog object (completely parsers and holds in memory the entire changelog).
3) A LockManager which will remove some of the current code to do with waiting and acquiring locks in the database.
4) A ChangeLogIterator which takes a DatabaseChangeLog and goes through and finds all of the ChangeSets (filtering as appropaite, at the moment we have a context filter but there may be others we need).
5) An entirely new concept for ChangeSetVistor's which are passed ChangeSet's and execute in the way needed - the Update one obviously determines if the ChangeSet needs running on the DB and then runs it. The rollback one determines which ones to remove etc. There is likely more of them than I've listed and a more complex hierarchy (since we need output of SQL I think an engine for receiving the SQL might be a good idea also).
6) The Migrator takes some DatabaseSettings in the constructor and I believe we should pass those on to the DatabaseFactory directly for Connection creation

Attached is a PNG with the diagram and also a DIA diagram file incase you want to modify and resend (dia is an open source light weight drawing tool).

Paul K



      ___________________________________________________________
Yahoo! Mail is the world's favourite email. Don't settle for less, sign up for
your free account today http://uk.rd.yahoo.com/evt=44106/*http://uk.docs.yahoo.com/mail/winter07.html


-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
Liquibase-devel mailing list
Liquibase-devel@...
https://lists.sourceforge.net/lists/listinfo/liquibase-devel

=?utf-8?q?MigratorRefactor.dia?= (6K) Download Attachment
=?utf-8?q?MigratorRefactor.png?= (114K) Download Attachment

Re: Refactoring the Migrator

by Nathan Voxland-2 :: Rate this Message:

| View Threaded | Show Only this Message

I think your refactoring looks like an improvement.  The only question I
have is why we would need to have a fully parsed DatabaseChangeLog
object returned?  I am afraid that keeping an entire changeset in memory
will be a scalability problem for us.  I think it would be best to
implement the ChangeLogParser like the ResultSet object where it only
holds on to one change set a time since the visitors that are passed to
the iterator only need to deal with the current changeSet.

A couple other thoughts:
- Besides ChangeSetVisitior, I think we should have a
PreConditionVisitor
- Besides the ContextFilter we would need a DBMSFilter so I think we
should create a ChangeSetFilter interface which ChangeLogIterator can
take an array of.  All Filters in the array need to return true for the
iterator to pass the changeSet off to the visitors
- There will be more visitors over time.  One I know of off hand is a
DocumentationGeneratorVisitor

Do you think it would best done in a branch, or in one big commit?  It
seems like a large enough change that it would be best to create a
branch with it so the refactoring can be done with a couple
commit-points while not breaking the code for others.

I'm also thinking I would like to push off releasing the changes until
the 1.3 release.  I would like to release 1.2 the last week of August,
and I will be gone for a week in the middle of August, so I don't think
the changes will be able to be made and tested enough.  If we make the
changes in a branch and things go well, we could merge it in before the
release, but if it takes longer than we think it won't hold up the
release.

Nathan

-----Original Message-----
From: liquibase-devel-bounces@...
[mailto:liquibase-devel-bounces@...] On Behalf Of Paul
Keeble
Sent: Saturday, August 04, 2007 1:53 PM
To: liquibase-devel@...
Subject: [Liquibase-devel] Refactoring the Migrator

I have given the Migrator refactor some thought and there is a
reasonable amount to do to help factor out the key elements of the core.
My guiding principle is that we want to be able to add functionality
without needing to modify this core and to do so major actions on a
ChangeSet need to be in their own class. The Migrator class should be a
Facade and contain no real code other than using the internal system
components.

The Migrator currently contains code to do the following:
1) Setting up the parser, capturing errors and running the parser.
2) Creation of the Database connection.
3) Answers the question of "has this changeset been ran?"
4) Lock waiting and acquisition
5) Determines the ran status of change logs
6) Creates a Swing GUI
7) Determines if a context matches for any given changeset
8) A lot of code and if statements around the update, rollback,
outputting sql etc
9) The most problematic issue is the unclear interface that requires the
right setters to be called before "migrate" is run to do whatever has
been setup.

I have attached a UML (ish) diagram which shows the following
refactoring changes:
1) Changes the interface of Migrator so that all user actions have an
explicit method which take in any parameters required.
2) The parser is now self contained and returns a DatabaseChangeLog
object (completely parsers and holds in memory the entire changelog).
3) A LockManager which will remove some of the current code to do with
waiting and acquiring locks in the database.
4) A ChangeLogIterator which takes a DatabaseChangeLog and goes through
and finds all of the ChangeSets (filtering as appropaite, at the moment
we have a context filter but there may be others we need).
5) An entirely new concept for ChangeSetVistor's which are passed
ChangeSet's and execute in the way needed - the Update one obviously
determines if the ChangeSet needs running on the DB and then runs it.
The rollback one determines which ones to remove etc. There is likely
more of them than I've listed and a more complex hierarchy (since we
need output of SQL I think an engine for receiving the SQL might be a
good idea also).
6) The Migrator takes some DatabaseSettings in the constructor and I
believe we should pass those on to the DatabaseFactory directly for
Connection creation

Attached is a PNG with the diagram and also a DIA diagram file incase
you want to modify and resend (dia is an open source light weight
drawing tool).

Paul K



      ___________________________________________________________
Yahoo! Mail is the world's favourite email. Don't settle for less, sign
up for
your free account today
http://uk.rd.yahoo.com/evt=44106/*http://uk.docs.yahoo.com/mail/winter07
.html

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
Liquibase-devel mailing list
Liquibase-devel@...
https://lists.sourceforge.net/lists/listinfo/liquibase-devel

Parent Message unknown Re: Refactoring the Migrator

by Paul Keeble :: Rate this Message:

| View Threaded | Show Only this Message

The main reason I'd want to create a fully parsed ChangeLog is to improve performance (only iterate the ChangeLog once) and to provide more separation of the stages. Its a cleaner design to have the loading of the domain model separate from how it is processed - it would greatly ease the exception handling as well. If we look at just the default amount of memory then we might start hitting problems at change logs around 30MB and that at a guess 5% of our users might generate. Its certainly going to get hit at some point.

We could always do both - convert the current parser into a ChangeLogIterator implementation and if the ChangeLog is above X bytes we fall back to multiple parser calls. Its more complex and all it really gives us is extra testing for the same functionality done twice. I see no reason why we can't just use the current scheme and create ourselves an exception Utility to try and get more information out from the SAX approach. Will still need a loader, but it'll just setup the SAX parser and the error handler.

To your thoughts
1) Yes good idea, maybe we should further extrapolate to a ChangeLog iterator that calls the sub iterators for there respective pieces, so we've got somewhere to add more functionality at the top level. Only problem is with the SAX approach this further separation is going to cost us another parse through (keeping them separated from each other is a good idea but its costing us a bit of performance). Still validation is a separate stage to processing so I don't think its too significant.
2) Yes agreed
3) Absolutely, and I think this paradigm shift will allow us to add lots of functionality with just a tweak in the available Visitors so hopefully the documentation generator will be easier as will adding main features that rely on a ChangeSet level functionality.

I think it probably is a good idea to separate it from the coming release, not because it will take a long time but because it may cause regression problems and our automated testing isn't that good yet. We have two options in terms of branches, we can push the refactoring into a branch, or we can push the upcoming release fixes into a branch. Generally I prefer to push the release fixes into the branch as they tend to be smaller and more confined than core refactorings which makes them easier to merge, but its up to you. Let me know the branch name or otherwise and I'll start messing with it.

Paul

----- Original Message ----
From: Nathan Voxland <Nathan.Voxland@...>
To: Paul Keeble <csuml@...>; liquibase-devel@...
Sent: Monday, 6 August, 2007 3:30:57 PM
Subject: RE: [Liquibase-devel] Refactoring the Migrator

I think your refactoring looks like an improvement.  The only question I
have is why we would need to have a fully parsed DatabaseChangeLog
object returned?  I am afraid that keeping an entire changeset in memory
will be a scalability problem for us.  I think it would be best to
implement the ChangeLogParser like the ResultSet object where it only
holds on to one change set a time since the visitors that are passed to
the iterator only need to deal with the current changeSet.

A couple other thoughts:
- Besides ChangeSetVisitior, I think we should have a
PreConditionVisitor
- Besides the ContextFilter we would need a DBMSFilter so I think we
should create a ChangeSetFilter interface which ChangeLogIterator can
take an array of.  All Filters in the array need to return true for the
iterator to pass the changeSet off to the visitors
- There will be more visitors over time.  One I know of off hand is a
DocumentationGeneratorVisitor

Do you think it would best done in a branch, or in one big commit?  It
seems like a large enough change that it would be best to create a
branch with it so the refactoring can be done with a couple
commit-points while not breaking the code for others.

I'm also thinking I would like to push off releasing the changes until
the 1.3 release.  I would like to release 1.2 the last week of August,
and I will be gone for a week in the middle of August, so I don't think
the changes will be able to be made and tested enough.  If we make the
changes in a branch and things go well, we could merge it in before the
release, but if it takes longer than we think it won't hold up the
release.

Nathan

-----Original Message-----
From: liquibase-devel-bounces@...
[mailto:liquibase-devel-bounces@...] On Behalf Of Paul
Keeble
Sent: Saturday, August 04, 2007 1:53 PM
To: liquibase-devel@...
Subject: [Liquibase-devel] Refactoring the Migrator

I have given the Migrator refactor some thought and there is a
reasonable amount to do to help factor out the key elements of the core.
My guiding principle is that we want to be able to add functionality
without needing to modify this core and to do so major actions on a
ChangeSet need to be in their own class. The Migrator class should be a
Facade and contain no real code other than using the internal system
components.

The Migrator currently contains code to do the following:
1) Setting up the parser, capturing errors and running the parser.
2) Creation of the Database connection.
3) Answers the question of "has this changeset been ran?"
4) Lock waiting and acquisition
5) Determines the ran status of change logs
6) Creates a Swing GUI
7) Determines if a context matches for any given changeset
8) A lot of code and if statements around the update, rollback,
outputting sql etc
9) The most problematic issue is the unclear interface that requires the
right setters to be called before "migrate" is run to do whatever has
been setup.

I have attached a UML (ish) diagram which shows the following
refactoring changes:
1) Changes the interface of Migrator so that all user actions have an
explicit method which take in any parameters required.
2) The parser is now self contained and returns a DatabaseChangeLog
object (completely parsers and holds in memory the entire changelog).
3) A LockManager which will remove some of the current code to do with
waiting and acquiring locks in the database.
4) A ChangeLogIterator which takes a DatabaseChangeLog and goes through
and finds all of the ChangeSets (filtering as appropaite, at the moment
we have a context filter but there may be others we need).
5) An entirely new concept for ChangeSetVistor's which are passed
ChangeSet's and execute in the way needed - the Update one obviously
determines if the ChangeSet needs running on the DB and then runs it.
The rollback one determines which ones to remove etc. There is likely
more of them than I've listed and a more complex hierarchy (since we
need output of SQL I think an engine for receiving the SQL might be a
good idea also).
6) The Migrator takes some DatabaseSettings in the constructor and I
believe we should pass those on to the DatabaseFactory directly for
Connection creation

Attached is a PNG with the diagram and also a DIA diagram file incase
you want to modify and resend (dia is an open source light weight
drawing tool).

Paul K



      ___________________________________________________________
Yahoo! Mail is the world's favourite email. Don't settle for less, sign
up for
your free account today
http://uk.rd.yahoo.com/evt=44106/*http://uk.docs.yahoo.com/mail/winter07
.html





      ___________________________________________________________
Yahoo! Answers - Got a question? Someone out there knows the answer. Try it
now.
http://uk.answers.yahoo.com/ 

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
Liquibase-devel mailing list
Liquibase-devel@...
https://lists.sourceforge.net/lists/listinfo/liquibase-devel

Re: Refactoring the Migrator

by Nathan Voxland-2 :: Rate this Message:

| View Threaded | Show Only this Message

Do you think we would need to do processing of the change logs outside
the visitor pattern?  We could initially implement the ChangeLog
iterator to read the entire changelog in and just iterate over the
in-memory representation when needed rather than re-parse the XML and
see how that goes.  As long as we don't pass the fully-parsed change log
outside the iterator, it would hide the fact that we parse it all vs.
parse for each iteration.  Like I had said before, I don't have the
measurements to know how the memory vs. processing time tradeoff goes.
I don't think we should implement both full-parse and
parse-per-iteration, but I would like to at least hide how we are
currently do it so we can teak it and/or swap it out without breaking
other code.  

With the various types of visitors, I don't think we need to do multiple
runs through SAX necessarily.  If we can pass all types of visitors to
the iterator we could verify the preconditions and the change sets at
the same time.  There would still be two trips through the change log
(one for validation, one for execution), but I think that's what we need
to do.

I think I'll make a branch for this refactoring rather than for the
release.  That way if we get pat way through and decide a different
approach would be better, or it just starts getting ugly, it is easier
to abandon the branch and try again from trunk.  It's also less
confusing for other developers (just work off the trunk).

I'm going to merge my eclipse branch into trunk today and then I'll make
a new branch for you.

Nathan

-----Original Message-----
From: liquibase-devel-bounces@...
[mailto:liquibase-devel-bounces@...] On Behalf Of Paul
Keeble
Sent: Tuesday, August 07, 2007 7:13 AM
To: Nathan Voxland; liquibase-devel@...
Subject: Re: [Liquibase-devel] Refactoring the Migrator

The main reason I'd want to create a fully parsed ChangeLog is to
improve performance (only iterate the ChangeLog once) and to provide
more separation of the stages. Its a cleaner design to have the loading
of the domain model separate from how it is processed - it would greatly
ease the exception handling as well. If we look at just the default
amount of memory then we might start hitting problems at change logs
around 30MB and that at a guess 5% of our users might generate. Its
certainly going to get hit at some point.

We could always do both - convert the current parser into a
ChangeLogIterator implementation and if the ChangeLog is above X bytes
we fall back to multiple parser calls. Its more complex and all it
really gives us is extra testing for the same functionality done twice.
I see no reason why we can't just use the current scheme and create
ourselves an exception Utility to try and get more information out from
the SAX approach. Will still need a loader, but it'll just setup the SAX
parser and the error handler.

To your thoughts
1) Yes good idea, maybe we should further extrapolate to a ChangeLog
iterator that calls the sub iterators for there respective pieces, so
we've got somewhere to add more functionality at the top level. Only
problem is with the SAX approach this further separation is going to
cost us another parse through (keeping them separated from each other is
a good idea but its costing us a bit of performance). Still validation
is a separate stage to processing so I don't think its too significant.
2) Yes agreed
3) Absolutely, and I think this paradigm shift will allow us to add lots
of functionality with just a tweak in the available Visitors so
hopefully the documentation generator will be easier as will adding main
features that rely on a ChangeSet level functionality.

I think it probably is a good idea to separate it from the coming
release, not because it will take a long time but because it may cause
regression problems and our automated testing isn't that good yet. We
have two options in terms of branches, we can push the refactoring into
a branch, or we can push the upcoming release fixes into a branch.
Generally I prefer to push the release fixes into the branch as they
tend to be smaller and more confined than core refactorings which makes
them easier to merge, but its up to you. Let me know the branch name or
otherwise and I'll start messing with it.

Paul

----- Original Message ----
From: Nathan Voxland <Nathan.Voxland@...>
To: Paul Keeble <csuml@...>;
liquibase-devel@...
Sent: Monday, 6 August, 2007 3:30:57 PM
Subject: RE: [Liquibase-devel] Refactoring the Migrator

I think your refactoring looks like an improvement.  The only question I
have is why we would need to have a fully parsed DatabaseChangeLog
object returned?  I am afraid that keeping an entire changeset in memory
will be a scalability problem for us.  I think it would be best to
implement the ChangeLogParser like the ResultSet object where it only
holds on to one change set a time since the visitors that are passed to
the iterator only need to deal with the current changeSet.

A couple other thoughts:
- Besides ChangeSetVisitior, I think we should have a
PreConditionVisitor
- Besides the ContextFilter we would need a DBMSFilter so I think we
should create a ChangeSetFilter interface which ChangeLogIterator can
take an array of.  All Filters in the array need to return true for the
iterator to pass the changeSet off to the visitors
- There will be more visitors over time.  One I know of off hand is a
DocumentationGeneratorVisitor

Do you think it would best done in a branch, or in one big commit?  It
seems like a large enough change that it would be best to create a
branch with it so the refactoring can be done with a couple
commit-points while not breaking the code for others.

I'm also thinking I would like to push off releasing the changes until
the 1.3 release.  I would like to release 1.2 the last week of August,
and I will be gone for a week in the middle of August, so I don't think
the changes will be able to be made and tested enough.  If we make the
changes in a branch and things go well, we could merge it in before the
release, but if it takes longer than we think it won't hold up the
release.

Nathan

-----Original Message-----
From: liquibase-devel-bounces@...
[mailto:liquibase-devel-bounces@...] On Behalf Of Paul
Keeble
Sent: Saturday, August 04, 2007 1:53 PM
To: liquibase-devel@...
Subject: [Liquibase-devel] Refactoring the Migrator

I have given the Migrator refactor some thought and there is a
reasonable amount to do to help factor out the key elements of the core.
My guiding principle is that we want to be able to add functionality
without needing to modify this core and to do so major actions on a
ChangeSet need to be in their own class. The Migrator class should be a
Facade and contain no real code other than using the internal system
components.

The Migrator currently contains code to do the following:
1) Setting up the parser, capturing errors and running the parser.
2) Creation of the Database connection.
3) Answers the question of "has this changeset been ran?"
4) Lock waiting and acquisition
5) Determines the ran status of change logs
6) Creates a Swing GUI
7) Determines if a context matches for any given changeset
8) A lot of code and if statements around the update, rollback,
outputting sql etc
9) The most problematic issue is the unclear interface that requires the
right setters to be called before "migrate" is run to do whatever has
been setup.

I have attached a UML (ish) diagram which shows the following
refactoring changes:
1) Changes the interface of Migrator so that all user actions have an
explicit method which take in any parameters required.
2) The parser is now self contained and returns a DatabaseChangeLog
object (completely parsers and holds in memory the entire changelog).
3) A LockManager which will remove some of the current code to do with
waiting and acquiring locks in the database.
4) A ChangeLogIterator which takes a DatabaseChangeLog and goes through
and finds all of the ChangeSets (filtering as appropaite, at the moment
we have a context filter but there may be others we need).
5) An entirely new concept for ChangeSetVistor's which are passed
ChangeSet's and execute in the way needed - the Update one obviously
determines if the ChangeSet needs running on the DB and then runs it.
The rollback one determines which ones to remove etc. There is likely
more of them than I've listed and a more complex hierarchy (since we
need output of SQL I think an engine for receiving the SQL might be a
good idea also).
6) The Migrator takes some DatabaseSettings in the constructor and I
believe we should pass those on to the DatabaseFactory directly for
Connection creation

Attached is a PNG with the diagram and also a DIA diagram file incase
you want to modify and resend (dia is an open source light weight
drawing tool).

Paul K



      ___________________________________________________________
Yahoo! Mail is the world's favourite email. Don't settle for less, sign
up for
your free account today
http://uk.rd.yahoo.com/evt=44106/*http://uk.docs.yahoo.com/mail/winter07
.html





      ___________________________________________________________
Yahoo! Answers - Got a question? Someone out there knows the answer. Try
it
now.
http://uk.answers.yahoo.com/ 

------------------------------------------------------------------------
-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
Liquibase-devel mailing list
Liquibase-devel@...
https://lists.sourceforge.net/lists/listinfo/liquibase-devel

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
Liquibase-devel mailing list
Liquibase-devel@...
https://lists.sourceforge.net/lists/listinfo/liquibase-devel