Gathering Artifact repository discovery requirements

by BRIAN FOX-5

It's time to start looking at the problems with the current 2.x resolution
scheme as it specifically relates to repository declaration and discovery.
I've created the start of a document at [1]. This should be the place to
gather feedback and use cases that will help drive towards a more complete
benefit and drawback list as well as a list of requirements for a final
solution. Please leave comments on the page with things that should be added
or changed.
Please note, at this point I think it's better to stay away from specific
proposals on implementation details. This tends to pollute people's thoughts
on the current problems and requirements, so let's stick to just improving
the first 4 sections of the page.

[1]
http://docs.codehaus.org/display/MAVEN/Artifact+resolution+and+repository+discovery

Re: Gathering Artifact repository discovery requirements

by BRIAN FOX-5

Bump.

On Sun, May 10, 2009 at 9:12 PM, Brian Fox <brianf@...> wrote:

> It's time to start looking at the problems with the current 2.x resolution
> scheme as it specifically relates to repository declaration and discovery.
> I've created the start of a document at [1]. This should be the place to
> gather feedback and use cases that will help drive towards a more complete
> benefit and drawback list as well as a list of requirements for a final
> solution. Please leave comments on the page with things that should be added
> or changed.
> Please note, at this point I think it's better to stay away from specific
> proposals on implementation details. This tends to pollute people's thoughts
> on the current problems and requirements, so let's stick to just improving
> the first 4 sections of the page.
>
> [1]
> http://docs.codehaus.org/display/MAVEN/Artifact+resolution+and+repository+discovery
>

Re: Gathering Artifact repository discovery requirements

by Wendy Smoak-3

On Sun, May 17, 2009 at 3:15 PM, Brian Fox <brianf@...> wrote:
> Bump.

Maybe it would be better to post the info and discuss here?  I don't
think people are going to keep going back to a wiki page to follow the
comments, and it would be good to have it all in the list archives
anyway.

--
Wendy

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@...
For additional commands, e-mail: dev-help@...


Re: Gathering Artifact repository discovery requirements

by brettporter

On 18/05/2009, at 12:03 PM, Wendy Smoak wrote:

> On Sun, May 17, 2009 at 3:15 PM, Brian Fox <brianf@...> wrote:
>> Bump.
>
> Maybe it would be better to post the info and discuss here?  I don't
> think people are going to keep going back to a wiki page to follow the
> comments, and it would be good to have it all in the list archives
> anyway.

Yep, that'd help - the only reason I haven't commented is that I  
haven't been online consistently for a couple of weeks. Thanks!

- Brett


Re: Gathering Artifact repository discovery requirements

by brettporter

On 11/05/2009, at 11:12 AM, Brian Fox wrote:

> It's time to start looking at the problems with the current 2.x resolution
> scheme as it specifically relates to repository declaration and discovery.

Sorry for the delay in responding to this, I'm still catching up on May.

I think the first few sections are accurate and complete.

For requirements:

> 1. maintain the ability for a user to check out your code and run mvn
> install and have it work with no prior setup on their part.


+1

> 2. be able to depend on some jar and not worry about any  
> repositories required for transitive resolution (ie discover the  
> repositories transitively as dependencies are processed) (this is  
> controversial and may be eliminated. First it contributes to the  
> Problem #4 above in that SAT can't be done on a bounded list of  
> repositories. It also doesn't work normally behind a repository  
> manager because the list of repos is usually controlled in the repo  
> manager and thus autodiscovery is intentionally blocked, usually via  
> a mirrorOf * to circumvent the repos maven finds in the poms.)


I think we can achieve this in a way that is compatible with repo  
managers, depending on the solution (see below)

If we have this though, we need to add a new requirement:
5. builds should be able to add their own alternative versions for  
artifacts (eg, see xwiki's build that provides a lot of custom  
versions of standard things), without affecting other builds. So in  
this case, they would use a custom version to ensure within their  
build it can override others and contribute to ranges, but its  
existence in a local repository shouldn't affect other builds.

> 3. be able to separate the dependencies needed by maven plugins from  
> those needed by the build. This means not only where they are  
> resolved from, but also how they are stored locally to prevent cross-
> contamination.

I think I would reword this. I can understand wanting to locate  
plugins separately, and for their repos/deps not to affect the rest of  
the build, but I'm not sure why local storage matters. A dependency  
junit:junit:3.8.1 used in a plugin should be the same as that used in  
a project. Perhaps an alternate/additional requirement is "3. a given
artifact coordinate must always use an identical artifact across a
build".
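
To illustrate what the reworded requirement 3 could mean in practice, here is a minimal sketch in Python. It is an assumption-laden illustration, not anything Maven does today: the `BuildArtifactRegistry` class and its `record` method are hypothetical names, and SHA-1 is used only as a convenient content fingerprint.

```python
import hashlib


class BuildArtifactRegistry:
    """Hypothetical per-build registry: one coordinate maps to one artifact content."""

    def __init__(self):
        # coordinate string (e.g. "junit:junit:3.8.1") -> SHA-1 hex digest
        self._digests = {}

    def record(self, coordinate, content):
        """Remember the digest for a coordinate; fail if a different one was seen."""
        digest = hashlib.sha1(content).hexdigest()
        # setdefault stores the digest on first sight and returns the stored one after.
        previous = self._digests.setdefault(coordinate, digest)
        if previous != digest:
            raise ValueError(
                f"{coordinate} resolved to two different artifacts in one build"
            )
        return digest
```

Under this sketch, a plugin and a project that both resolve junit:junit:3.8.1 would have to end up with byte-identical files, wherever each of them found the artifact.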

> 4. Repository identification: at this point we are pretty much in  
> agreement that the URL should be the unique identifier for a  
> repository. People who care about what they are publishing either  
> need to use canonical repositories like Maven central or need to  
> guarantee the existence of the repositories or have decent pointers.  
> In a fully distributed system the relocation mechanism we have does
> not work without a master to manage relocations.


This is a solution, not a requirement :) I think it's clear we need a  
unique identifier. A URI is a good way to do that, but we need to  
accommodate that repositories will move too (This was a problem listed  
earlier). Depending on how we solve the above, it may become less of  
an issue. So perhaps reword as "repositories must be uniquely  
identifiable and able to be relocated to a new location over time  
without affecting existing builds".

I'd then break out artifact relocation as separate requirements:
6. relocating an artifact to a different coordinate must be possible  
even if that is on a different repository

Stemming from the location I'd add:
7. repositories must be able to be mirrored to different locations, and
users must be able to select a closer, identical repository of their choice.

Also, probably implied but worth stating:
8. all discovery must be possible without a repository manager  
installed (though using one can improve the ability to route requests  
differently)

And finally, maybe implied but worth being explicit about:
9. must work for locating parent projects (this will start giving us  
better ways to deal with the chicken/egg problem and auto-versioning)

Turning to solutions since it has been a while now... here's some  
starting points.

I'm tossing around two alternatives in my head:
1) using the repository as the start of the namespace (ie,
http://repo1.maven.org/maven2/junit/junit/3.8.1/junit-3.8.1.jar
is different to
http://repo.otherproject.com/junit/junit/3.8.1/junit-3.8.1.jar),
where the repository contributes to the "version" of the artifact,
but is considered the same group/artifact ID for the purpose of
resolution. Note that this is just for identification, location needs
to be separate.
2) considering group/artifact ID to be globally unique and repository  
can be derived from that

I'm leaning towards (2) as it's a shorter notation and easier to
understand. Under (1), we'd probably need to be able to add the
repository to a dependency element (perhaps with a shorthand notation
defined in the pom or its parent).

Either way, the resolution mechanism should not be affected by the  
repositories used. For a given set of artifacts, that should always  
resolve the same way. The versions available to a range calculation  
will alter depending on the available repositories, but these should  
all be known up front in the build. I don't think we need to deal with  
how version ranges are calculated / made reproducible here (that's  
being separately dealt with), as long as the above requirements are  
met with respect to the repositories used for it.

To accommodate this, I think the repositories in the POM should become  
constrained to locating metadata for a certain set of artifacts, so  
they can be used to expand reach through resolution, but do not affect  
anything already encountered, and do not affect resolution outside the  
current project. As long as the revised (3) above holds, this will be  
reproducible.

Given 1), 2), 3), and 5), I think a delegating structure for locating
an artifact is the way to go. That is, specifying *only* the  
<dependency> element is enough for a build to locate an artifact, and  
always get the same one. The advantages are significant: less  
configuration/easier set up for new repositories, simpler resolution  
logic, faster resolution as it never needs to search multiple  
repositories. The delegation needs to go right down to the version  
level (snapshots in one repo, releases in another). Then the downside  
is loss of control (if we point javax to the download.java.net repos  
automatically, we have to live with that doing dodgy stuff in that  
namespace like bad POMs or changing released artifacts, or just being  
down).
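
As a rough illustration of the delegating lookup (a sketch only, not an existing Maven mechanism), the Python below maps a group ID to exactly one repository, with the split going down to the version level. The `ROUTES` table, the `locate` helper and all example.org/example.net URLs are hypothetical.

```python
# Hypothetical routing table: group ID prefix -> one repository per version kind.
# Every URL here is a made-up placeholder, not a real repository.
ROUTES = {
    "junit": {
        "release": "http://repo.example.org/releases",
        "snapshot": "http://repo.example.org/snapshots",
    },
    "javax": {
        "release": "http://java.example.net/maven2",
        "snapshot": "http://java.example.net/maven2-snapshots",
    },
}


def locate(group_id, artifact_id, version):
    """Return the single repository URL an artifact is delegated to, or None."""
    kind = "snapshot" if version.endswith("-SNAPSHOT") else "release"
    parts = group_id.split(".")
    # Try the most specific group prefix first, then progressively shorter ones.
    for end in range(len(parts), 0, -1):
        prefix = ".".join(parts[:end])
        if prefix in ROUTES:
            return ROUTES[prefix][kind]
    # No route known: the caller never falls back to searching other repositories.
    return None
```

Because each lookup yields at most one repository, nothing ever has to be searched for in several places, which is where the simpler and faster resolution would come from.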

I think this can be overcome by layers of routing rules. So, if  
central becomes the source of pointers to artifacts, then a project  
can add a repository to locate *missing* ones (not override existing)  
as described above, then a user can *alter* routes from their  
settings.xml. A common one for this will be * -> repository manager,  
but you could have others whether you are using a repo manager or not.
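
The layering could be sketched roughly as follows, in Python for brevity. This is only one possible reading of the idea: the `resolve_route` function, the three dictionaries and their precedence order are assumptions made for illustration, and the URLs in the usage note are placeholders.

```python
def resolve_route(group_id, central, project, user):
    """Pick the repository for a group ID from three layers of routing rules.

    central: pointers published by the central source (group ID -> repository URL)
    project: repositories declared by the build, used only for *missing* routes
    user:    rules from the user's settings, allowed to *alter* any route;
             the key "*" stands for a catch-all such as a repository manager
    """
    # User rules win: an exact rule first, then the catch-all if present.
    if group_id in user:
        return user[group_id]
    if "*" in user:
        return user["*"]
    # Central pointers come next and cannot be overridden by the project.
    if group_id in central:
        return central[group_id]
    # The project layer only fills gaps that central left open.
    return project.get(group_id)
```

For example, with `central = {"junit": "http://central.example.org/maven2"}` and a project that declares its own repository for both `junit` and `com.acme`, `junit` still routes to the central pointer while `com.acme` routes to the project's repository. Adding `{"*": "http://manager.example.org/all"}` as the user layer sends both to the repository manager.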

As for local storage, which was mentioned in the requirements, I'm
still in favour of this or similar:
http://docs.codehaus.org/display/MAVEN/Local+repository+separation
The important part here is that metadata is separated from artifacts
and local installations are only used when you intended them to be.

Anyway, just a starting point for discussion, if we can agree on some  
of the fundamentals I'm sure we can build up a more complete solution.

Cheers,
Brett


