Common Bugs

View: New views
13 Messages — Rating Filter:   Alert me  

Common Bugs

by Benjamin Bentmann :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Dear developers,

After some months of reporting issues for Maven, I have noticed common bugs
in various plugins. For me it seems that many developers are either
completely unaware of the problems (i.e. do not handle them at all) or are
misunderstanding them (i.e. handle them insufficiently). Therefore, I would
like to illustrate the source of those problems in more detail such that
developers can start to prevent bugs right from the beginning rather than
fixing them afterwards.

1) Relative Paths

It is common practice for users of Maven to specify relative paths in the
POM, not to mention that the Super POM does so, too. The intention is almost
always to resolve such relative paths against the base directory of the
current project. In other words, the paths
  target/classes
and
  ${basedir}/target/classes
should resolve to the same directory for a given POM.

Unfortunately, the class java.io.File does not resolve relative paths
against the project's base directory. As mentioned in its class javadoc [0],
it resolves relative paths against the current working directory. In plain
English: Unless a Maven component has complete control over the current
working directory, any usage of java.io.File in combination with a relative
path is a bug.

At first glance, one might be tempted to argue that the project base
directory is equal to the current working directory. However, this
assumption is generally not true. Consider the following scenarios:

a) Reactor Builds
When a child module is build during a reactor build, the current working is
usually the base directory of the parent project, not the base directory of
the current module. That is the most common scenario where users are faced
with the bug.

b) Embedded Maven Invocations
Other tools, most prominently IDEs, that run Maven under the hood may have
set the current working directory to their installation folder or whatever
they like.

c) Maven Invocations using the -f switch
While it is surely an uncommon use-case, the user is free to invoke Maven
from an arbitrary working directory by specifying an absolute path like
  mvn -f /home/me/projects/demo/pom.xml

In order to guarantee reliable builds, Maven and its plugins must manually
resolve relative paths against the project's base directory. A simple idiom
like the following should do just fine:

  File file = new File( path );
  if ( !file.isAbsolute() )
  {
    file = new File( project.getBasedir(), file );
  }

Many Maven plugins can get this resolution automatically if they declare
their affected mojo parameters of type java.io.File instead of
java.lang.String. This subtle difference in parameter types will trigger a
feature that seems to be known as "path translation", i.e. Maven itself will
properly resolve relative paths when it pumps the XML configuration into a
mojo.

While it will not cure this problem completely, I suggest to review the
MavenProject API such that it ensures the following invariant: All paths
reported by means of a getter-like method are absolute. This basically
requires to resolve all relative paths that are pushed in by a setter-like
method. MavenProject.addCompileSourceRoot() is just one example for a public
API method that currently blindly accepts a relative path, giving rise to
later failures because a misbehaving component resolves this path not
properly.

2) Resource Bundle Families

Especially reporting plugins employ resource bundles to support
internationalization. One language (usually English) is provided as the
fallback/default language in the base resource bundle. Due to the lookup
strategy performed by ResourceBundle.getBundle() [1], one must always
provide a dedicated resource bundle for this default language, too. This
bundle can be empty because it inherits the strings via the parent chain
from the base bundle, but it must exist.

Take the following example. A report mojo uses a resource bundle family with
the base name "mymojo-report" and erroneously provides only the following
bundles:
- mymojo-report.properties      (base bundle for English)
- mymojo-report_de.properties   (specific bundle for German)
Further assume that the default locale of the current JVM is "de". Now, when
ResourceBundle.getBundle() is called to retrieve the bundle for locale "en",
it will try the following candidate bundles and use the first one found:
- mymojo-report_en.properties   (candidate for requested locale)
- mymojo-report_de.properties   (candidate for default locale)
- mymojo-report.properties      (base bundle, always last)
Because "mymojo-report_en.properties" does not exist, the bundle
"mymojo-report_de.properties" will be chosen instead of the base bundle
which would have provided the requested English translation.

Plugin developers can easily expose this bug when calling
  mvn site -Dlocales=xy,en
where "xy" denotes some other language supported by the plugin. Specifying
"xy" as the first locale will have the Maven Site Plugin change the JVM's
default locale to "xy" which in turn causes the lookup for "en" to fail as
outlined above.

Thanks for your attention,


Benjamin Bentmann


[0] http://java.sun.com/javase/6/docs/api/java/io/File.html
[1]
http://java.sun.com/javase/6/docs/api/java/util/ResourceBundle.html#getBundle(java.lang.String,%20java.util.Locale,%20java.lang.ClassLoader)
[2] http://jira.codehaus.org/browse/MNG-3273


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@...
For additional commands, e-mail: dev-help@...


Re: Common Bugs

by Bugzilla from mkleint@gmail.com :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

+1 on that.

for embedded use in IDEs there's additional pitfalls.

1. use of System.getProperties()
2. java.lang.Runtime's shutdown hooks.

Milos

On Jan 13, 2008 10:21 AM, Benjamin Bentmann <benjamin.bentmann@...> wrote:

> Dear developers,
>
> After some months of reporting issues for Maven, I have noticed common bugs
> in various plugins. For me it seems that many developers are either
> completely unaware of the problems (i.e. do not handle them at all) or are
> misunderstanding them (i.e. handle them insufficiently). Therefore, I would
> like to illustrate the source of those problems in more detail such that
> developers can start to prevent bugs right from the beginning rather than
> fixing them afterwards.
>
> 1) Relative Paths
>
> It is common practice for users of Maven to specify relative paths in the
> POM, not to mention that the Super POM does so, too. The intention is almost
> always to resolve such relative paths against the base directory of the
> current project. In other words, the paths
>   target/classes
> and
>   ${basedir}/target/classes
> should resolve to the same directory for a given POM.
>
> Unfortunately, the class java.io.File does not resolve relative paths
> against the project's base directory. As mentioned in its class javadoc [0],
> it resolves relative paths against the current working directory. In plain
> English: Unless a Maven component has complete control over the current
> working directory, any usage of java.io.File in combination with a relative
> path is a bug.
>
> At first glance, one might be tempted to argue that the project base
> directory is equal to the current working directory. However, this
> assumption is generally not true. Consider the following scenarios:
>
> a) Reactor Builds
> When a child module is build during a reactor build, the current working is
> usually the base directory of the parent project, not the base directory of
> the current module. That is the most common scenario where users are faced
> with the bug.
>
> b) Embedded Maven Invocations
> Other tools, most prominently IDEs, that run Maven under the hood may have
> set the current working directory to their installation folder or whatever
> they like.
>
> c) Maven Invocations using the -f switch
> While it is surely an uncommon use-case, the user is free to invoke Maven
> from an arbitrary working directory by specifying an absolute path like
>   mvn -f /home/me/projects/demo/pom.xml
>
> In order to guarantee reliable builds, Maven and its plugins must manually
> resolve relative paths against the project's base directory. A simple idiom
> like the following should do just fine:
>
>   File file = new File( path );
>   if ( !file.isAbsolute() )
>   {
>     file = new File( project.getBasedir(), file );
>   }
>
> Many Maven plugins can get this resolution automatically if they declare
> their affected mojo parameters of type java.io.File instead of
> java.lang.String. This subtle difference in parameter types will trigger a
> feature that seems to be known as "path translation", i.e. Maven itself will
> properly resolve relative paths when it pumps the XML configuration into a
> mojo.
>
> While it will not cure this problem completely, I suggest to review the
> MavenProject API such that it ensures the following invariant: All paths
> reported by means of a getter-like method are absolute. This basically
> requires to resolve all relative paths that are pushed in by a setter-like
> method. MavenProject.addCompileSourceRoot() is just one example for a public
> API method that currently blindly accepts a relative path, giving rise to
> later failures because a misbehaving component resolves this path not
> properly.
>
> 2) Resource Bundle Families
>
> Especially reporting plugins employ resource bundles to support
> internationalization. One language (usually English) is provided as the
> fallback/default language in the base resource bundle. Due to the lookup
> strategy performed by ResourceBundle.getBundle() [1], one must always
> provide a dedicated resource bundle for this default language, too. This
> bundle can be empty because it inherits the strings via the parent chain
> from the base bundle, but it must exist.
>
> Take the following example. A report mojo uses a resource bundle family with
> the base name "mymojo-report" and erroneously provides only the following
> bundles:
> - mymojo-report.properties      (base bundle for English)
> - mymojo-report_de.properties   (specific bundle for German)
> Further assume that the default locale of the current JVM is "de". Now, when
> ResourceBundle.getBundle() is called to retrieve the bundle for locale "en",
> it will try the following candidate bundles and use the first one found:
> - mymojo-report_en.properties   (candidate for requested locale)
> - mymojo-report_de.properties   (candidate for default locale)
> - mymojo-report.properties      (base bundle, always last)
> Because "mymojo-report_en.properties" does not exist, the bundle
> "mymojo-report_de.properties" will be chosen instead of the base bundle
> which would have provided the requested English translation.
>
> Plugin developers can easily expose this bug when calling
>   mvn site -Dlocales=xy,en
> where "xy" denotes some other language supported by the plugin. Specifying
> "xy" as the first locale will have the Maven Site Plugin change the JVM's
> default locale to "xy" which in turn causes the lookup for "en" to fail as
> outlined above.
>
> Thanks for your attention,
>
>
> Benjamin Bentmann
>
>
> [0] http://java.sun.com/javase/6/docs/api/java/io/File.html
> [1]
> http://java.sun.com/javase/6/docs/api/java/util/ResourceBundle.html#getBundle(java.lang.String,%20java.util.Locale,%20java.lang.ClassLoader)
> [2] http://jira.codehaus.org/browse/MNG-3273
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@...
> For additional commands, e-mail: dev-help@...
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@...
For additional commands, e-mail: dev-help@...


RE: Common Bugs

by Timothy Reilly-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Just an observation,

but would it not be a good idea to codify these heuristics in a plugin
and add them to the plugins pom.

For example, PMD or some other plugin might generate violations for
these.

 

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@...
For additional commands, e-mail: dev-help@...


Re: Common Bugs

by brettporter :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

That sounds like a great idea :)

- Brett

On 15/01/2008, at 12:35 AM, Timothy Reilly wrote:

> Just an observation,
>
> but would it not be a good idea to codify these heuristics in a plugin
> and add them to the plugins pom.
>
> For example, PMD or some other plugin might generate violations for
> these.
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@...
> For additional commands, e-mail: dev-help@...
>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@...
For additional commands, e-mail: dev-help@...


Re: Common Bugs

by Benjamin Bentmann :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Greetings,

I remember another subtle issue which I would like to make people aware of.

3) Case-Insensitive String Comparison

When developers need to compare strings without regard to case or want to
realize a map with case-insensitive string keys, they often employ
String.toLowerCase() or String.toUpperCase() to create a "normalized" string
before doing a simple String.equals(). Now, the to*Case() methods are
overloaded: One takes no arguments and one take a Locale object.

The gotcha with the arg-less methods is that their output depends on the
default locale of the JVM but the default locale is out of control of the
developer (see also [0], [1]). That means the string expected by the
developer (who runs/tests his code in a JVM using locale xy) does not
necessarily match the string seen by another user (that runs a JVM with
locale ab). For example, the comparison

  "info".equals(debugLevel.toLowerCase())

is likely to fail for systems with default locale Turkish.

Since developers usually want to compare strings from the English language,
they must use String.to*Case(Locale.ENGLISH) to get reliable results
regardless of the end-user's default locale.

Just to make the picture complete: String.to*Case() is locale-sensitive and
context-aware. In contrast, Character.to*Case() and
String.equalsIgnoreCase() (which relies on Character.to*Case()) are neither
locale-sensitive nor context-aware [2]. For instance

  "ΣΣ".toLowerCase(Locale.ENGLISH).equals("σσ")

is false while

  "ΣΣ".equalsIgnoreCase("σσ")

is true (because the lower case form of "ΣΣ" is "σς").

Regards,


Benjamin Bentmann


[0]
http://java.sun.com/javase/6/docs/api/java/lang/String.html#toLowerCase()
[1] http://cafe.elharo.com/blogroll/turkish/
[2]
http://java.sun.com/javase/6/docs/api/java/lang/Character.html#toLowerCase(char)


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@...
For additional commands, e-mail: dev-help@...


Re: Common Bugs

by Benjamin Bentmann :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi again,

4) Effective Output Directory for Report Plugins

Most reporting plugins will inherit from AbstractMavenReport. In doing so,
they need to implement the inherited but abstract method
getOutputDirectory(). To implement this method, plugins usually declare a
field named "outputDirectory" which they return in the method. Nothing wrong
so far, but please read on.

Some plugins need to create additional files in the report output directory
that accompany the report generated via the sink. While it is tempting to
use either the method getOutputDirectory() or the field outputDirectory
directly in order to setup a path for the output files, this leads most
likely to a bug. More precisely, those plugins will not properly output
files when run by the Maven Site Plugin as part of the site lifecycle. This
is best noticed when the output directory for the site is configured
directly in the Maven Site Plugin such that it deviates from the expression
${project.reporting.outputDirectory} that the plugins use. Multi-language
site generation is another scenario to exploit the bug.

The reason for this: During the site lifecycle, the Maven Site Plugin takes
control of the output directory, that's why all reports (ideally) end up in
the same directory. In order to do, it invokes
MavenReport.setReportOutputDirectory() on each report. Reports are expected
to use the supplied path and not a value from the POM configuration of the
corresponding reporting plugin.

As the take home message: If one needs to know the effective output
directory for the report, MavenReport.getReportOutputDirectory() is your
friend.

And what about AbstractMavenReport.getOutputDirectory()? Well, the
implementation of getReportOutputDirectory() in AbstractMavenReport will
fallback to this method if the Maven Site Plugin did not previously set the
output directory (see also [0]).

Regards,


Benjamin Bentmann


[0] http://jira.codehaus.org/browse/MNG-3369


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@...
For additional commands, e-mail: dev-help@...


Re: Common Bugs

by Benjamin Bentmann :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi,

5) Reading and Writing Text Files

Textual content is composed of characters while file systems merely store
byte streams. A file encoding (aka charset) is used to convert between bytes
and characters. The challenge is using the right file encoding...

The JVM has this notion of a default encoding ("file.encoding" property)
which it derives from a system's locale or whatever. While this might be a
convenient feature sometimes, using this default encoding for a project
build is in general a bad idea: The build output will depend on the
machine/developer who runs the build. As such, usage of the default encoding
threatens the dream of a reproducible build.

For example, if developer A has UTF-8 as default encoding while developer B
uses ISO-8859-1, text files are very likely to get messed up during resource
filtering or similar tasks.

Therefore, plugin developers should avoid any direct or indirect usage
of the classes FileWriter and FileReader. Instead,
OutputStreamWriter/-Reader should be used with an explicit encoding value.
The required encoding value can be obtained from a configuration parameter
like this:

  /*
   * @parameter default-value="ISO-8859-1"
   */
  private String encoding;

Providing a default value resembles the JVM's concept of a default encoding
with an important difference: This time, the default is specified by the
plugin and hence controlled by the POM. This way, all builds from the same
POM can be guranteed to use the same encoding regardless of the JVM
executing Maven.

Plugins that already provide means to specify the encoding should make sure
they have a default value for this parameter. This is to follow Maven's
philosophy of "convention over configuration" where users should get the
best practice out-of-the-box when a reasonable default can be assumed.

Handling XML files is a little different because these files are equipped
with an encoding declaration. Thanks to Herve Boutemy, plexus-utils provides
a convenient Reader-/WriterFactory for the magic of auto-detecting the
encoding from a byte stream (see also [0]). When writing XML files without
the XmlStreamWriter, be sure to ensure the encoding used for the output
writer matches the encoding specified by the XML declaration being written.
Otherwise later parsing the output might fail.

Regards,


Benjamin Bentmann


[0] http://docs.codehaus.org/display/MAVENUSER/XML+encoding


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@...
For additional commands, e-mail: dev-help@...


Re: Common Bugs

by Benjamin Bentmann :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi,

6) URLs and Filesystem Paths

URLs and filesystem paths are really two different beasts and converting
between them is not trivial. The main source of problems is that different
encoding rules apply for the strings that make up a URL or filesystem path.

For example, consider the following code snippet:

  File file = new File( "foo bar+foo" );
  URL url = file.toURI().toURL();
  System.out.println( file.toURL() );
  System.out.println( url );
  System.out.println( url.getPath() );
  System.out.println( URLDecoder.decode( url.getPath(), "UTF-8" ) );

which outputs something like

  file:/M:/scratch-pad/foo bar+foo
  file:/M:/scratch-pad/foo%20bar+foo
  /M:/scratch-pad/foo%20bar+foo
  /M:/scratch-pad/foo bar foo

First of all, please note that File.toURL() [1] does not escape the space
character. This yields an invalid URL, as per RFC 2396 [0], section 2.4.3
"Excluded US-ASCII Characters". The class java.net.URL will silently accept
such invalid URLs, in contrast java.net.URI will not (see also URL.toURI()
[2]). For this reason, this API method has already been deprecated and
should be replaced with File.toURI().toURL().

Next, URL.getPath() does in general not return a string that can be used as
a filesystem path. It returns a substring of the URL and as such can contain
escape sequences. The prominent example is the space character which will
show up as "%20". People sometimes hack around this by means of
replace("%20", " ") but that does simply not cover all cases. It's worth to
mention that on the other hand the related method URI.getPath() [3] does
decode escapes but still the result is not a filesystem path (compare the
source for the constructor File(URI)).

To decode a URL, people sometimes also choose java.net.URLDecoder [4]. The
pitfall with this class is that is actually performs HTML form decoding
which is yet another encoding and not the same as the URL encoding (compare
last paragraph in class javadoc about java.net.URL). For instance, a
URLDecoder will errorneously convert the character "+" into a space as
illustrated by the last sysout in the example above.

Code targetting JRE 1.4+ should easily avoid these problems by using

  new File( new URI( url.toString() ) )

when converting a URL to a filesystem path and

  file.toURI().toURL()

when converting back.

Regards,


Benjamin Bentmann


[0] http://www.faqs.org/rfcs/rfc2396.html
[1] http://java.sun.com/javase/6/docs/api/java/io/File.html#toURL()
[2] http://java.sun.com/javase/6/docs/api/java/net/URL.html#toURI()
[3] http://java.sun.com/javase/6/docs/api/java/net/URI.html#getPath()
[4] http://java.sun.com/javase/6/docs/api/java/net/URLDecoder.html 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@...
For additional commands, e-mail: dev-help@...


Re: Common Bugs

by Benjamin Bentmann :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

>  new File( new URI( url.toString() ) )

Correction:
JRE 1.4 is happily returning invalid/unescaped URLs from
ClassLoader.getResource(), making the above suggestion fail with a
URISyntaxException.

The new suggestion is to use FileUtils.toFile(URL) [0] from Commons IO. A
similar methods exists in Plexus Utils but it's currently not decoding
escape sequences.


Benjamin


[0]
http://commons.apache.org/io/api-release/org/apache/commons/io/FileUtils.html#toFile(java.net.URL)


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@...
For additional commands, e-mail: dev-help@...


Re: Common Bugs

by brettporter :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hey Benjamin,

I wonder if it's worth posting these as a series under the developers  
section of the Maven site?

- Brett

On 06/04/2008, at 9:46 PM, Benjamin Bentmann wrote:

>> new File( new URI( url.toString() ) )
>
> Correction:
> JRE 1.4 is happily returning invalid/unescaped URLs from  
> ClassLoader.getResource(), making the above suggestion fail with a  
> URISyntaxException.
>
> The new suggestion is to use FileUtils.toFile(URL) [0] from Commons  
> IO. A similar methods exists in Plexus Utils but it's currently not  
> decoding escape sequences.
>
>
> Benjamin
>
>
> [0] http://commons.apache.org/io/api-release/org/apache/commons/io/FileUtils.html#toFile(java.net.URL)
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@...
> For additional commands, e-mail: dev-help@...
>

--
Brett Porter
brett@...
http://blogs.exist.com/bporter/


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@...
For additional commands, e-mail: dev-help@...


Re: Common Bugs

by baerrach :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Mon, Apr 7, 2008 at 6:21 AM, Brett Porter <brett@...> wrote:
> Hey Benjamin,
>
>  I wonder if it's worth posting these as a series under the developers
> section of the Maven site?

And building custom Checsktyle/PMD rules (if built in ones dont exist)
to flag these as error?

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@...
For additional commands, e-mail: dev-help@...


Re: Common Bugs

by Benjamin Bentmann :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

> I wonder if it's worth posting these as a series under the developers
> section of the Maven site?

Vincent and I had already put parts of this stuff onto [0] in a section
named "Some Pitfalls", together with a link to this mail thread. But I
agree, having all of this in a nicely formatted APT doc on the site is a
good idea.

I suggest we move the "Some Pitfalls" section out into a standalone document
such that we can list it on the documentation index. If nobody else goes for
this, it will need to wait some days until I get the next free time slice to
merge and clean it up for proper presentation.


Benjamin


[0] http://maven.apache.org/guides/plugin/guide-java-plugin-development.html


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@...
For additional commands, e-mail: dev-help@...


Re: Common Bugs

by Vincent Siveton :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

2008/4/7, Benjamin Bentmann <benjamin.bentmann@...>:

> > I wonder if it's worth posting these as a series under the developers
> > section of the Maven site?
> >
>
>  Vincent and I had already put parts of this stuff onto [0] in a section
>  named "Some Pitfalls", together with a link to this mail thread. But I
>  agree, having all of this in a nicely formatted APT doc on the site is a
>  good idea.
>
>  I suggest we move the "Some Pitfalls" section out into a standalone
> document such that we can list it on the documentation index. If nobody else
> goes for this, it will need to wait some days until I get the next free time
> slice to merge and clean it up for proper presentation.
>

+1

Vincent

>
>  Benjamin
>
>
>  [0]
> http://maven.apache.org/guides/plugin/guide-java-plugin-development.html
>
>
>
> ---------------------------------------------------------------------
>  To unsubscribe, e-mail: dev-unsubscribe@...
>  For additional commands, e-mail: dev-help@...
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@...
For additional commands, e-mail: dev-help@...