URIs in place of source filenames in error messages?

by Michael(tm) Smith-4

Hi,

I wanted to ask if the issue of URIs as source file names in
standard GNU-style error messages has ever come up, and what
guidance you might be able to provide on how an application should
deal with URIs (containing colon characters) as source file names.

I also wanted to ask if the "Formatting Error Messages" section of
the GNU Coding Standards might possibly be updated to specify how
error messages with URIs as source file names should be formatted.

  http://www.gnu.org/prep/standards/standards.html#Errors

To be more specific: I note that the "Formatting Error Messages"
section of the current GNU Coding Standards specifies the
following basic format for error messages:

  source-file-name:lineno: message

And, for an error message that reports both line and column
numbers, either of the following:

  source-file-name:lineno:column: message
  source-file-name:lineno.column: message

The issue with that format is: what should an application be
expected to do if the "source-file-name" part contains one or more
colon characters -- which it will if it is, say, an HTTP or FTP
URI? For example:

  http://www.w3.org/foo/bar.html:5: Error; Attribute "charset" not allowed on element "meta" at this point.

For that particular example, the current spec would seem to
indicate that an application should expect that the
"source-file-name" part of it is "http" and the line number is
"//www.w3.org/foo/bar.html" and the column number is 5 -- or that
the line number is "//www.w3.org/foo/bar" and the column number is
"html" and there's an extra ":5" thrown in.
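To make the ambiguity concrete, here is a minimal Python sketch of a
parser that splits the line at colons, which is the most literal
reading of the format. The function name `naive_split` is hypothetical
and is not taken from any existing tool; it only illustrates the
problem described above.

```python
def naive_split(line):
    """Split a GNU-style error line at colons (illustration only)."""
    # Literal reading of "source-file-name:lineno: message":
    # first field is the file name, second the line number,
    # and everything after the second colon is the message.
    fields = line.split(":", 2)
    return fields[0], fields[1], fields[2].lstrip()


local = 'foo.c:5: Error; something went wrong'
remote = 'http://www.w3.org/foo/bar.html:5: Error; something went wrong'

file_name, lineno, message = naive_split(local)
print(file_name)  # foo.c
print(lineno)     # 5

file_name, lineno, message = naive_split(remote)
print(file_name)  # http
print(lineno)     # //www.w3.org/foo/bar.html
```

The local path parses as intended, while the URI is cut at the colon
after the scheme, so the real line number ends up inside the message.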

There's a further issue if the URI contains more than one colon --
which it could if it were a URI for a remote file on an HTTP
server running on a non-standard port; for example:

  http://www.w3.org:8080/foo/bar.html:5: Error; Attribute "charset" not allowed on element "meta" at this point.

So it seems like it might be beneficial for the GNU Coding
Standards to specify a standard way to indicate that the
source-file-name part of the error message is a URI instead of a
local file. For example:

  If the "source-file-name" part is a URI instead of a local path,
  the error message should use angle brackets to delimit the URI:

    <URI>:lineno: message

So an example error message would look like this:

  <http://www.w3.org:8080/foo/bar.html>:5: Error; Attribute "charset" not allowed on element "meta" at this point.

Applications such as Emacs that have built-in capabilities for
parsing GNU-formatted error messages could then be updated to
handle the URI case by recognizing the angle brackets.
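As a rough sketch of what recognizing the angle brackets could look
like, the following Python code formats and parses the proposed
`<URI>:lineno: message` form. The names `BRACKETED`,
`format_bracketed`, and `parse_bracketed` are hypothetical, and the
optional column handling simply mirrors the two column forms quoted
earlier; none of this is an existing API or part of the GNU Coding
Standards.

```python
import re

# Hypothetical pattern for "<URI>:lineno: message", optionally with a
# column given as ":column" or ".column". It relies on the URI itself
# containing no unescaped "<" or ">" characters.
BRACKETED = re.compile(r'^<([^<>]+)>:(\d+)(?:[:.](\d+))?: (.*)$')


def format_bracketed(uri, lineno, message):
    """Build an error line that delimits the URI with angle brackets."""
    return f"<{uri}>:{lineno}: {message}"


def parse_bracketed(line):
    """Return (uri, lineno, column, message), or None if no match."""
    m = BRACKETED.match(line)
    if m is None:
        return None
    uri, lineno, column, message = m.groups()
    return uri, int(lineno), int(column) if column else None, message


line = format_bracketed("http://www.w3.org:8080/foo/bar.html", 5,
                        "Error; something went wrong")
uri, lineno, column, message = parse_bracketed(line)
print(uri)     # http://www.w3.org:8080/foo/bar.html
print(lineno)  # 5
```

Because the closing bracket marks the end of the URI, the colons in
the scheme and port no longer interfere with finding the line number.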

As for the use case and rationale behind this, consider the case of
applications that may accept (either directly or indirectly) as
input files not just files on a local filesystem but also
files/resources at remote locations -- with such remote locations
being specified by a URI.

By "indirectly" I mean that even in the case where an application
is processing a file on the local filesystem, the file might
reference other files through include/import statements and the
like -- with the possibility that such an import/include statement
might reference a remote resource/file using a URI. This is, for
example, a common occurrence in XSLT stylesheets.

I realize that the "Formatting Error Messages" section was originally
intended to define how compilers, specifically, should format
their error messages (and that it in fact starts out by saying
"Error messages from compilers should look like this") and that
compilers traditionally are used to compile source files that are
actually on the same local filesystem, not at remote locations.

But I think the GNU error format is now used across a wide range
of applications -- not just by compilers -- and in particular, by
applications that do need to report errors in remote files (by
giving their URIs). So I think it would be appropriate and
beneficial for the GNU Coding Standards to specify a standard way
of formatting error messages for errors in files at remote URIs.

  --Mike

--
Michael(tm) Smith
http://people.w3.org/mike/
http://sideshowbarker.net/



Re: URIs in place of source filenames in error messages?

by Karl Berry

Hi Michael,

    I wanted to ask if the issue of URIs as source file names in
    standard GNU-style error messages has ever come up, and what

Not that I can recall.  Thanks for writing all the details and the
suggestion of <...>.

I think the underlying issue isn't actually specific to URLs; after
all, regular filenames can contain : characters.  It's never
specifically been addressed.

Just FYI, in practice, what I recall past versions of Emacs (that is,
next-error) doing is, more or less, looking for /^(.*):[0-9]: / and then
the filename is \1, including colons.  (What Emacs actually did is way
more complicated than that, but that was the idea.)  Of course this
loses if the :[0-9]:  pattern happens to match elsewhere in the error
message, but in reality ... (and what emacs 22 does is something
different again, as I understand it).
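As a rough illustration of that idea (not the actual Emacs code), the
greedy-match approach can be sketched in Python as follows. The names
`GREEDY` and `greedy_parse` are hypothetical, and the sketch uses
`[0-9]+` rather than the single `[0-9]` quoted above, on the assumption
that multi-digit line numbers should match.

```python
import re

# Greedy "(.*)" grabs as much as it can, so the file name runs up to
# the LAST ":digits: " on the line, colons included.
GREEDY = re.compile(r'^(.*):([0-9]+): ')


def greedy_parse(line):
    """Return (file_name, lineno, message), or None if no match."""
    m = GREEDY.match(line)
    if m is None:
        return None
    return m.group(1), int(m.group(2)), line[m.end():]


# A URI with a port number is kept whole as the file name.
ok = greedy_parse('http://www.w3.org:8080/foo/bar.html:5: Error; bad attribute')
print(ok[0])  # http://www.w3.org:8080/foo/bar.html
print(ok[1])  # 5

# The failure mode: the pattern also occurs inside the message text.
bad = greedy_parse('foo.c:5: see also bar.c:12: here')
print(bad[0])  # foo.c:5: see also bar.c
print(bad[1])  # 12
```

The second call shows the case where the pattern matches elsewhere in
the error message and the wrong file name and line number come out.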

I'll forward your message to rms and go from there.

Thanks again,
Karl



Re: URIs in place of source filenames in error messages?

by Michael(tm) Smith-4

Hi Karl,

> @2008-01-18 18:56 -0600:
> Just FYI, in practice, what I recall past versions of Emacs (that is,
> next-error) doing is, more or less, looking for /^(.*):[0-9]: / and then
> the filename is \1, including colons.  (What Emacs actually did is way
> more complicated than that, but that was the idea.)

Ah, OK -- I should have realized that would be the case (rather
than just doing simple splitting at colons).

> Of course this loses if the :[0-9]:  pattern happens to match
> elsewhere in the error message, but in reality ... (and what
> emacs 22 does is something different again, as I understand it).

OK, I guess I ought to take some time to test Emacs (and Vim and
some other apps that have built-in support for parsing GNU-style
error messages) and see what they actually do if I feed them a URI
for the source-file-name part. And/or look at the code to see
what regexp or other mechanism they use to parse the error messages.

> I'll forward your message to rms and go from there.

Thanks -- I appreciate it.

  --Mike

--
Michael(tm) Smith
http://people.w3.org/mike/
http://sideshowbarker.net/

