Some suggestions for the SOAP api


Some suggestions for the SOAP api

by Karim A.


Hi W3C folks,

I'd like to thank you for the W3 markup validator API,
and wanted to give you some of my humble suggestions.

1) It would be nice, IMHO, to include some caching information in
the response headers, such as "Last-Modified" and "ETag", in order
to reduce the load on the W3C validator servers. A client would not
need the server to send the XML answer again if the server says it
has not changed since a given date. That would be doable, I think,
by looking at the headers of the page being validated.
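
To make suggestion 1 concrete, here is a minimal client-side sketch in Python. It assumes the validator's responses carried "ETag" and "Last-Modified" headers, which is only what is being proposed here and not something the service is known to send. The function names are made up for illustration.

```python
from typing import Dict, Optional


def conditional_headers(etag: Optional[str], last_modified: Optional[str]) -> Dict[str, str]:
    """Build request headers for an HTTP conditional GET from a cached answer."""
    headers: Dict[str, str] = {}
    if etag:
        # Ask the server to answer only if the entity tag no longer matches.
        headers["If-None-Match"] = etag
    if last_modified:
        # Ask the server to answer only if the resource changed after this date.
        headers["If-Modified-Since"] = last_modified
    return headers


def can_reuse_cached_copy(status_code: int) -> bool:
    """304 Not Modified carries no body: the cached answer is still valid."""
    return status_code == 304


print(conditional_headers('"abc123"', "Fri, 28 Sep 2007 10:00:00 GMT"))
```

A client would send these headers with its next query and keep its stored SOAP answer whenever the status is 304.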

2) In the SOAP answer, it would be better, IMHO, for the
"m:explanation" element to contain XML rather than plain HTML,
so that the API consumer can use the data however it likes.

P.S. Sometimes, when I query the validator with output=soap12,
the server doesn't answer, which is the case right now.

P.P.S. It would also be wonderful if you published a list of the
websites that have the validator installed, so that a SOAP app
can query one server picked at random from a list of well-known
servers and thus reduce the load on the W3C server.
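
The mirror idea could look roughly like this in Python. This is only a sketch: the two example.org/example.net entries are hypothetical placeholders (no public mirror list exists in this thread), and `fetch` is whatever HTTP function the application already uses, passed in so the selection logic stays independent of the network.

```python
import random
from typing import Callable, Optional, Sequence
from urllib.parse import urlencode

# Only the first entry is the real W3C service; the others are hypothetical.
MIRRORS = [
    "http://validator.w3.org/check",
    "http://validator.example.org/check",
    "http://validator.example.net/check",
]


def soap_query_url(base: str, page_url: str) -> str:
    """Build a validator query that asks for the SOAP 1.2 output."""
    return base + "?" + urlencode({"uri": page_url, "output": "soap12"})


def query_any_mirror(
    page_url: str,
    fetch: Callable[[str], str],
    mirrors: Sequence[str] = MIRRORS,
) -> Optional[str]:
    """Try the mirrors in random order and return the first answer received."""
    candidates = list(mirrors)
    random.shuffle(candidates)
    for base in candidates:
        try:
            return fetch(soap_query_url(base, page_url))
        except OSError:
            # Network errors (urllib's URLError is an OSError): try the next one.
            continue
    return None


print(soap_query_url(MIRRORS[0], "http://www.example.com/"))
```

Falling through to the next server also covers the case from the P.S. where one server does not answer.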

Thanks again :)


Karim
--
http://akoncept.com
Innovate Humanum Est


Re: Some suggestions for the SOAP api

by Karim A.


And oh yes

What about Gzip compression of the soap output? :)
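
On the client side, gzip support would amount to something like the Python sketch below. It assumes the server honoured "Accept-Encoding: gzip" and that the SOAP answer is UTF-8; neither is confirmed for the validator here, and `decode_body` is an illustrative name.

```python
import gzip
from typing import Optional

# Header a client would send to signal that it accepts a compressed answer.
REQUEST_HEADERS = {"Accept-Encoding": "gzip"}


def decode_body(body: bytes, content_encoding: Optional[str]) -> str:
    """Return the response text, decompressing first if the server used gzip."""
    if (content_encoding or "").strip().lower() == "gzip":
        body = gzip.decompress(body)
    return body.decode("utf-8")


compressed = gzip.compress("<m:errorcount>1</m:errorcount>".encode("utf-8"))
print(decode_body(compressed, "gzip"))
```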

Karim
--
http://akoncept.com
Innovate Humanum Est




Re: Some suggestions for the SOAP api

by Brett Bieber


On 9/27/07, Gmail Directeur <directeur@...> wrote:

>
> Hi W3C folks,
>
> I'd like to thank you for the W3 markup validator API,
> and wanted to give you some of my humble suggestions.
>
> 1) It would be nice, IMHO, to include in the headers some
> information about caching, like "Last-Modified" and "ETag"
> headers, in order to reduce the charge on the W3C validator
> servers. We won't ask the server to send the xml answer
> if it says that it wasn't changed since a given date.
> That would be doable, I think, by looking at the headers
> of page being validated.

I think this is a good idea for a validation system, but it may not be
doable for the public W3C validator. Caching the results would
require an enormous amount of resources given the volume of
requests the validator gets. Additionally, the validator is intended
to test the HTML served by the server, not necessarily the
Last-Modified headers.

The type of system you're referring to is something I think can be
handled at an individual level by utilizing a local copy of the
validator and a database which caches the results. This is what we've
done for our university's web developers with great results.

>
> 2) In the SOAP answer, it would be better, IMHO, to have
> an XML content of the "m:explanation" tag. Not plain HTML
> so the api consumer could use these data as it would.

This is possible, but given the number of error messages the validator
is capable of displaying, it would take a lot of work to produce a
text/xml equivalent of the error messages, with not much benefit. My
thinking was: if you're validating HTML, you should be able
to handle an HTML snippet response. If you don't want the full
explanation, you can easily use something like
http://us2.php.net/strip_tags in whatever language you're using to
strip out the HTML tags and return a plain-text error message.
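
For languages without a built-in equivalent of PHP's strip_tags, the same effect takes a few lines. Here is a minimal Python sketch using only the standard library; the function name simply mirrors the PHP one and is not part of any validator client.

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects the text nodes of an HTML snippet and ignores every tag."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data: str) -> None:
        self.chunks.append(data)


def strip_tags(html_snippet: str) -> str:
    """Return the snippet's text with tags removed and whitespace collapsed."""
    parser = _TextOnly()
    parser.feed(html_snippet)
    parser.close()
    return " ".join("".join(parser.chunks).split())


print(strip_tags('<div class="ve mid-64"><p>Not <code>allowed</code> here.</p></div>'))
```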

>
> P.S. Sometimes, when i query the validator with output=soap12,
> the server doesn't answer. Which is the case right now.
>
> P.P.S It would be also, wonderful if you publish a list of the
> websites that have the validator installed on them, so
> that a soap app can query randomly one server among
> a list of well known servers and thus reduce the charge on
> the W3C server.

Right now I'm not aware of any public mirrors of the validator
services. This sounds like a good idea for some other members of the
web standards movement to grab onto - provide a mirror of the
validation service(s), but some work would have to be done to document
how mirrors are set up, synchronized with the latest versions etc.



--
-Brett Bieber

http:saltybeagle.com aim:ianswerq


Some suggestions for the SOAP api

by Karim A.


> I think this is a good idea for a validation system - but may not be
> do-able for the public W3 validator. To cache the results would
> require an enormous amount of resources given the volume of the
> requests the validator gets. Additionally, the validator is intended
> to test the HTML served by the server not necessarily the
> last-modified headers.

Yes, I see. You're right, but I keep hoping that someday
some big companies like Google and co. will help with
resources.

Here's how I actually see this:
A user submits URL "U" to the validator, which looks at the
caching headers of "U". If it already has a "still usable" copy of
the validation result, it serves that. If the page has changed
since its last validation (always judging by the headers),
or has never been submitted for validation, the validator processes
the URL and caches the result. But again, you're right:
all this time saving requires storage space...
That good old dilemma of space and time ;-)
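
The flow described above can be sketched as follows in Python. This is an illustration of the idea only, not how the validator works: `run_validation` stands in for the real validation step, and the cache is a plain dictionary keyed by URL.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Mapping, Optional


@dataclass
class CachedResult:
    etag: Optional[str]
    last_modified: Optional[str]
    report: str


def validate_with_cache(
    url: str,
    page_headers: Mapping[str, str],
    cache: Dict[str, CachedResult],
    run_validation: Callable[[str], str],
) -> str:
    """Serve a stored report while the page's caching headers are unchanged."""
    etag = page_headers.get("ETag")
    last_modified = page_headers.get("Last-Modified")
    entry = cache.get(url)

    # Without any caching header there is no way to tell if the page changed.
    comparable = etag is not None or last_modified is not None
    if (entry is not None and comparable
            and entry.etag == etag
            and entry.last_modified == last_modified):
        return entry.report  # the "still usable" copy

    # Page changed, was never validated, or gives no caching information.
    report = run_validation(url)
    cache[url] = CachedResult(etag, last_modified, report)
    return report
```

The storage cost mentioned above shows up directly: one stored report per distinct URL.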


> The type of system you're referring to is something I think can be
> handled at an individual level by utilizing a local copy of the
> validator and a database which caches the results. This is what we've
> done for our university's web developers with great results.

Yes, I use this kind of technique for an app that uses the SOAP
API: I cache the SOAP output of a query locally for 1 to 5 minutes,
just to lighten the load on the validator server.
You'd ask me to install a copy of the validator... sure I would, but that
would require a dedicated server or at least a VPS, which is not
in our budget, we African folks :(
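
A short-lived local cache of this kind fits in a few lines of Python. The sketch below is an assumption about how such an app might do it, not the actual code behind it; `fetch` is whatever function performs the real SOAP query.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Keeps each SOAP answer for a fixed number of seconds."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[str], str]) -> str:
        now = self.clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # still fresh: no request sent to the validator
        value = fetch(key)
        self._entries[key] = (now, value)
        return value


cache = TTLCache(ttl_seconds=300)  # 5 minutes, the upper bound mentioned above
```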

> > 2) In the SOAP answer, it would be better, IMHO, to have
> > an XML content of the "m:explanation" tag. Not plain HTML
> > so the api consumer could use these data as it would.
>
> This is possible, but given the number of error messages the validator
> is capable of displaying it would take a lot of work to duplicate a
> text/xml equivalent the error messages with not much benefit. My
> thoughts on this were - if you're validating HTML, you should be able
> to handle a html snippet response -- if you don't want the full
> explanation, you can easily use something like
> http://us2.php.net/strip_tags within whatever language you're using to
> strip out the html tags and return a plain text error message.

Yes sure, but I'm actually talking about the "semantic" side of the
SOAP answer. It's IMHO supposed to give us "raw" information
without predefined styling and formatting. For example, the
<p class="helpwanted"> in the m:explanation element is superfluous,
since it's up to the response consumer to define its own
formatting (HTML) and styling (CSS).

> > P.S. Sometimes, when i query the validator with output=soap12,
> > the server doesn't answer. Which is the case right now.
> >
> > P.P.S It would be also, wonderful if you publish a list of the
> > websites that have the validator installed on them, so
> > that a soap app can query randomly one server among
> > a list of well known servers and thus reduce the charge on
> > the W3C server.
>
> Right now I'm not aware of any public mirrors of the validator
> services. This sounds like a good idea for some other members of the
> web standards movement to grab onto - provide a mirror of the
> validation service(s), but some work would have to be done to document
> how mirrors are set up, synchronized with the latest versions etc.
>

Well, yes, but it's always nice to use the computation time
of some idle servers ;-)
So I'd suggest starting a page containing links to the copies of the
validator available on the web, and I invite you all to send your personal
URLs and those of your universities :)

Again, thanks, and sorry for my poor English :)

Karim
--
http://akoncept.com
Innovate Humanum Est


Re: Some suggestions for the SOAP api

by Chris.


Gmail Directeur wrote:
> Yes sure, but I'm talking about the "semantic" side of the soap
> answer actually. It's IMHO supposed to give us "brut" information
> without predefined styling and formatting. For example, the
> <p class="helpwanted"> in the m:explanation tag is "superfluous"
> since it's up to the response consumer to define its proper way
> of formatting (html) and styling (css).

Hello, I'm new to this group and just putting together my first
Validator-SOAP client and I have to say that Gmail D is right here.

In one of my tests, I get an m:explanation that goes like:


<m:explanation>  <![CDATA[
    <p class="helpwanted">
      <a
        href="http://validator.w3.org/feedback.html?uri=;errmsg_id=64#errormsg"
        title="Suggest improvements on this error message through our feedback channels"
      >✉</a>
    </p>

    <div class="ve mid-64">
    <p>
      The element named above was found in a context where it is not allowed.
      This could mean that you have incorrectly nested elements -- such as a
      "style" element in the "body" section instead of inside "head" -- or
      two elements that overlap (which is not allowed).
    </p>
    <p>
      One common cause for this error is the use of XHTML syntax in HTML
      documents. Due to HTML's rules of implicitly closed elements, this error
      can create cascading effects. For instance, using XHTML's "self-closing"
      tags for "meta" and "link" in the "head" section of a HTML document may
      cause the parser to infer the end of the "head" section and the
      beginning of the "body" section (where "link" and "meta" are not
      allowed; hence the reported error).
    </p>
  </div>

                  ]]>
</m:explanation>


This is tough - how do I use the style applied to: <div class="ve
mid-64"> unless I build my own local stylesheet (or inline, etc)?  And
what happens if you guys ever change the class attribute? (answer: my
app breaks).  

Sure I can strip tags but then the whole "Suggest improvements on this
error message" bit gets reduced to a '?'  So now I have to write a
routine to strip out that symbol.  Not too hard but my code is now too
tightly coupled to your choice of message.  If you ever change the
format (say, add link text instead of the '?') my app breaks.

The point of SOAP is to get the data without the formatting --
otherwise, why not skip SOAP altogether and just have us parse the HTML
version to get all our info?

If you guys think it is important to leave the HTML text in the
explanation (for backwards compatibility or whatever), then I'd suggest
solving it with something like:


<m:explanation>
  <![CDATA[
    <p class="helpwanted">
      <a
href="http://validator.w3.org/feedback.html?uri=;errmsg_id=64#errormsg"...</a>
    </p>

    <div class="ve mid-64">
    <p>
      The element named above was found in a context blah blah blah...
    </p>
    <p>
      One common cause for this error is the blah blah blah...
    </p>
    </div>
  ]]>
</m:explanation>

<m:explanationcontent>
  <m:explanationfeedbacklink>
     http://validator.w3.org/feedback.html?uri=;errmsg_id=64#errormsg
  </m:explanationfeedbacklink>

  <m:explanationfeedbacktext>
     Suggest improvements on this error message through our feedback channels
  </m:explanationfeedbacktext>

  <m:explanationtext>
    The element named above was found in a context blah blah blah...
  </m:explanationtext>

  <m:explanationtext>
    One common cause for this error is the blah blah blah...
  </m:explanationtext>
</m:explanationcontent>


Of course, my choice of tag names here is irrelevant -- pick whatever you
like.  You could even put my <m:explanationcontent> inside the existing
<m:explanation> along with the CDATA section if you prefer.  The point is to
separate content and layout.

That way, if I want to create my own helpwanted link in the form of:
    <a href="http://validator.w3.org/feedback..."
class="myFancyClass">Suggest improvements blah blah blah...</a>

or if I want my explanation text in a bulleted list with each paragraph
separated by a <br /> (instead of wrapped in <p>'s inside <div>), I can
do it easily.

No fancy parsing on my end and no code that is too tightly bound to your
choice of HTML layout.

I would think that this would be trivial to do on your end - but I've
never seen your code.
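
To show how little client code the proposed structure would need, here is a Python sketch that turns it into a custom link plus a bulleted list. The element names are the ones proposed in this message and do not exist in the validator's output; the namespace URI bound to "m" is an assumption, and the sample text is shortened.

```python
import html
import xml.etree.ElementTree as ET

# Assumed namespace URI for the "m" prefix; adjust to the real response.
NS = {"m": "http://www.w3.org/2005/10/markup-validator"}

SAMPLE = """\
<m:explanationcontent xmlns:m="http://www.w3.org/2005/10/markup-validator">
  <m:explanationfeedbacklink>
    http://validator.w3.org/feedback.html?uri=;errmsg_id=64#errormsg
  </m:explanationfeedbacklink>
  <m:explanationfeedbacktext>
    Suggest improvements on this error message through our feedback channels
  </m:explanationfeedbacktext>
  <m:explanationtext>The element named above was found in a context where it is not allowed.</m:explanationtext>
  <m:explanationtext>One common cause for this error is the use of XHTML syntax in HTML documents.</m:explanationtext>
</m:explanationcontent>"""


def render_explanation(xml_text: str, link_class: str = "myFancyClass") -> str:
    """Render the proposed elements with the consumer's own markup and classes."""
    root = ET.fromstring(xml_text)
    link = root.findtext("m:explanationfeedbacklink", "", NS).strip()
    label = " ".join(root.findtext("m:explanationfeedbacktext", "", NS).split())
    items = [" ".join((p.text or "").split())
             for p in root.findall("m:explanationtext", NS)]

    parts = ['<a href="%s" class="%s">%s</a>'
             % (html.escape(link), html.escape(link_class), html.escape(label)),
             "<ul>"]
    parts += ["  <li>%s</li>" % html.escape(text) for text in items]
    parts.append("</ul>")
    return "\n".join(parts)


print(render_explanation(SAMPLE))
```

Nothing in this code depends on the validator's own class names or paragraph markup, which is the decoupling argued for above.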

-Chris




Re: Some suggestions for the SOAP api

by olivier Thereaux


Hi Chris,

On Oct 9, 2007, at 07:25 , Chris Parrish wrote:
> This is tough - how do I use the style applied to: <div class="ve  
> mid-64"> unless I build my own local stylesheet (or inline, etc)?

As documented, the <m:explanation> is the same block of HTML used in
the validator's main output. Unfortunately, that's the way the error
message explanations are stored, so unless someone comes in and
wishes to clean that up, it may stay that way for a while. Your
suggestions are quite good, though it would take time to:
* take the current error explanations file
* massage that into something more structured
* change the templates of the validator to take that structured data
and generate the HTML/API XML at runtime

If anyone's interested in helping, that would speed things up. If not,
let's put the suggestion in bugzilla and see when one of the developers
may have time for it.

> Sure I can strip tags but then the whole "Suggest improvements on  
> this error message" bit gets reduced to a '?'

Sounds odd. Are you sure you treat the incoming data as utf-8?
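
A small Python illustration of why the encoding question matters: the link text in the explanation is a single non-ASCII character (an envelope, U+2709), and it only turns into '?' when the text is pushed through a charset that cannot represent it. The `to_legacy` helper is hypothetical and only mimics such a lossy step.

```python
ENVELOPE = "\u2709"  # the envelope character used as the feedback link text

# Bytes as they would arrive if the response is UTF-8.
raw = ENVELOPE.encode("utf-8")


def to_legacy(text: str, charset: str = "latin-1") -> str:
    """Mimic a lossy conversion to a legacy charset: unknown characters become '?'."""
    return text.encode(charset, errors="replace").decode(charset)


decoded = raw.decode("utf-8")          # correct handling keeps the envelope
print(decoded == ENVELOPE)
print(to_legacy(decoded))              # lossy handling prints a question mark
```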

> The point of SOAP is to get the data without the formatting --  
> otherwise, why not skip SOAP altogether and just have us parse the  
> HTML version to get all our info?

Good point, but the soap output gives you much more info, and well  
organized. For instance, since you have <m:messageid> you could  
download the error message explanations from the code base of the  
validator and map that to whatever format you want to use. Would that  
help?

--
olivier


Re: Some suggestions for the SOAP api

by Chris.


olivier Thereaux wrote:

> As documented, the <m:explanation> is the same block of HTML used in
> the validator's main output. Unfortunately, that's the way the error
> message explanations are stored, so unless someone comes in and wishes
> to clean that up, it may stay that way a while. Your suggestions are
> quite good, though, but it would take time to:
> * take the current error explanations file
> * massage that into something more structure
> * change the templates of the validator to take that structured data
> and make that HTML/API XML at runtime
>
> If anyone's interested, that would speed things up. If not, let's put
> the suggestion in bugzilla and see when one of the developers may have
> time for it.
>

Bummer.  Well, as a compromise, are the explanation messages (in HTML)
separate from the helpwanted links?  It would at least help if the
current <m:explanation> could be served up via SOAP as:

<m:explanationfeedback>
  <![CDATA[
   <p class="helpwanted">
     <a
       href="http://validator.w3.org/feedback.html?uri=;errmsg_id=64#errormsg"
       title="Suggest improvements on this error message through our feedback channels"
     >✉</a>
   </p>
]]>
</m:explanationfeedback>

and

<m:explanationtext>
  <![CDATA[
   <div class="ve mid-64">
   <p>
     The element named above was found in a context where it is not
     allowed. This could mean that you have incorrectly nested elements
     -- such as a "style" element in the "body" section instead of
     inside "head" -- or two elements that overlap (which is not
     allowed).
   </p>
   <p>
     One common cause for this error is the use of XHTML syntax in HTML
     documents. Due to HTML's rules of implicitly closed elements, this
     error can create cascading effects. For instance, using XHTML's
     "self-closing" tags for "meta" and "link" in the "head" section of
     a HTML document may cause the parser to infer the end of the "head"
     section and the beginning of the "body" section (where "link" and
     "meta" are not allowed; hence the reported error).
   </p>
   </div>
]]>
</m:explanationtext>

Is this possible (simply)?


Now, with regard to massaging the current explanation files, I hadn't
noticed that some of the message paragraphs are filled with <a>, <code>
and other tags.  So, I'm not sure how you'd convert the messages to a
text-only format unless you kept separate versions of each message for
SOAP and for the standard validator.  You certainly couldn't keep a
text-only version and reconstitute the HTML from it.

I'm guessing a better solution might be to keep the existing messages
in the current format and, instead, code a processor for SOAP users that
parses them into plain text (or at least more distinct blocks of HTML),
much like app developers now must do.  That way, the fancy parsing rules
are coupled to the messages themselves in the same app -- yours.  Sure,
changes to the message format would require adjusting the SOAP
processing code on the validator -- but it wouldn't require every SOAP
consumer to re-work their apps.  And you would have knowledge of when/if
those changes were going to happen.
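
Such a processor could be sketched as below. It is written in Python purely for illustration (it is not the validator's code, and the class and function names are invented), and it assumes the explanation always has the shape shown earlier: one <p class="helpwanted"> holding the link, followed by ordinary paragraphs.

```python
from html.parser import HTMLParser


class ExplanationSplitter(HTMLParser):
    """Separates the feedback link from the explanation paragraphs."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.feedback_url = None
        self.feedback_text = None
        self.paragraphs = []
        self._in_helpwanted = False
        self._buffer = None  # collects text of the current ordinary paragraph

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "p":
            if "helpwanted" in (attrs.get("class") or "").split():
                self._in_helpwanted = True
            else:
                self._buffer = []
        elif tag == "a" and self._in_helpwanted:
            self.feedback_url = attrs.get("href")
            self.feedback_text = attrs.get("title")

    def handle_endtag(self, tag):
        if tag != "p":
            return  # inline tags such as <code> or <a> are simply dropped
        if self._in_helpwanted:
            self._in_helpwanted = False
        elif self._buffer is not None:
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
            self._buffer = None

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)


def split_explanation(explanation_html: str) -> dict:
    parser = ExplanationSplitter()
    parser.feed(explanation_html)
    parser.close()
    return {"feedback_url": parser.feedback_url,
            "feedback_text": parser.feedback_text,
            "paragraphs": parser.paragraphs}
```

Only paragraphs are handled here; list items and other block elements would need the same treatment, which is part of why doing this once on the validator side is attractive.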

>> Sure I can strip tags but then the whole "Suggest improvements on
>> this error message" bit gets reduced to a '?'
>
> Sounds odd. Are you sure you treat the incoming data as utf-8?

Yeah it's odd.  To be accurate, what is left is '✉' which is the
content of the <a> tag.  So, what I said is correct.


-Chris




Re: Some suggestions for the SOAP api

by Karim A.


Bonjour Olivier, Hi Chris,

Let me add my 2 cents to the discussion I started.

Here's how I see this:

<m:explanation>
      <m:feedback>
        http://validator.w3.org/feedback.html?uri=http%3A%2F%2Fwww.yahoo.com%2F;errmsg_id=76#errormsg
      </m:feedback>

      <m:explanation-content mid="334">
      <![CDATA[
      <p>
      You have used the element named above in your document, but the
      document type you are using does not define an element of that name.
      This error is often caused by:
      </p>
      <ul>
      <li>incorrect use of the "Strict" document type with a document that
      uses frames (e.g. you must use the "Frameset" document type to get
      the "<frameset>" element),</li>
      <li>by using vendor proprietary extensions such as "<spacer>"
      or "<marquee>" (this is usually fixed by using CSS to achieve
      the desired effect instead).</li>
      <li>by using upper-case tags in XHTML (in XHTML attributes and elements
      must be all lower-case).</li>
      </ul>
      ]]>
      </m:explanation-content>
</m:explanation>

i.e. use two children, <m:feedback> and <m:explanation-content>,
instead of the <p class="helpwanted"> and <div class="ve mid-344">.

This, IMHO, is better for a SOAP consumer, since it avoids
all pre-formatting and it separates "semantically" two things:
the real explanation and the feedback link.

Sure, we could add a sort of hack to check.cgi that uses
some regexes to create this result, but it would still be a hack
and not a really maintainable and nice solution.

I'd really like to give a hand, but it's been a long time since
I did anything with Perl, and my skills grow weaker
each day :(

Olivier, so if I understand correctly, there's a bijective
relation between the message given in the m:explanation
and m:messageid?
And let me guess... is the feedback link always a function
of the uri?
So, can we just regenerate the m:explanation element using
a database of messages, and a feedback link using the uri?
Am I right? Won't this change in future releases of the validator?
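
If both guesses hold, a client could rebuild the explanation itself, roughly as in the Python sketch below. This is speculative by design: the link format is simply copied from the example earlier in this message, whether it stays stable across releases is exactly the open question, and the one-entry message table is a stand-in for a real local copy of the explanations.

```python
from urllib.parse import quote

# Stand-in for a local database of explanations keyed by message id.
EXPLANATIONS = {
    "76": "You have used the element named above in your document, but the "
          "document type you are using does not define an element of that name.",
}


def feedback_link(page_uri: str, message_id: str) -> str:
    """Rebuild the feedback link, assuming it depends only on URI and message id."""
    return ("http://validator.w3.org/feedback.html?uri="
            + quote(page_uri, safe="")
            + ";errmsg_id=" + message_id + "#errormsg")


def rebuild_explanation(page_uri: str, message_id: str) -> dict:
    """Combine the locally stored text with a regenerated feedback link."""
    return {
        "feedback": feedback_link(page_uri, message_id),
        "text": EXPLANATIONS.get(message_id, ""),
    }


print(feedback_link("http://www.yahoo.com/", "76"))
```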


Karim
--
http://akoncept.com
Innovate Humanum Est




Re: Some suggestions for the SOAP api

by Chris.

Gmail Directeur wrote:
> i.e. use two children <m:feedback> and <m:explanation-content>
> Instead of the <p class="helpwanted"> and <div class="ve mid-344">
>
> This, IMHO, is better for a SOAP consumer, since it avoids
> all pre-formatting. and it separates "semantically" two things:
> the real explanation and the feedback link.

Sounds like we're on the same page here -- looks like we both posted the same solution.

Now that I've thought about it more, here's my refined variation on this same theme:

<m:markupvalidationresponse ...
    ...
    <m:errors>
        <m:errorcount>1</m:errorcount>
        <m:errorlist>
            <m:error>
                ...
                <m:explanation>
                   CDATA as currently (for backwards compatibility)
                </m:explanation>

                <m:explanationfeedbacklink>
                   CDATA with <p class="helpwanted">...</p>
                </m:explanationfeedbacklink>

                <m:explanationcontenthtml>
                   CDATA with <div class="ve mid-344">...</div>
                </m:explanationcontenthtml>

                <m:explanationfeedbackuri>
                   http://validator.w3.org/feedback.html?uri=;errmsg_id=344#errormsg
                </m:explanationfeedbackuri>

                <m:explanationcontenttext>
                   <m:explanationparagraphtext>
                      Plain text paragraph (probably fancy tag strip
                      and parse of m:explanationcontenthtml)
                   </m:explanationparagraphtext>

                   <m:explanationparagraphtext>
                      2nd plain text paragraph (probably fancy tag
                      strip and parse of m:explanationcontenthtml)
                   </m:explanationparagraphtext>

                </m:explanationcontenttext>
            ...

It's a bit elaborate and it does increase the size of the response (though nowhere near the file size of the HTML output version).  But this way:
  * Backwards compatibility is ensured (if you think that matters --
     otherwise, deprecate this one).
  * Raw text messages and the link url for helpwanted are available
     for those that just want raw data.
  * HTML versions of just the helpwanted and content are separately
     available for those that want to keep all the markup within the
     messages.


-Chris

Re: Some suggestions for the SOAP api

by Karim A.


Hi Chris,

Yes, we're on the same wavelength (literally translated from French ;-) )
Well, almost, actually, since I'm not really OK with the
<m:explanationparagraphtext> elements, for they're superfluous IMHO.

I mean the explanation (in its HTML format) is ONE entity by itself,
and I'm sure the W3C will do what it can to make that
HTML data semantically and syntactically correct, won't you? ;-)
So once we have data (formatted in HTML, OK, but well-formed),
it can be processed.

I mean there's no harm in having HTML data; it's after all a richer
format than raw text (since it contains some semantics:
paragraphs and lists...), just as text data is more elaborate than
its ASCII byte representation ;-)

Another thing:

<m:explanationcontenthtml>
   CDATA with <div class="ve mid-344">...</div>
</m:explanationcontenthtml>

I don't think that including <div class=...> in this element
is really required; I could and would use my own classes and
even something other than a "div".
If the mid-344 word (phrase or something) has a "meaning",
a semantic value, then I'd suggest making it an attribute
of the <m:explanationcontenthtml> element.
So this is my suggestion for this:

<m:explanationcontenthtml mid="344">
   CDATA **without** <div class="ve mid-344">...</div>
</m:explanationcontenthtml>

Your idea of keeping backward compatibility is nice too,
but I confess that during my tests I've stumbled upon some URLs
which have... more than 1000 errors!! No, really! More than that!

Imagine the content duplication, the size of the served response :(

So, I'd happily close my eyes to compatibility, wouldn't you? :)

Does anyone want to keep the old representation? Anyone?
See? No one wants it, let's take it out! ;-)

Thanks for your suggestions and ideas!

Cheers.

Karim
--
http://akoncept.com
Innovate Humanum Est




Re: Some suggestions for the SOAP api

by Chris.

Karim wrote:
> Well, almost actually, since I'm not really ok for the
> <m:explanationparagraphtext> elementS for they're superfluous imho.
>
> I mean the explanation (in its html format) is ONE entity by itself
> and I'm sure, the W3C will make its possible to make that
> html data semantically and "syntaxically" correct, won't you? ;-)
> So once we have data (that's formatted in html ok, but well formatted)
> this can be processed.
>
> I mean there's no harm in having html data, it's after all a format
> more elaborated than raw text (since it contains some semantics :
> paragraphs and lists...) but it's as elaborated as are text data to
> ascii bytes representation ;-)

Yes, it is repetitive, and in many cases (including mine) one could use the html version just fine.  However, there is a danger in parsing and reformatting their HTML -- what if they change the messages and their markup?  If your app was critical enough, you'd have a problem and maybe not even know it.

I don't like parsing rules in my app tied to message formats on theirs.  That's why I say it's safest for them to offer plain text.

Besides, who better to know the parsing rules than the writers of the messages?  And it gets done once on the validator side for everyone instead of 100 coders reinventing the wheel on the client side.

It just seems like the 'right' way to do it.  That said, you could probably talk me out of it and I could live without it.


Karim wrote:
> Another thing:
>
> <m:explanationcontenthtml>
>    CDATA with <div class="ve mid-344">...</div>
> </m:explanationcontenthtml>
>
> I don't think that mentioning <div class=...> in this element
> is that required, I could and would use my own classes and
> even something other than a "div".
> If the mid-344 word (phrase or something) has a "meaning",
> a semantic value, then I'd suggest to make it an attribute
> of the <m:explanationcontenthtml> element.
> So this is my suggestion for this:
>
> <m:explanationcontenthtml mid="344">
>    CDATA **without** <div class="ve mid-344">...</div>
> </m:explanationcontenthtml>

Totally agree.  I just used the <div>...</div> notation to illustrate which content I was referring to.  I too would strip off the container <div>'s (though I imagine they'd have to do just that - strip them off).

Frankly, I'd do the same with the <p> tags wrapping the html in the helpwanted section.

... or maybe the helpwanted stuff would be better served as the url string, and title string:

<m:feedbackurl>
  http://validator.w3.org...
</m:feedbackurl>

and

<m:feedbacktext>
  Suggest improvements on this...
</m:feedbacktext>

Yeah, I like that better.  No html format for this item. We can build our own html links without the goofy '?' and their css styles.
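
For what it's worth, here is roughly what "building our own html links" could look like on the client side. This is only a minimal Python sketch: it assumes the two proposed (not yet existing) fields have already been read into plain strings, and build_feedback_link and its css_class parameter are hypothetical names made up for the illustration.

```python
from html import escape


def build_feedback_link(url: str, text: str, css_class: str = "feedback") -> str:
    """Build an HTML anchor from a separately delivered URL and title string.

    Hypothetical helper: the field values are assumed to come from the
    proposed <m:feedbackurl> and <m:feedbacktext> elements.
    """
    # Escape every value so characters such as & or " cannot break the markup.
    return '<a class="{}" href="{}">{}</a>'.format(
        escape(css_class, quote=True),
        escape(url.strip(), quote=True),
        escape(text.strip()),
    )


link = build_feedback_link(
    "http://validator.w3.org/feedback.html?uri=;errmsg_id=344#errormsg",
    "Suggest improvements on this error message",
)
print(link)
```

The consumer picks the class name and the surrounding markup; nothing about the validator's own styling leaks into the client.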


Karim wrote:
> Your idea of keeping the backward compatibility is nice too,
> but I confess I've stumbled during my tests upon some urls
> which have... more than 1000 errors!! no, really! more than that!
>
> Imagine the content duplication, the size of the served response :(
>
> So, I'd happily close my eyes about compatibility, won't you? :)

First of all, shoot whoever's writing your code.

But I wouldn't cry if backwards compatibility wasn't maintained either -- hey, they refer to this interface as 'experimental'

Of course the best of both worlds could be achieved by calling this spec SOAP version 1.3 and leaving the 1.2 engine in the code as is.


Karim wrote:
> Anyone wants to keep old representation? anyone?
> See? no one wants it, let's take it off! ;-)

Sorry people, I don't know this guy ;)

-Chris

Re: Some suggestions for the SOAP api

by Karim A.


Chris wrote:

> Yes, it is repetitive, and in many cases (including mine) one could use the
> html version just fine.  However, there is a danger in parsing and
> reformatting their HTML -- what if they change the messages and their
> markup?  If your app was critical enough, you'd have a problem and maybe not
> even know it.
>
> I don't like parsing rules in my app tied to message formats on theirs.
> That's why I say it's safest for them to offer plain text.

Sure, the parsing rules should be outside the code, that's
why I prefer xslt.

But if we follow the idea of text elements, we'll lose the
"semantic" information carried with the html data that the
current soap12 api provides.

Example:
<div class="ve mid-76">
    <p>
      You have used the element named above in your document, but the
      document type you are using does not define an element of that name.
      This error is often caused by:
    </p>
    <ul>
      <li>incorrect use of the "Strict" document type with a document that
      uses frames (e.g. you must use the "Frameset" document type to get
      the "<frameset>" element),</li>
      <li>by using vendor proprietary extensions such as "<spacer>"
      or "<marquee>" (this is usually fixed by using CSS to achieve
      the desired effect instead).</li>
      <li>by using upper-case tags in XHTML (in XHTML attributes and elements
      must be all lower-case).</li>
    </ul>
  </div>

Inside this <div class="ve mid-76"> which is extracted from the
m:explanation of a soap response, we have <p> and <ul>
and they both have a different "meaning": one is a paragraph,
the other is an unordered list.

If we use <m:explanationcontenttext> as you suggested,
we'll lose this "difference" between text parts of the explanation.

So, what to do? create different <m:explanationcontenttext> ?
Surely not! We'll reinvent HTML then.
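
To make the point concrete, here is a small Python sketch (standard library only) that reads a fragment like the one above and keeps the kind of each block next to its text. The class name BlockCollector and the shortened sample fragment are made up for the illustration; it is not how the validator or any existing client does it.

```python
from html.parser import HTMLParser


class BlockCollector(HTMLParser):
    """Collect (tag, text) pairs for paragraph and list-item blocks."""

    BLOCKS = {"p", "li"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.blocks = []   # finished (tag, text) pairs
        self._tag = None   # block currently being read, if any
        self._parts = []   # text chunks of the current block

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCKS:
            self._tag, self._parts = tag, []

    def handle_endtag(self, tag):
        if tag == self._tag:
            # Collapse runs of whitespace left over from the indentation.
            text = " ".join("".join(self._parts).split())
            self.blocks.append((tag, text))
            self._tag = None

    def handle_data(self, data):
        if self._tag is not None:
            self._parts.append(data)


# Shortened, hypothetical stand-in for an m:explanation payload.
fragment = """
<div class="ve mid-76">
  <p>This error is often caused by:</p>
  <ul>
    <li>incorrect use of the "Strict" document type,</li>
    <li>by using upper-case tags in XHTML.</li>
  </ul>
</div>
"""

collector = BlockCollector()
collector.feed(fragment)
blocks = collector.blocks

# The flattened version keeps the words but loses which parts were list items.
flat = " ".join(text for _, text in blocks)
print(blocks)
print(flat)
```

With the html kept, the consumer can still tell the introductory paragraph from the list items; with flat text only, that distinction is gone.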

The point is that I think that these explanations are small
and I'm sure they're valid (I mean by valid: it's XML and
each element (p, li, ol, ul, dt, dd, dl) has its right meaning).

I just guess, and hope I'm right: m:explanation text is
nothing but a textual/readable content intended for "human" use.

Isn't it?

Karim
--
http://akoncept.com
Innovate Humanum Est


Re: Some suggestions for the SOAP api

by Chris.

Karim A. wrote:
> Example:
> <div class="ve mid-76">
>     <p>
>       You have used the element named above in your document, but the
>       document type you are using does not define an element of that name.
>       This error is often caused by:
>     </p>
>     <ul>
>       <li>incorrect use of the "Strict" document type with a document that
>       uses frames (e.g. you must use the "Frameset" document type to get
>       the "<frameset>" element),</li>
>       <li>by using vendor proprietary extensions such as "<spacer>"
>       or "<marquee>" (this is usually fixed by using CSS to achieve
>       the desired effect instead).</li>
>       <li>by using upper-case tags in XHTML (in XHTML attributes and elements
>       must be all lower-case).</li>
>     </ul>
>   </div>
>
> Inside this <div class="ve mid-76"> which is extracted from the
> m:explanation of a soap response, we have <p> and <ul>
> and they both have a different "meaning": one is a paragraph,
> the other is an unordered list.

Ahh, I hadn't found an error message with a bulleted list (guess I need pages with more errors like yours).

Seeing that, I agree -- nix the plain text.


And while we're on the New, Improved SOAP 1.3 interface, could we also look into:

1.)  Adding the general warning text.  For instance:

    <m:warnings>
        <m:warningcount>2</m:warningcount>
        <m:warninginfo>
            <![CDATA[
                 The following missing or conflicting information caused the validator to perform guesswork prior to validation. If the guess or fallback is incorrect, it may make validation results entirely incoherent. It is <em>highly recommended</em> to check these potential issues, and, if necessary, fix them and re-validate the document.
            ]]>
        </m:warninginfo>
        <m:warninglist>
            ...

In fact, I'd like any content like this one -- they are tied to the operation of the validator and make the rest of the output more useable/understandable.


2.)  In a recent test, I had a document with 2 errors in the SOAP output (catching that number Kamir :-P). But in the HTML output there were 2 Errors AND an Info message all together in the Errors section of the output.

I'm not sure how often these Info messages appear nor how useful they are.  But, if the Validator thinks they're helpful in one interface, why not the SOAP interface too?

-Chris

Re: Some suggestions for the SOAP api

by Karim A.


Chris. wrote:

> 1.)  Adding the general warning text.  For instance:
>
>     <m:warnings>
>         <m:warningcount>2</m:warningcount>
>         <m:warninginfo>
>             <![CDATA[
>                  The following missing or conflicting information caused the
> validator to perform guesswork prior to validation. If the guess or fallback
> is incorrect, it may make validation results entirely incoherent. It is
> <em>highly recommended</em> to check these potential issues, and, if
> necessary, fix them and re-validate the document.
>             ]]>
>         </m:warninginfo>
>         <m:warninglist>
>             ...

Yes I agree, that would be nice and provide a kind
of "prologue" to the warnings list.

> 2.)  In a recent test, I had a document with 2 errors in the SOAP output
> (catching that number Kamir :-P). But in the HTML output there were 2 Errors
> AND an Info message all together in the Errors section of the output.
>
> I'm not sure how often these Info messages appear nor how useful they are.
> But, if the Validator thinks they're helpful in one interface, why not the
> SOAP interface too?

Definitely agree! The SOAP response, to paraphrase Olivier, is
supposed to give at least if not more than the same info
that the validator returns in its default output (html)

Would you please give us this URL?

I'm testing several urls for my still-in-alpha-stage app
and I'd like to see and treat such cases too :)
And oh, yes, I've found some funny stuff:
e.g. zeldman's website wasn't valid for a long time and
the W3C home page still has 53 CSS warnings ;-)


Karim
--
http://akoncept.com
Innovate Humanum Est


Re: Some suggestions for the SOAP api

by olivier Thereaux


Hi Chris, all.

On Oct 11, 2007, at 09:00 , Chris Parrish wrote:
> Bummer.  Well, as a compromise, are the explanation messages (in HTML)
> separate from the helpwanted links?

Yes, they are, the helpwanted could indeed be removed altogether from  
soap output.

Regarding your suggestions for the output, things like:
>   <m:explanationparagraphtext>
... sound a bit overkill. What's the gain between this and a <p>?

As you write elsewhere:
> Yes, it is repetitive, and in many cases (including mine) one could  
> use the
> html version just fine.
...
>  what if they change the messages and their markup?  If your app  
> was critical enough, you'd have a problem and maybe not even know it.

If the soap markup is generated from the message, and the messages  
change, the soap output would change, too.

--
olivier






Re: Some suggestions for the SOAP api

by Chris.

olivier Thereaux wrote:
> On Oct 11, 2007, at 09:00 , Chris Parrish wrote:
>> Bummer.  Well, as a compromise, are the explanation messages (in HTML)
>> separate from the helpwanted links?
>
> Yes, they are, the helpwanted could indeed be removed altogether from soap output.

Yeah, I'd like to remove it from the explanation (although I'm fine to move it elsewhere rather than just kill it -- as you'll see in my following post).

>>  what if they change the messages and their markup?  If your app was critical enough, you'd have a problem and maybe not even know it.
>
> If the soap markup is generated from the message, and the messages change, the soap output would change, too.
>

The point here, Olivier, is that if you deliver me something in the form of:

  <p>
    Explanation element 1 text
  </p>
  <p>
    Explanation element 2 text
  </p>

And I write a routine to parse the <p>'s and then you guys change the explanation messages to the format of:

  <div>
    <p>
      <ul>
        <li>Explanation element 1 text</li>
        <li>Explanation element 2 text</li>
      </ul>
    </p>
  </div>

Then my parsing code would no longer work.  Worse still, I might not know that my app is broken -- it might just be spitting out garbled output to users.

However, Karim drew my attention to the fact that there is a bunch of necessary markup inside your explanations (like internal bullet lists, emphasis, even links) that I don't want to lose.  So, I'm OK with dropping the request for plain-text only explanations.

That said, the goal should be to reduce the current explanation text to its core elements as far as possible.  For instance, the current text:

    <div class="ve mid-344">
    <p>
        The checked page did not contain a document type ("DOCTYPE") declaration.
        The Validator has tried to validate with a fallback DTD,
        but this is quite likely to be incorrect and will generate a large number
        of incorrect error messages. It is highly recommended that you insert the
        proper DOCTYPE declaration in your document -- instructions for doing this
        are given above -- and it is necessary to have this declaration before the
        page can be declared to be valid.
    </p>
  </div>

should probably have the <div>'s stripped off.  
  * That way, I can stick the explanation in my own structure and apply my own class and styles.
  * The semantic grouping of all the paragraphs together via <div>'s is redundant -- the <m:explanation> already covers that.
  * Future changes to the message structure are less likely to cause any issues with my code since I'm only dealing with the essentials of the explanation and not formatting elements (in the message-change example above, you'd change the SOAP output to only include the <ul>'s and their content).
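
Until something like that happens on the validator side, a client can drop the wrapper itself. Below is a rough Python sketch using only the standard library; WrapperStripper and strip_wrapper are names invented for this example, and the sketch simply drops every top-level <div> it meets, which is an assumption about how the explanations are wrapped rather than anything the validator promises.

```python
from html.parser import HTMLParser


class WrapperStripper(HTMLParser):
    """Re-emit an HTML fragment without its top-level <div> wrapper(s)."""

    def __init__(self):
        # Keep entity references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.div_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.div_depth += 1
            if self.div_depth == 1:
                return  # drop the outermost <div ...>
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "div":
            self.div_depth -= 1
            if self.div_depth == 0:
                return  # drop the matching outermost </div>
        self.out.append("</%s>" % tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are passed through unchanged.
        self.out.append(self.get_starttag_text())

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)


def strip_wrapper(fragment: str) -> str:
    """Return the fragment with top-level <div> containers removed."""
    parser = WrapperStripper()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out).strip()


wrapped = '<div class="ve mid-344">\n  <p>No <em>DOCTYPE</em> &amp; fallback.</p>\n</div>'
print(strip_wrapper(wrapped))
```

Because only the container is touched, a later change from <p> to <ul> inside the explanation passes straight through without the client having to know about it.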

I'll post a follow-up message to this one outlining my suggestions for revising the output (that way I can clarify what I'm saying here and address some other issues I've found).

-Chris





Re: Some suggestions for the SOAP api

by Chris.

Ok, so after all this discussion, I have some recommendations for significantly improving the output of your SOAP engine.  The improvements include:

1.)  Current error <m:explanation> text includes the core explanation bundled with other "stuff" -- this should be separated into its core elements.

2.)  The <m:warning> elements don't include any <m:explanation> (even though the web API uses this).  This should be added.

3.)  The web API includes a summary message for all warnings: "The following missing or conflicting information caused the validator to perform guesswork prior to validation. If the guess or fallback is incorrect, it may make validation results entirely incoherent. It is highly recommended to check these potential issues, and, if necessary, fix them and re-validate the document."  I have no idea whether there is only one version of this message but it should be included in the SOAP output.  It gives context to the warnings and if the w3 sees fit to revise this message with future validator versions, I want the revised text.

4.)  The web API includes 'info' elements.  So far, I have noticed them woven in with the errors but I "think" I might have seen one once in the warnings section.  I am not including these in my example because I don't know if they are their own category or just a "type" of error and warning (Olivier -- I could use some help here).  Anyway, if they're useful to web API users, I want 'em too.

So, excluding #4 above -- until I know more, here's what I recommend:

  <m:errors>
    <m:errorcount>2</m:errorcount>
    <m:errorlist>
      <m:error>
        <m:line>1</m:line>
        <m:col>0</m:col>
        <m:message>no document type declaration; implying "<!DOCTYPE HTML SYSTEM>"</m:message>
        <m:messageid>344</m:messageid>
+       <m:feedbackurl>http://validator.w3.org/feedback.html?uri=;errmsg_id=344#errormsg</m:feedbackurl>
+       <m:feedbacktext>Suggest improvements on this error message through our feedback channels</m:feedbacktext>
        <m:explanation>
          <![CDATA[
~           <p>
~             The checked page did not contain a document type ("DOCTYPE") declaration.
~             The Validator has tried to validate with a fallback DTD,
~             but this is quite likely to be incorrect and will generate a large number
~             of incorrect error messages. It is highly recommended that you insert the
~             proper DOCTYPE declaration in your document -- instructions for doing this
~             are given above -- and it is necessary to have this declaration before the
~             page can be declared to be valid.
~           </p>
          ]]>
        </m:explanation>
        <m:source><![CDATA[<strong title="Position where error was detected."><</strong>html>]]></m:source>
      </m:error>
 
        ...
 
    </m:errorlist>
  </m:errors>
 
  <m:warnings>
    <m:warningcount>2</m:warningcount>
+   <m:warningoverview>
+     <![CDATA[
+       <p>
+         The following missing or conflicting information caused the validator to perform
+         guesswork prior to validation. If the guess or fallback is incorrect, it may make
+         validation results entirely incoherent. It is <em>highly recommended</em> to
+         check these potential issues, and, if necessary, fix them and re-validate the
+         document.
+       </p>
+     ]]>
+   </m:warningoverview>
    <m:warninglist>
      <m:warning>
        <m:messageid>W06</m:messageid>
        <m:message>Unable to Determine Parse Mode!</m:message>
+       <m:explanation>
+         <![CDATA[
+           <p>The validator can process documents either as XML (for document types such as XHTML, SVG, etc.) or SGML (for HTML 4.01 and prior versions). For this document, the information available was not sufficient to determine the parsing mode unambiguously, because:</p>
+           <ul>
+             <li>in <em>Direct Input</em> mode, no MIME Media Type is served to the validator</li>
+             <li>No known Document Type could be detected</li>
+             <li>No XML declaration (<abbr>e.g</abbr> <code><?xml version="1.0"?></code>) could be found at the beginning of the document.</li>
+           </ul>
+           <p>As a default, the validator is falling back to SGML mode.</p>
+         ]]>
+       </m:explanation>
      </m:warning>
 
        ...
 
    </m:warninglist>
  </m:warnings>

Lines marked with (+) are new, lines marked with (~) are changed.  The remaining question is: Where do the info notices go?
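
To show how a client might consume the proposed layout, here is a minimal Python sketch with xml.etree.ElementTree. Several things in it are assumptions: the namespace URI bound to the m: prefix is what I believe the validator uses but should be checked against a real response, the <m:feedbackurl> and <m:feedbacktext> elements exist only in this proposal, the sample document is a shortened stand-in, and read_errors is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for the m: prefix; verify against an actual SOAP response.
NS = {"m": "http://www.w3.org/2005/10/markup-validator"}

# Shortened, hypothetical sample following the proposed layout.
sample = """<m:errors xmlns:m="http://www.w3.org/2005/10/markup-validator">
  <m:errorcount>1</m:errorcount>
  <m:errorlist>
    <m:error>
      <m:line>1</m:line>
      <m:col>0</m:col>
      <m:message>no document type declaration</m:message>
      <m:messageid>344</m:messageid>
      <m:feedbackurl>http://validator.w3.org/feedback.html?uri=;errmsg_id=344#errormsg</m:feedbackurl>
      <m:feedbacktext>Suggest improvements on this error message</m:feedbacktext>
      <m:explanation><![CDATA[<p>The checked page did not contain a DOCTYPE declaration.</p>]]></m:explanation>
    </m:error>
  </m:errorlist>
</m:errors>"""

FIELDS = ("line", "col", "message", "messageid",
          "feedbackurl", "feedbacktext", "explanation")


def read_errors(xml_text):
    """Return one dict per <m:error>; absent fields are stored as None."""
    root = ET.fromstring(xml_text)
    records = []
    for err in root.iterfind(".//m:error", NS):
        record = {}
        for name in FIELDS:
            value = err.findtext("m:" + name, default=None, namespaces=NS)
            record[name] = value.strip() if value is not None else None
        records.append(record)
    return records


errors = read_errors(sample)
for e in errors:
    print(e["messageid"], e["feedbackurl"])
    print(e["explanation"])
```

The explanation arrives as a ready-to-embed html fragment, and the feedback link parts arrive as plain strings, so neither needs to be picked apart by the client.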

-Chris

Re: Some suggestions for the SOAP api

by Chris.

Karim A. wrote:
> Definitely agree! The SOAP response, to paraphrase Olivier, is
> supposed to give at least if not more than the same info
> that the validator returns in its default output (html)
>
> Would you please give us this URL?

I wasn't testing a live page but sending a "fragment."  The markup being tested was:

  <html>
    <head>
      <title>With no end tag
    </head>
    <body>
      <h1>My H1</h1>
      <p>And, finally, a paragraph</p>
    </body>
  </html>


The output in the web API includes:

  <li class="msg_info">
    Info
    <em>Line 3, Column 4</em>:
    start tag was here.<pre><code class="input">    <strong title="Position where error was detected."><</strong>title>With no end tag</code></pre>
  </li>


This info/notice doesn't show on the SOAP output but is helpful.  My question is whether this is part of the previous error, its own error (perhaps with an "info" attribute) or something else altogether.

I also could have sworn that I saw one of these grouped in with the messages once -- but I have no idea what markup I used to generate that.

-Chris

Re: Some suggestions for the SOAP api

by Karim A.


Bonjour Olivier, Chris, everyone!


On 10/15/07, olivier Thereaux <ot@...> wrote:
> Yes, they are, the helpwanted could indeed be removed altogether from
> soap output.

I don't actually agree with this Olivier, this link has its
purpose and meaning  after all.

If we remove it, we lose the chance to have
more -maybe constructive- feedback from users outside
the W3C validator.
In fact, apps may offer the possibility to help back the W3C.
But IMHO the soap api should offer more flexibility to the
app developer in order to be able to represent, formulate and
design his feedback links and sub-systems the way he
wants, without any formal or style impediments.

> Regarding your suggestions for the output, things like:
> >   <m:explanationparagraphtext>
> ... sound a bit overkill. What's the gain between this and a <p>?

Again, I strongly believe that an HTML content isn't just text wrapped
with tags, and the html explanation the way it is given now has two
sides: the content and its inner semantics.
So, I agree, HTML explanation is richer hence better than flat text.


Karim
--
http://akoncept.com
Innovate Humanum Est


Re: Some suggestions for the SOAP api

by Chris.

Karim A. wrote:
> On 10/15/07, olivier Thereaux <ot@w3.org> wrote:
> > Yes, they are, the helpwanted could indeed be removed altogether from
> > soap output.
>
> I don't actually agree with this Olivier, this link has its
> purpose and meaning after all.
>
> ...
>
> In fact, apps may offer the possibility to help back the W3C.
> But IMHO the soap api should offer more flexibility to the
> app developer in order to be able to represent, formulate and
> design his feedback links and sub-systems the way he
> wants, without any formal or style impediments.

Karim, tell me what you think of the post(s) I made earlier today regarding this very thing.  (In my proposed format I chose to keep the link url and title text -- but each in their own fields for the SOAP consumer).

Karim A. wrote:
> > Regarding your suggestions for the output, things like:
> > >   <m:explanationparagraphtext>
> > ... sound a bit overkill. What's the gain between this and a <p>?
>
> Again, I strongly believe that an HTML content isn't just text wrapped
> with tags, and the html explanation the way it is given now has two
> sides: the content and its inner semantics.
> So, I agree, HTML explanation is richer hence better than flat text.

I agree with you here too (see my most recent posts). But I also made a subtle point (one that you've already made) -- that the w3 should remove any container/styling tags wrapping the explanations.

-Chris