docbook 5, lxml and rng


docbook 5, lxml and rng

by Tim Arnold

Hi, this is a newbie question I'm sure.  I'm trying to validate an example straight out of the docbook 5 documentation (example given on the 'inlineequation' page).  As it stands, the file doesn't pass as valid.
The code:
=======================================
from lxml import etree
import os
# RNGDIR = 'path to docbook.rng'
# XMLDIR = 'path to the xml file'
relaxng_doc = etree.parse(os.path.join(RNGDIR,'docbook.rng'))
relaxng = etree.RelaxNG(relaxng_doc)

doc = etree.parse(os.path.join(XMLDIR,'myfile.xml'))
print relaxng.validate(doc)
=======================================

The xml file:
=======================================
<article xmlns="http://docbook.org/ns/docbook">
<title>Example inlineequation</title>

<para>Einstein's theory of relativity includes one of the most
widely recognized formulas in the world:
<inlineequation>
  <alt>e=mc^2</alt>
  <inlinemediaobject>
    <imageobject>
      <imagedata fileref="figures/emc2.png"/>
    </imageobject>
  </inlinemediaobject>
</inlineequation>
</para>

</article>
=======================================

If I remove the inlineequation subtree, it is valid.
Can someone help me understand what I'm missing?

python 2.5.1
lxml-2.1.2-py2.5-freebsd-6.3

thanks,
--Tim Arnold

_______________________________________________
XML-SIG maillist  -  XML-SIG@...
http://mail.python.org/mailman/listinfo/xml-sig

Re: docbook 5, lxml and rng

by Stefan Behnel-3

Hi,

Tim Arnold wrote:

> Hi, this is a newbie question I'm sure. I'm trying to validate an
> example straight out of the docbook 5 documentation (example given
> on the 'inlineequation' page). As it stands, the file doesn't pass
> as valid.
>
> The code:
> =======================================
> from lxml import etree
> import os
> # RNGDIR = 'path to docbook.rng'
> # XMLDIR = 'path to the xml file'
> relaxng_doc = etree.parse(os.path.join(RNGDIR,'docbook.rng'))
> relaxng = etree.RelaxNG(relaxng_doc)
>
> doc = etree.parse(os.path.join(XMLDIR,'myfile.xml'))
> print relaxng.validate(doc)

What does the validator say about why the document isn't considered valid?
Note that there's an "error_log" property that returns the sequence of
messages collected during validation.

http://codespeak.net/lxml/validation.html#relaxng
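
A minimal sketch of reading that log, assuming a current Python 3 and lxml rather than the 2.5 setup above. The `SCHEMA` and `DOC` strings are hypothetical stand-ins for docbook.rng and myfile.xml, chosen only so that validation fails and the log has something in it:

```python
from io import BytesIO
from lxml import etree

# Hypothetical stand-in for docbook.rng: an <article> holding one <title>.
SCHEMA = b"""\
<element name="article" xmlns="http://relaxng.org/ns/structure/1.0">
  <element name="title"><text/></element>
</element>
"""

# Hypothetical stand-in for myfile.xml: <para> is not allowed by SCHEMA.
DOC = b"<article><title>Example</title><para>text</para></article>"

relaxng = etree.RelaxNG(etree.parse(BytesIO(SCHEMA)))
doc = etree.parse(BytesIO(DOC))

is_valid = relaxng.validate(doc)
print(is_valid)

# error_log holds the messages collected during the last validation run.
for entry in relaxng.error_log:
    print(entry.line, entry.type_name, entry.message)
```

Each log entry also carries fields such as `column` and `level_name`, which can help when the message text alone is ambiguous.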

Stefan


Re: docbook 5, lxml and rng

by Tim Arnold

> -----Original Message-----
> From: Stefan Behnel [mailto:stefan_ml@...]
> Sent: Sunday, May 31, 2009 2:05 AM
> To: Tim Arnold
> Cc: xml-sig@...
> Subject: Re: [XML-SIG] docbook 5, lxml and rng
>
> Hi,
>
> Tim Arnold wrote:
> > Hi, this is a newbie question I'm sure. I'm trying to validate an
> > example straight out of the docbook 5 documentation (example given
> > on the 'inlineequation' page). As it stands, the file doesn't pass
> > as valid.
> >
> > The code:
> > =======================================
> > from lxml import etree
> > import os
> > # RNGDIR = 'path to docbook.rng'
> > # XMLDIR = 'path to the xml file'
> > relaxng_doc = etree.parse(os.path.join(RNGDIR,'docbook.rng'))
> > relaxng = etree.RelaxNG(relaxng_doc)
> >
> > doc = etree.parse(os.path.join(XMLDIR,'myfile.xml'))
> > print relaxng.validate(doc)
>
> What does the validator say about why the document isn't considered valid?
> Note that there's an "error_log" property that returns the sequence of
> messages collected during validation.
>
> http://codespeak.net/lxml/validation.html#relaxng
>
> Stefan
>

Thanks, I should have looked at the documentation more before posting. I see what you're talking about now and I think I might have an explanation of what's going on.
The error_log says:
---------------------
4:0:ERROR:RELAXNGV:RELAXNG_ERR_ELEMWRONG: Did not expect element para there
4:0:ERROR:RELAXNGV:RELAXNG_ERR_ELEMNAME: Expecting element example, got para
4:0:ERROR:RELAXNGV:RELAXNG_ERR_ELEMNAME: Expecting element bridgehead, got para
4:0:ERROR:RELAXNGV:RELAXNG_ERR_EXTRACONTENT: Element para has extra content: text
4:0:ERROR:RELAXNGV:RELAXNG_ERR_ELEMNAME: Expecting element annotation, got para
4:0:ERROR:RELAXNGV:RELAXNG_ERR_CONTENTVALID: Element article failed to validate content
---------------------

But my libxml2 version is 5, which I think means that schematron isn't supported. And the docbook.rng contains some embedded schematron. From the DocBook 5 documentation:
---------------------
If you want to validate against the DocBook 5 RelaxNG schema, then you have to find the right validation tool. The DocBook 5 RelaxNG schema includes embedded Schematron rules to express certain constraints on some content models. For example, a Schematron rule is added to prevent a sidebar element from containing another sidebar. For complete validation, a validator needs to check both the RelaxNG content models and the Schematron rules.
---------------------


Does that make sense?
thanks,
--Tim Arnold


Re: docbook 5, lxml and rng

by Stefan Behnel-3


Tim Arnold wrote:

> my libxml2 version is 5, which I think means that schematron isn't
> supported. And the docbook.rng contains some embedded schematron. From
> the DocBook 5 documentation:
>
> ---------------------
> If you want to validate against the DocBook 5 RelaxNG schema, then you
> have to find the right validation tool. The DocBook 5 RelaxNG schema
> includes embedded Schematron rules to express certain constraints on
> some content models. For example, a Schematron rule is added to prevent
> a sidebar element from containing another sidebar. For complete
> validation, a validator needs to check both the RelaxNG content models
> and the Schematron rules.
> ---------------------

Yes, it looks like libxml2 can't handle Schematron annotations that are
embedded in RelaxNG schemas, even if both languages are supported separately.
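
As a rough illustration of "supported separately", the sketch below runs the two validators as independent passes over the same document. It assumes a current Python 3 and an lxml build with libxml2's Schematron support; `RNG` and `SCH` are hypothetical toy schemas, not extracts from docbook.rng, and `check` is a helper made up for this example:

```python
from io import BytesIO
from lxml import etree

# Toy RelaxNG grammar (hypothetical): a sidebar may hold paragraphs and,
# as far as the grammar alone is concerned, other sidebars.
RNG = b"""\
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start><ref name="sidebar"/></start>
  <define name="sidebar">
    <element name="sidebar">
      <zeroOrMore>
        <choice>
          <element name="para"><text/></element>
          <ref name="sidebar"/>
        </choice>
      </zeroOrMore>
    </element>
  </define>
</grammar>
"""

# Toy standalone Schematron schema (hypothetical) with the nesting rule.
SCH = b"""\
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <pattern id="no-nested-sidebar">
    <rule context="sidebar">
      <assert test="not(.//sidebar)">sidebar must not contain sidebar</assert>
    </rule>
  </pattern>
</schema>
"""

relaxng = etree.RelaxNG(etree.parse(BytesIO(RNG)))
schematron = etree.Schematron(etree.parse(BytesIO(SCH)))

def check(xml_bytes):
    """Return (relaxng_ok, schematron_ok) for one document."""
    doc = etree.parse(BytesIO(xml_bytes))
    return relaxng.validate(doc), schematron.validate(doc)

flat = check(b"<sidebar><para>ok</para></sidebar>")
nested = check(b"<sidebar><sidebar/></sidebar>")
print(flat, nested)
```

The point of the two passes is that the nesting constraint lives only in the Schematron schema, so the RelaxNG pass alone cannot report it.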

Stefan

Re: docbook 5, lxml and rng

by Bill Kinnersley

Stefan Behnel wrote:

> Tim Arnold wrote:
>> my libxml2 version is 5, which I think means that schematron isn't
>> supported. And the docbook.rng contains some embedded schematron. From
>> the DocBook 5 documentation:
>>
>> ---------------------
>> If you want to validate against the DocBook 5 RelaxNG schema, then you
>> have to find the right validation tool. The DocBook 5 RelaxNG schema
>> includes embedded Schematron rules to express certain constraints on
>> some content models. For example, a Schematron rule is added to prevent
>> a sidebar element from containing another sidebar. For complete
>> validation, a validator needs to check both the RelaxNG content models
>> and the Schematron rules.
>> ---------------------
>
> Yes, it looks like libxml2 can't handle Schematron annotations that are
> embedded in RelaxNG schemas, even if both languages are supported separately.

Doesn't that just mean it skips over them?  I don't see how the
error_log entries Tim was getting would implicate Schematron.
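
A small way to check the "skips over them" reading, sketched under the assumption of a current Python 3 and lxml. The grammar below is a hypothetical toy, not docbook.rng; it embeds a Schematron rule in a foreign namespace, the way the DocBook schema does, and then validates a document that breaks that rule but satisfies the content model:

```python
from lxml import etree

# Hypothetical toy grammar with an embedded Schematron pattern. RelaxNG
# treats elements from other namespaces as annotations, so the rule is
# expected to have no effect on RelaxNG validation.
RNG_WITH_RULE = b"""\
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:s="http://www.ascc.net/xml/schematron">
  <start><ref name="sidebar"/></start>
  <define name="sidebar">
    <s:pattern name="Element exclusion">
      <s:rule context="sidebar">
        <s:assert test="not(.//sidebar)">sidebar must not contain sidebar</s:assert>
      </s:rule>
    </s:pattern>
    <element name="sidebar">
      <zeroOrMore>
        <choice>
          <element name="para"><text/></element>
          <ref name="sidebar"/>
        </choice>
      </zeroOrMore>
    </element>
  </define>
</grammar>
"""

relaxng = etree.RelaxNG(etree.fromstring(RNG_WITH_RULE))

# Breaks the embedded Schematron rule, but matches the RelaxNG content model.
nested = etree.fromstring(b"<sidebar><sidebar/></sidebar>")
result = relaxng.validate(nested)
print(result)
print(len(relaxng.error_log))
```

If the schema compiles and the nested document passes, the embedded rule was ignored rather than rejected, which would suggest looking elsewhere for the cause of the errors on `para`.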

Anyway, I believe the RelaxNG schema for DocBook is still quite
experimental.  Both jing and trang choke on it, so perhaps libxml2 may
be forgiven for choking too.
