[MathML3-last-call] Improving MathML internationalization capabilities

by Jirka Kosek

Hi,

I'm writing to you on behalf of the ITS IG (formerly the ITS WG). We have
reviewed the MathML3 LC draft, especially with respect to its
internationalization capabilities and the Best Practices for XML
Internationalization (http://www.w3.org/TR/xml-i18n-bp/).

Although MathML content usually contains only language-neutral
mathematical expressions, there are a few issues related to
internationalization that we would like to see improved before the
specification goes to Recommendation.

1. It should be possible to specify or change the directionality of text
on the mtext element using the dir attribute (the attribute is currently
not allowed there).

For more background information see http://www.w3.org/TR/xml-i18n-bp/#DevDir

2. It should be possible to specify the language of content using the
xml:lang attribute, at least on the mtext and math elements.

For more background information see
http://www.w3.org/TR/xml-i18n-bp/#DevLang

3. It should be possible to specify directionality and language not only
for a whole mtext element but also for parts of the text inside the
element, for example when mtext contains English text with a foreign
phrase inside. This can be easily accomplished by adding a child
element to mtext (e.g. span, mspan, phrase, ...) which can carry dir,
xml:lang and any number of other foreign attributes (e.g. ITS local
markup). Such an element should allow nesting, to handle the rare cases
where several scripts are mixed at the same time.

<math>
...
<mtext xml:lang="en">In Hebrew, the title
     <mspan xml:lang="he"
     dir="rtl">פעילות הבינאום, W3C</mspan>
     means Internationalization Activity, W3C.</mtext>
...
</math>
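
A processor consuming such markup would typically resolve the language of a text run from the nearest xml:lang on the element or one of its ancestors. Below is a minimal Python sketch of that lookup using only the standard library. It assumes the mspan child element proposed above (which is hypothetical, not part of MathML3), and effective_lang is an illustrative helper name, not an existing API.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predeclared XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# mspan is hypothetical: it is the child element proposed in this message.
SAMPLE = (
    '<math><mtext xml:lang="en">In Hebrew, the title '
    '<mspan xml:lang="he" dir="rtl">פעילות הבינאום, W3C</mspan>'
    ' means Internationalization Activity, W3C.</mtext></math>'
)

def effective_lang(root, target):
    """Return the nearest xml:lang on target or its ancestors, else None."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    node = target
    while node is not None:
        lang = node.get(XML_LANG)
        if lang is not None:
            return lang
        node = parents.get(node)
    return None

root = ET.fromstring(SAMPLE)
mtext = root.find("mtext")
mspan = mtext.find("mspan")
print(effective_lang(root, root))
print(effective_lang(root, mtext))
print(effective_lang(root, mspan))
```

The nested element overrides the language of its parent only for its own content; text after the closing tag falls back to the language of mtext.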

4. It should be possible to specify Ruby annotations inside mtext. This
can be accomplished by allowing ruby markup inside mtext:

http://www.w3.org/TR/xml-i18n-bp/#DevRuby


We are looking forward to your response and wish you success with your
important specification.

                        Jirka Kosek
                        on behalf of ITS IG
                        http://www.w3.org/International/its/ig/


--
------------------------------------------------------------------
  Jirka Kosek      e-mail: jirka@...      http://xmlguru.cz
------------------------------------------------------------------
       Professional XML consulting and training services
  DocBook customization, custom XSLT/XSL-FO document processing
------------------------------------------------------------------
 OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 member
------------------------------------------------------------------




Re: [MathML3-last-call] Improving MathML internationalization capabilities

by David Carlisle




Jirka, ITS IG,

Thank you for your comments.


> Although MathML content usually contains only language-neutral
> mathematical expressions, there are a few issues related to
> internationalization that we would like to see improved before the
> specification goes to Recommendation.
>
> 1. It should be possible to specify or change the directionality of text
> on the mtext element using the dir attribute (the attribute is currently
> not allowed there).
>
> For more background information see http://www.w3.org/TR/xml-i18n-bp/#DevDir


Our thinking here was that providing dir to set the layout direction
and the initial text direction, together with the Unicode Bidi algorithm
handling text runs, would be sufficient.
In the very rare edge cases where you need to set the initial direction
of text you can use <mrow dir="rtl"><mtext>....</mtext></mrow>; however,
adding dir to mtext would make this a little less verbose, so we
propose to add dir to the attributes shared by all token elements.
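
As background on why the Bidi algorithm usually suffices: letters carry a strong direction of their own, so runs of Hebrew or Latin text order themselves, while neutral characters such as spaces and commas take their direction from context, which is where the initial direction can matter. The rough Python sketch below only inspects the Bidi_Class property of each character with the standard unicodedata module; it does not implement the Bidi algorithm, and the helper names are illustrative.

```python
import unicodedata

def bidi_classes(text):
    """Return (character, Bidi_Class) pairs for each character in text."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]

def has_strong_rtl(text):
    """True if text contains a strongly right-to-left character (R or AL)."""
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)

# Two Hebrew letters, a comma, a space, then "W3C".
sample = "\u05d0\u05d1, W3C"
for ch, cls in bidi_classes(sample):
    print(repr(ch), cls)
print(has_strong_rtl(sample))
```

The comma and space between the Hebrew and Latin runs are not strong characters, so their placement depends on the surrounding text and on the initial direction in effect.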


> 2. It should be possible to specify the language of content using the
> xml:lang attribute, at least on the mtext and math elements.
>
> For more background information see
> http://www.w3.org/TR/xml-i18n-bp/#DevLang


This is already allowed, as xml:anything is allowed on every MathML element.
Any namespaced attribute is allowed, and of course the xml: attributes
are particularly easy as the xml namespace is pre-declared. We will
add a sentence to the specification, where it discusses common
attributes, highlighting this and giving xml:lang as an example.

This comment has also highlighted that while the current RelaxNG and
XSD schemas allow xml: attributes, the DTD does not. This is a bug in
the Relax to DTD conversion, which will be fixed. The DTD is
non-normative and not in TR space, so we can fix this in place,
possibly this week.

>
> 3. It should be possible to specify directionality and language not only
> for a whole mtext element but also for parts of the text inside the
> element, for example when mtext contains English text with a foreign
> phrase inside. This can be easily accomplished by adding a child
> element to mtext (e.g. span, mspan, phrase, ...) which can carry dir,
> xml:lang and any number of other foreign attributes (e.g. ITS local
> markup). Such an element should allow nesting, to handle the rare cases
> where several scripts are mixed at the same time.
>
>
> 4. It should be possible to specify Ruby annotations inside mtext. This
> can be accomplished by allowing ruby markup inside mtext:
>
> http://www.w3.org/TR/xml-i18n-bp/#DevRuby
>

Comments 3 and 4 are related: both propose extending the content
model of mtext.

There are competing pressures to allow markup inside mtext for all
sorts of reasons, and allowing MathML-specific markup would complicate
this extension point greatly. Chapter 6 currently states that if you
are using a compound document format, with MathML embedded in some
larger document type, you are advised to open up token elements to
allow foreign namespaced elements. So in XHTML+MathML you could allow
XHTML spans and ruby markup. If there were MathML-specific inline
markup as well, this would complicate the interaction. Similarly, in
MathML+DocBook one would want to use DocBook inline elements for
marking up text, not MathML ones.

We plan to revise the text in chapter 6

http://www.w3.org/TR/MathML3/chapter6.html#world-int-combine-other

to make this clearer and could add Ruby as an example here.

As an alternative to allowing XHTML+Ruby inside mtext via an extended
schema, as discussed in section 6.4, one could use an XHTML+Ruby
annotation in the unextended schema.

<semantics>
  <mtext>basic fallback text</mtext>
  <annotation-xml encoding=...>
   <span xmlns="http://www.w3.org/1999/xhtml">
     ... xhtml + Ruby markup ...
   </span>
  </annotation-xml>
</semantics>
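
To illustrate how a consumer might treat this structure, here is a small Python sketch, under stated assumptions, that picks the rich annotation when the application supports its encoding and otherwise falls back to the first child of semantics. The encoding value "application/xhtml+xml" is an assumption made for the sketch (the example above leaves it elided), and pick_content is an illustrative helper, not part of any MathML library.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
M = "{" + MATHML_NS + "}"

# The encoding value below is an assumption made for this sketch.
SAMPLE = """
<semantics xmlns="http://www.w3.org/1998/Math/MathML">
  <mtext>basic fallback text</mtext>
  <annotation-xml encoding="application/xhtml+xml">
    <span xmlns="http://www.w3.org/1999/xhtml">rich <em>text</em></span>
  </annotation-xml>
</semantics>
"""

def pick_content(semantics, supported_encodings):
    """Return the first annotation-xml child whose encoding is supported,
    or the first child of semantics (the fallback) otherwise."""
    for ann in semantics.findall(M + "annotation-xml"):
        if ann.get("encoding") in supported_encodings:
            return ann
    return semantics[0]

semantics = ET.fromstring(SAMPLE)
plain = pick_content(semantics, set())
rich = pick_content(semantics, {"application/xhtml+xml"})
print(plain.tag, plain.text)
print(rich.tag)
```

A system that cannot handle structured text simply passes an empty set and always receives the plain mtext.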

We hope you will agree that these two mechanisms provide the
required functionality here.


David Carlisle
For the Math WG

________________________________________________________________________
The Numerical Algorithms Group Ltd is a company registered in England
and Wales with company number 1249803. The registered office is:
Wilkinson House, Jordan Hill Road, Oxford OX2 8DR, United Kingdom.

This e-mail has been scanned for all viruses by Star. The service is
powered by MessageLabs.
________________________________________________________________________


Re: [MathML3-last-call] Improving MathML internationalization capabilities

by Jirka Kosek

Hi David,

Many thanks for the very prompt reply.

> In the very rare edge cases where you need to set the initial direction
> of text you can use <mrow dir="rtl"><mtext>....</mtext></mrow>; however,
> adding dir to mtext would make this a little less verbose, so we
> propose to add dir to the attributes shared by all token elements.

Sounds great.

>> 2. It should be possible to specify the language of content using the
>> xml:lang attribute, at least on the mtext and math elements.
>>
>> For more background information see
>> http://www.w3.org/TR/xml-i18n-bp/#DevLang
>
> This is already allowed, as xml:anything is allowed on every MathML element.
> Any namespaced attribute is allowed, and of course the xml: attributes
> are particularly easy as the xml namespace is pre-declared.

Are you sure that xml:* is really allowed? Unless I have missed something,
the RELAX NG schema for MathML defines the pattern used to allow
foreign attributes on MathML elements as

NonMathMLAtt = attribute (* - (local:*|xml:*)) {xsd:string}

and as you can see all xml:* attributes are explicitly excluded here.
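
To make the effect of that name class concrete, here is a small Python sketch that models it as a test on an attribute's namespace. It assumes the schema binds the prefix "local" to the empty namespace; the ITS namespace URI is used only as an example of a foreign attribute, and matches_non_mathml_att is an illustrative name.

```python
XML_NS = "http://www.w3.org/XML/1998/namespace"
ITS_NS = "http://www.w3.org/2005/11/its"  # example foreign namespace
LOCAL_NS = ""  # assumption: the schema binds "local" to the empty namespace

def matches_non_mathml_att(namespace):
    """Model the name class  * - (local:*|xml:*)  by attribute namespace."""
    return namespace not in (LOCAL_NS, XML_NS)

print(matches_non_mathml_att(ITS_NS))    # an attribute such as its:translate
print(matches_non_mathml_att(XML_NS))    # an attribute such as xml:lang
print(matches_non_mathml_att(LOCAL_NS))  # an unprefixed attribute
```

Under this model a foreign attribute in the ITS namespace matches the pattern, while xml:lang does not, so it would have to be permitted by some other pattern in the schema.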

> This comment has also highlighted that while the current RelaxNG and
> XSD schemas allow xml: attributes, the DTD does not. This is a bug in
> the Relax to DTD conversion, which will be fixed. The DTD is
> non-normative and not in TR space, so we can fix this in place,
> possibly this week.

Yep, producing reasonable DTDs in the age of namespaces is a real pain :-(

> Comments 3 and 4 are related: both propose extending the content
> model of mtext.
>
> There are competing pressures to allow markup inside mtext for all
> sorts of reasons, and allowing MathML-specific markup would complicate
> this extension point greatly. Chapter 6 currently states that if you
> are using a compound document format, with MathML embedded in some
> larger document type, you are advised to open up token elements to
> allow foreign namespaced elements. So in XHTML+MathML you could allow
> XHTML spans and ruby markup. If there were MathML-specific inline
> markup as well, this would complicate the interaction. Similarly, in
> MathML+DocBook one would want to use DocBook inline elements for
> marking up text, not MathML ones.

This sounds reasonable. Shouldn't the token.content pattern then
explicitly allow any non-MathML content by default, so that MathML
fragments with embedded XHTML/DocBook/whatever are valid against the base
MathML schema, and not only against a specific schema derived from it?

> We plan to revise the text in chapter 6
>
> http://www.w3.org/TR/MathML3/chapter6.html#world-int-combine-other
>
> to make this clearer and could add Ruby as an example here.

Excellent.

> We hope you will agree that these two mechanisms provide the
> required functionality here.

Indeed. I think that once the issue with the NonMathMLAtt pattern in the
schema and xml:lang is resolved, the comments from the ITS IG can be
treated as resolved.

Have a nice day,

                                Jirka


Re: [MathML3-last-call] Improving MathML internationalization capabilities

by David Carlisle


> Are you sure that xml:* is really allowed?

Sorry. That is a regression. Thanks for catching this. It was allowed in
a previous version and should be allowed on all elements. The xml
namespace was omitted from that production because it was allowed by other
productions in the schema. There were problems (as I recall) with ID
types from xml:id being declared, and so the schema was refactored a bit,
but as you correctly point out, the current state is that xml:
attributes are not allowed at all according to the schema. This was
unintentional and will be fixed. I'm sorry for the misinformation in the
first reply.

> This sounds reasonable. Shouldn't the token.content pattern then
> explicitly allow any non-MathML content by default, so that MathML
> fragments with embedded XHTML/DocBook/whatever are valid against the base
> MathML schema, and not only against a specific schema derived from it?

MathML is also used in many systems that are not going to be able to cope
with structured text (computer algebra systems, for example). Keeping the
content model for mtext as plain text unless the schema is explicitly
extended means that a MathML fragment that validates against the
normative schema is maximally portable. We hope to make it as easy as
possible for people to extend the schema. For example, if it is thought
desirable, we could distribute explicit examples such as a schema with
the content model opened to allow any foreign namespaced elements, or a
specific XHTML+MathML version that allows inline XHTML markup.
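
As a rough illustration of the portability point, a tool could check whether a fragment stays within the unextended content model by looking for foreign (non-MathML) child elements inside mtext. The Python sketch below does only that; it is not a substitute for schema validation, and mtext_with_foreign_markup is an illustrative name.

```python
import xml.etree.ElementTree as ET

M = "{http://www.w3.org/1998/Math/MathML}"

def mtext_with_foreign_markup(root):
    """Return the mtext elements that have a child outside the MathML namespace."""
    return [
        el for el in root.iter(M + "mtext")
        if any(not child.tag.startswith(M) for child in el)
    ]

plain = ET.fromstring(
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    '<mtext>if x is odd</mtext></math>'
)
mixed = ET.fromstring(
    '<math xmlns="http://www.w3.org/1998/Math/MathML"><mtext>if '
    '<em xmlns="http://www.w3.org/1999/xhtml">x</em> is odd</mtext></math>'
)
print(len(mtext_with_foreign_markup(plain)))
print(len(mtext_with_foreign_markup(mixed)))
```

A fragment for which the list is empty can be handed to systems that only understand plain-text tokens; the second fragment needs an extended schema.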

David





Re: [MathML3-last-call] Improving MathML internationalization capabilities

by David Carlisle



Jirka,

> Are you sure that xml:* is really allowed?

I think I have fixed this. It turned out to be rather more delicate than I
thought: fixing the RelaxNG was easy, but that broke the special-case
simplifications I was using before applying trang to get the DTD.
However, some test files with xml:lang and xml:space now validate
with both Relax and DTD. We are building up the MathML3 test suite to
get better coverage, and we'll be validating the test suite against the
Relax, XSD and DTD schemas. I will add some xml:* attribute tests to the
test suite to check that this doesn't regress.

I have checked in the updated schemas in public space, along with a draft
with an updated appendix A.

http://monet.nag.co.uk/~dpc/draft-spec/appendixa.html#parsing_NonMathMLAtt


Just to confirm, for those not interested in arcane schema details,
xml:* (and in particular, xml:lang) is (once again) allowed on every
MathML element.

Thanks again for catching this.

David

PS

For those who _are_ interested in arcane schema details....



Basically the change was to allow any attributes not in the null
namespace or the MathML namespace (rather than not in the null namespace
or the xml namespace), as shown in the editors' draft above.
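
The revised rule can be sketched in Python as a filter over the attributes of a parsed element. This is only a model of the rule as described in the sentence above, not the schema itself, and the helper names are illustrative.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

def attribute_namespace(key):
    """Split an ElementTree attribute key '{ns}local' into its namespace ('' if none)."""
    return key[1:].split("}", 1)[0] if key.startswith("{") else ""

def foreign_attributes(element):
    """Attribute keys matched by the revised rule: not in the null namespace
    and not in the MathML namespace."""
    return [
        key for key in element.attrib
        if attribute_namespace(key) not in ("", MATHML_NS)
    ]

mtext = ET.fromstring(
    '<mtext xmlns="http://www.w3.org/1998/Math/MathML" '
    'xml:lang="en" mathvariant="italic">text</mtext>'
)
print(foreign_attributes(mtext))
```

Here xml:lang is matched as a foreign attribute, while the unprefixed mathvariant attribute is left to the element's own attribute declarations.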

However this caused a bit of refactoring to be necessary and the
following files changed in our internal sources:


$ cvs commit -m "xml namespace attributes"

Checking in makedtd;
/w3ccvs/WWW/Math/Group/RelaxNG/mathml3/makedtd,v  <--  makedtd
new revision: 1.8; previous revision: 1.7
done
Checking in mathml3-common.rnc;
/w3ccvs/WWW/Math/Group/RelaxNG/mathml3/mathml3-common.rnc,v  <--  mathml3-common.rnc
new revision: 1.35; previous revision: 1.34
done
Checking in mathml3-common.rng;
/w3ccvs/WWW/Math/Group/RelaxNG/mathml3/mathml3-common.rng,v  <--  mathml3-common.rng
new revision: 1.33; previous revision: 1.32
done
Checking in mathml3-presentation.rnc;
/w3ccvs/WWW/Math/Group/RelaxNG/mathml3/mathml3-presentation.rnc,v  <--  mathml3-presentation.rnc
new revision: 1.53; previous revision: 1.52
done
Checking in mathml3-presentation.rng;
/w3ccvs/WWW/Math/Group/RelaxNG/mathml3/mathml3-presentation.rng,v  <--  mathml3-presentation.rng
new revision: 1.36; previous revision: 1.35
done
Checking in rngdtd.xsl;
/w3ccvs/WWW/Math/Group/RelaxNG/mathml3/rngdtd.xsl,v  <--  rngdtd.xsl
new revision: 1.7; previous revision: 1.6
done
Checking in dtd/mathml3.dtd;
/w3ccvs/WWW/Math/Group/RelaxNG/mathml3/dtd/mathml3.dtd,v  <--  mathml3.dtd
new revision: 1.27; previous revision: 1.26
done
Checking in xsd/mathml3-common.xsd;
/w3ccvs/WWW/Math/Group/RelaxNG/mathml3/xsd/mathml3-common.xsd,v  <--  mathml3-common.xsd
new revision: 1.9; previous revision: 1.8
done
Checking in xsd/mathml3-content.xsd;
/w3ccvs/WWW/Math/Group/RelaxNG/mathml3/xsd/mathml3-content.xsd,v  <--  mathml3-content.xsd
new revision: 1.8; previous revision: 1.7
done
Checking in xsd/mathml3-presentation.xsd;
/w3ccvs/WWW/Math/Group/RelaxNG/mathml3/xsd/mathml3-presentation.xsd,v  <--  mathml3-presentation.xsd
new revision: 1.24; previous revision: 1.23
done
Checking in xsd/mathml3-strict-content.xsd;
/w3ccvs/WWW/Math/Group/RelaxNG/mathml3/xsd/mathml3-strict-content.xsd,v  <--  mathml3-strict-content.xsd
new revision: 1.7; previous revision: 1.6
done
Checking in xsd/mathml3.xsd;
/w3ccvs/WWW/Math/Group/RelaxNG/mathml3/xsd/mathml3.xsd,v  <--  mathml3.xsd
new revision: 1.3; previous revision: 1.2
done
