--xml and different node and sectioning structures

by Patrice Dumas

Hello,

When there is a @node associated with every sectioning @-command, the
--xml output simply nests every section in the corresponding <node>.
However, when there are no @node commands, the nesting of elements follows
the sectioning, like

<chapter>
  ....
 <section>
  ...
 </section>
</chapter>

When there is a mix, typically with more sectioning @-commands than @node
commands, and even more so if the @node structure doesn't follow the
sectioning @-command structure, the nesting may not be easy to sort out. I
attach an example of such a case, on which C makeinfo fails with --xml. This
is of course a test case and not a real-life manual, but I think that it
demonstrates issues that may arise in real-life manuals.

Currently the texi2html design is to completely ignore the tree structure
of sectioning @-commands, and have no nesting of the corresponding
xml, even if there are no nodes.

What is your opinion on the desirable output with --xml?

--
Pat


Attachment: more_sections_than_nodes.texi (579 bytes)

Re: --xml and different node and sectioning structures

by Karl Berry

    Currently the texi2html design is to completely ignore the tree structure
    of sectioning @-commands, and have no nesting of the corresponding
    xml, even if there are no nodes.

Sounds sensible to me.

Torsten, are you there?  I think your texi2latex is based on the
makeinfo --xml output.  Do you have any opinion about this?

Thanks,
Karl




Re: --xml and different node and sectioning structures

by Torsten Bronger

Hallöchen!

Karl Berry writes:

>     Currently the texi2html design is to completely ignore the tree structure
>     of sectioning @-commands, and have no nesting of the corresponding
>     xml, even if there are no nodes.
>
> Sounds sensible to me.
>
> Torsten, are you there?  I think your texi2latex is based on the
> makeinfo --xml output.  Do you have any opinion about this?

texi2latex doesn't care about the nesting.

By the way, DocBook output needs the nesting and works smoothly.

Tschö,
Torsten.

--
Torsten Bronger, aquisgrana, europa vetus
                   Jabber ID: torsten.bronger@...
                                  or http://bronger-jmp.appspot.com




Re: --xml and different node and sectioning structures

by Patrice Dumas

On Tue, Aug 11, 2009 at 03:58:51AM +0200, Torsten Bronger wrote:
>
> texi2latex doesn't care about the nesting.
>
> By the way, DocBook output needs the nesting and works smoothly.

Indeed, but it completely ignores the node structure.

--
Pat



Re: --xml and different node and sectioning structures

by Torsten Bronger

Hallöchen!

Patrice Dumas writes:

> On Tue, Aug 11, 2009 at 03:58:51AM +0200, Torsten Bronger wrote:
>
>> texi2latex doesn't care about the nesting.
>>
>> By the way, DocBook output needs the nesting and works smoothly.
>
> Indeed, but it completely ignores the node structure.

Because it doesn't need it.  But it does need the nesting, so I
think it's sensible to have nesting in the Texinfo XML.  Otherwise,
XML-based converters with nesting target formats will have to
reconstruct it.  It's much easier to flatten a structure than vice
versa.
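
The asymmetry can be illustrated with a minimal Python sketch. It assumes sections are given as (level, title) pairs with 1 for a chapter and 2 for a section; the names `flatten` and `nest` are hypothetical and nothing here is taken from makeinfo, texi2html or texi2latex. Flattening is a plain tree walk, while rebuilding the nesting needs a stack of open elements and a policy for what to do when levels are skipped.

```python
# Hypothetical sketch: not makeinfo or texi2html code.
# A nested section is a (title, children) pair; a flat one is (level, title).

def flatten(tree, level=1):
    """Nested -> flat: a plain depth-first walk."""
    flat = []
    for title, children in tree:
        flat.append((level, title))
        flat.extend(flatten(children, level + 1))
    return flat


def nest(flat):
    """Flat -> nested: needs a stack of the currently open sections.

    Assumes levels start at 1. A section whose level skips a step is
    attached to the nearest open section with a lower level.
    """
    root = []
    stack = [(0, root)]  # (level, children list of the open element)
    for level, title in flat:
        # Close every open element that cannot be a parent of this one.
        while stack[-1][0] >= level:
            stack.pop()
        children = []
        stack[-1][1].append((title, children))
        stack.append((level, children))
    return root


nested = [("Chap1", [("Sec1.1", []), ("Sec1.2", [])]), ("Chap2", [])]
flat = flatten(nested)
print(flat)
print(nest(flat) == nested)
```

A converter that only needs the flat form can discard the nesting in a few lines, whereas every converter that needs the nesting would otherwise have to carry its own version of `nest`.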

I think Texinfo XML's nesting should reflect sectioning only, and
nodes should be just empty elements, as the optional first child of
a section.  If I remember correctly, bad things (warnings/errors)
happen anyway if the node and sectioning structures of a document
diverge too much.  Unfortunately, the format allows this.

Oh well.

Visiting Texinfo after a long time, I wonder whether there is
motivation and spare time to overhaul Texinfo, accepting *small*
incompatibilities?

As with Python 3 recently, one could create a converter to the new
syntax.  Makeinfo could even detect an old version and run the
converter internally (emitting a warning about the waste of time).

Some things that I find very important:

(1) GIVE UP the tight binding to TeX!!  Consider TeX a mere backend
    to which Texinfo must be converted.

(2) UTF-8 and localisation stuff.  Almost impossible without (1).

(3) Floats and graphics sorted out for all backends.  Very difficult
    without (1).

(4) Syntax improvements.  Many commands have gathered too many
    arguments, there are too many cross reference commands, and the
    conditional processing should be put in question.

(5) Other things that I can't recollect because it's been too long
    ago.

I have the impression that further work on Texinfo has become a
rather unthankful task.  Maybe the Gordian knot must be cut (and
this is the connection to TeX ;-)?

Apart from the work of an overhaul and the necessity to provide
automatic conversion, generating Info files still must be fast,
given the number of make files that make them.  The other backends,
however, can be programmed comfortably in various languages.  Doing
text-crunching in C is awful.

Tschö,
Torsten.





Re: --xml and different node and sectioning structures

by Patrice Dumas

On Tue, Aug 11, 2009 at 10:47:10AM +0200, Torsten Bronger wrote:

> Hallöchen!
>
> Because it doesn't need it.  But it does need the nesting, so I
> think it's sensible to have nesting in the Texinfo XML.  Otherwise,
> XML-based converters with nesting target formats will have to
> reconstruct it.  It's much easier to flatten a structure than vice
> versa.
>
> I think Texinfo XML's nesting should reflect sectioning only, and
> nodes should be just empty elements, as the optional first child of
> a section.  

That could be an easy way out. The problem is that the opposite is more
consistent with the Texinfo way: a section is within a node. But another
easy way out would be to have nodes as stand-alone elements, not as
first children of sections; they would just be output where they appear.
Karl, does it look good to you?

> If I remember correctly, bad things (warnings/errors)
> happen anyway if the node and sectioning structures of a document
> diverge too much.  Unfortunately, the format allows this.

They have to be consistent, but this doesn't allow an easy nesting.
Unless I missed something, the document I attached at the start of the
thread is valid texinfo and can be processed as info, docbook, html
and tex. It gives no error message but something that looks like an
internal error when processed by makeinfo --xml.

> Visiting Texinfo after a long time, I wonder whether there is
> motivation and spare time to overhaul Texinfo, accepting *small*
> incompatibilities?

This is more or less happening at the software level, since
texi2html (written in perl) will replace makeinfo in C soon. But
it should be fully compatible (except that @macro would become
like @rmacro) at the input format level.

> As with Python 3 recently, one could create a converter to the new
> syntax.  Makeinfo could even detect an old version and run the
> converter internally (emitting a warning about the waste of time).
>
> Some things that I find very important:
>
> (1) GIVE UP the tight binding to TeX!!  Consider TeX a mere backend
>     to which Texinfo must be converted.

It seems to me to be already the case (at least for me). I think that
there is Texinfo as a language which is distinct from the software
implementations that convert it to other formats. Binding Texinfo
to Info is, in my opinion, also an error, since Info has some limitations
that should not weigh too heavily when defining the format (for example,
no comma in node names). Sure, this could lead to manuals that are not
readable in Info, and the converter should say so, but that is different
from an issue in the Texinfo format itself.

The major advantage of switching to texi2html is that texi2html separates
Texinfo parsing from output formatting better than the C makeinfo does,
which allows a cleaner separation between the language and the output
backends (and makes it easier to write new backends).

> (2) UTF-8 and localisation stuff.  Almost impossible without (1).

Basic UTF-8 support is already done. Better localisation support (with a
gettext-like framework) is the last thing that has to be done before
the perl implementation can replace the C implementation. There is still
work to do on the UTF-8 side, but using what recent perl provides should be
enough.
 
> (3) Floats and graphics sorted out for all backends.  Very difficult
>     without (1).

@float support could be much improved in the docbook output, but in
the other outputs, I can't really see how to do better.
 
> (4) Syntax improvements.  Many commands have gathered too many
>     arguments, there are too many cross reference commands, and the
>     conditional processing should be put in question.

What do you mean by 'there are too many cross reference commands'?
Do you mean that xref, pxref and ref should be only one command?

Which commands have too many arguments?

And what problems do you see with the conditional processing?



I personally think that there are problems with the commands terminated
by an end of line, since it is not easy to get proper nesting with them,
mostly @item in @(v|f)?table and @center. Also I think that removing spaces
after @anchor{} is inconsistent because it is a command with braces.
It also seems to me that the whole @def* family is dubious because it has
a different syntax from other @-commands regarding argument separation.
Overall, using only one way to give arguments to @-commands (and
certainly arguments separated by commas) would be better, in my opinion.

Karl is unhappy with the @macro stuff, too (and maybe with conditionals?)

> I have the impression that further work on Texinfo has become a
> rather unthankful task.  Maybe the Gordian knot must be cut (and
> this is the connection to TeX ;-)?
>
> Apart from the work of an overhaul and the necessity to provide
> automatic conversion, generating Info files still must be fast,
> given the number of make files that make them.  The other backends,
> however, can be programmed comfortably in various languages.  Doing
> text-crunching in C is awful.

With the switch to the perl implementation, generating Info won't be fast
anymore (texi2html is awfully slow), but writing other backends should
be fairly easy. makeinfo in C is still there anyway for those who want
something fast.

--
Pat



Re: --xml and different node and sectioning structures

by Torsten Bronger

Hallöchen!

Patrice Dumas writes:

> On Tue, Aug 11, 2009 at 10:47:10AM +0200, Torsten Bronger wrote:
>
>> [...]
>>
>> I think Texinfo XML's nesting should reflect sectioning only, and
>> nodes should be just empty elements, as the optional first child
>> of a section.
>
> That could be an easy way out. The problem is that the opposite is
> more consistent with the texinfo way, a section is within a
> node. But another easy way out would be to have nodes as
> stand-alone elements, but not first children of sections, they
> should just be output where they appear.

This way is probably better.  Anyway, nodes are structural only for
Info.  For almost all other use cases, they are label providers,
nothing more.

> [...]
>
>> Visiting Texinfo after a long time, I wonder whether there is
>> motivation and spare time to overhaul Texinfo, accepting *small*
>> incompatibilities?
>
> This is more or less happening at the software level, since
> texi2html (written in perl) will replace makeinfo in C soon.

Sounds good.

> But it should be fully compatible (except that @macro would become
> like @rmacro) at the input format level.

The incompatibility thing is not as important as the other things
(e.g. UTF-8).

> [...]
>  
>> (4) Syntax improvements.  Many commands have gathered too many
>>     arguments, there are too many cross reference commands, and the
>>     conditional processing should be put in question.
>
> What do you mean by 'there are too many cross reference
> commands'?  Do you mean that xref, pxref and ref should be only
> one command?

Yes.  They have very complicated semantics anyway, with inserted
words and punctuation.  Besides, it is impossible to guarantee
grammatically and orthographically correct output.  LaTeX's ref and
pageref model is superior and easier to handle.

> Which commands have too many arguments?

xref.  The docs even contain different sections for the different
numbers of arguments!

> and what problems do you see with the conditional processing?

I think it is not necessary.  In rare cases where it makes sense
(images come to my mind), one can make them an explicit part of the
syntax.  By the way, what is allowed in macros should be clearly
defined (and restricted to what is generally parsable).

Tschö,
Torsten.





Re: --xml and different node and sectioning structures

by Karl Berry

     another
    easy way out would be to have nodes as stand-alone elements, but not
    first children of sections, they should just be output where they appear.
    Karl, does it look good to you?

Yes, sounds fine.  Nodes and sections are theoretically independent.

    Karl is unhappy with the @macro stuff, too (and maybe with conditionals?)

I'm unhappy with lots of things about Texinfo design (including the
above), but there are an awful lot of manuals written using it.  We
can't just arbitrarily declare feature X to be dead, no matter how ugly
it is.

For example, we could make yet another reference command, but it's not
an easy thing to do.  We already have @ref, which seems about as minimal
as it can be.  Anything related only to page numbers can't be used in
Info output.

The node name restrictions are perhaps the worst user-level problem
IMHO, but the Info readers have to be fixed, and I just don't have the
time or energy to make that happen.

karl



Re: --xml and different node and sectioning structures

by Patrice Dumas

On Tue, Aug 11, 2009 at 01:29:11PM -0500, Karl Berry wrote:
>      another
>     easy way out would be to have nodes as stand-alone elements, but not
>     first children of sections, they should just be output where they appear.
>     Karl, does it look good to you?
>
> Yes, sounds fine.  Nodes and sections are theoretically independent.

So currently, the output for something along the lines of

  @chapter Chap1

  @node chap2
  @chapter Chap2


is:
  <chapter>
  <title>Chap1</title>

  <node>
  <nodename>chap2</nodename>
  </node>
  </chapter>
  <chapter>
  <title>Chap2</title>
  </chapter>

Which means that the node `chap2' is within the <chapter> element
titled `Chap1'. Is it right?

> For example, we could make yet another reference command, but it's not
> an easy thing to do.  We already have @ref, which seems about as minimal
> as it can be.  Anything related only to page numbers can't be used in
> Info output.

I don't know if this is the right thing to do, but if we want to simplify
things, we can drop @xref and @pxref from the documentation and say that
there are 3 ways to call @ref:

* with one argument, it is a reference to a node or an anchor within the
  document,
* with arguments 1 and 4, it is a reference to an Info manual,
* with arguments 3 and 5, it is a reference to a book,

and it is possible to combine a reference to a book and to an Info manual.
I'm not saying that it is the right thing to do, just how it could be
simplified.
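
The three cases above can be written down as a small Python sketch. It is only an illustration of the proposal, not how any Texinfo tool treats @ref; the function name `classify_ref` and the returned labels are hypothetical. It also assumes that any call with argument 1 but without the 1+4 or 3+5 pairs counts as a reference within the document, which goes slightly beyond the "one argument" case stated above.

```python
# Hypothetical sketch of the proposal: not makeinfo or texi2html code.

def classify_ref(args):
    """Return the kinds of reference implied by which @ref arguments are given.

    `args` holds up to five strings; empty or blank strings count as absent.
    """
    padded = (list(args) + [""] * 5)[:5]
    given = {i + 1 for i, arg in enumerate(padded) if arg.strip()}
    kinds = set()
    if {1, 4} <= given:
        kinds.add("info manual")
    if {3, 5} <= given:
        kinds.add("book")
    # Assumption: with no external target, argument 1 names a node or an
    # anchor in the current document.
    if not kinds and 1 in given:
        kinds.add("internal")
    return kinds


print(classify_ref(["chap2"]))
print(classify_ref(["Top", "", "", "emacs"]))
print(classify_ref(["Top", "", "Overview", "emacs", "The Manual"]))
```

The last call shows the combined case, where both an Info manual and a book are referenced.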

--
Pat



Re: --xml and different node and sectioning structures

by Karl Berry

    Which means that the node `chap2' is within the <chapter> with title
    `Chap1' element. Is it right?

Clearly it's wrong.
Can you close the previous sectioning element when you see the @node?
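
A minimal Python sketch of that idea, assuming the input has already been reduced to a list of (command, argument) pairs and handling only @node and @chapter as in the example; the function name `emit` is hypothetical, no XML escaping is done, and this is not how makeinfo or texi2html is implemented.

```python
# Hypothetical sketch: not makeinfo or texi2html code.
# Only @node and @chapter are handled; arguments are not XML-escaped.

def emit(commands):
    """Emit XML-like lines, closing an open <chapter> before any <node>."""
    out = []
    chapter_open = False
    for cmd, arg in commands:
        # Both a new chapter and a node end the chapter that is still open.
        if chapter_open and cmd in ("node", "chapter"):
            out.append("</chapter>")
            chapter_open = False
        if cmd == "node":
            out.append("<node>")
            out.append(f"<nodename>{arg}</nodename>")
            out.append("</node>")
        elif cmd == "chapter":
            out.append("<chapter>")
            out.append(f"<title>{arg}</title>")
            chapter_open = True
    if chapter_open:
        out.append("</chapter>")
    return out


lines = emit([("chapter", "Chap1"), ("node", "chap2"), ("chapter", "Chap2")])
print("\n".join(lines))
```

With this rule the `chap2` node is emitted between the two <chapter> elements instead of inside the first one.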

    we can drop @xref and @pxref from the documentation and say

Thanks for the idea, but I don't think that is the right way to go.