How to improve the build speed with saxon 6.X / docbook

by Sylvestre Ledru-3

Hello,

I am currently trying to improve the build time of the documentation of
a free scientific software package (Scilab).

There are almost 1800 XML files, each between 1 KB and 10 KB in size.
Before calling Saxon, some preprocessing is done (MathML => PNG through
JEuclid, etc.) and all the files are finally merged into a single
file [1]. This file is processed with chunk.xsl or javahelp.xsl from
docbook-xsl. Both take a long time (pretty much the same).

However, the build time is way too long: between 30 and 60 minutes on a
powerful computer, and hours on a slow CPU, especially on some small
architectures like s390 or armel. For example, the Debian build chains
kill the process because it takes more than 150 minutes
just to "load" the XML.

Therefore, I am trying to speed up the process.
I wonder if there are any tricks for this. Some people told me that
merging all the XML files is not necessary, but I haven't been able to
find out how to avoid it.

I was wondering if there is a better way to structure the XML document.

For now, the merged document is (mainly) structured as follows:
<book>

<part>
<title>title of the chapter 1</title>
<refentry>
Details about the function
[...]
</refentry>
<refentry>
Details about the function 2
[...]
</refentry>
</part>

<part>
<title>title of the chapter 2</title>
<refentry>
[...]
</refentry>
</part>
</book>

A few refentry elements are also stored inside <chapter> <section>.

There are quite a lot of links between the refentry elements (especially
from the "see also" sections).

Does anybody know how to improve this?

Note that the PDF or PS generation is very fast and based on the same
master xml file.

Many thanks,
Sylvestre
PS: I also sent this email to the Saxon mailing list, where I was told
that the problem is most probably due to DocBook and not Saxon.
[1]
http://www.scilab.org/team/sylvestre.ledru/master_en_US_help-processed.xml.gz



---------------------------------------------------------------------
To unsubscribe, e-mail: docbook-apps-unsubscribe@...
For additional commands, e-mail: docbook-apps-help@...


Re: How to improve the build speed with saxon 6.X / docbook

by DeanNelson

Sylvestre,
 
A couple of things may help. Make sure that your catalogs are working correctly and that entries are resolved locally rather than fetched over the net. Network lookups can slow things down greatly.

Also, have you thought about using xsltproc instead of Saxon? I maintain my DocBook tools so that I can use both Saxon and xsltproc, but Saxon is slower. xsltproc is included in most Linux distros, and there is a Windows package as well.
 
Which version of Saxon are you using and what version of Java?
 
Regards,
Dean Nelson
 
 
 

 

Re: How to improve the build speed with saxon 6.X / docbook

by Sylvestre Ledru-3

Hello,

Thanks for your quick answer.


On Thursday, 10 September 2009 at 06:59 -0700, DeanNelson wrote:
> Sylvestre,
>  
> A couple of things may help. Make sure that your catalogs are
> operating correctly and that the resolution of them are not going out
> to the net to resolve entries. This can slow things down greatly.
A silly question :) How can I check that?

> Also, have you thought about using XSLTPROC instead of Saxon. I
> maintain my Docbook tools so that I can use both Saxon and XSLTPROC
> but Saxon is slower. XSLTPROC is in most Linux distros and there is a
> Windows package also.
I already tried xsltproc, and the processing time is about the
same...
 
> Which version of Saxon are you using and what version of Java?
Saxon 6.5 and openjdk 6b16 (but I have the same issue with the Sun JDK).

Regards,
Sylvestre



RE: How to improve the build speed with saxon 6.X / docbook

by David Cramer (Tech Pubs)

 
> > A couple of things may help. Make sure that your catalogs are
> > operating correctly and that the resolution of them are not going out
> > to the net to resolve entries. This can slow things down greatly.
> A silly question :). How can I be sure of that ?

The way I usually realize when my catalogs aren't working is to build a
doc while offline :-)

> > Also, have you thought about using XSLTPROC instead of Saxon. I
> > maintain my Docbook tools so that I can use both Saxon and XSLTPROC
> > but Saxon is slower. XSLTPROC is in most Linux distros and there is a
> > Windows package also.
> I already checked with xsltproc and I have about the same
> time of processing...

That's been my experience too.

David



Re: How to improve the build speed with saxon 6.X / docbook

by DeanNelson

 
One way is to raise the "verbosity" setting in the CatalogManager.properties file:
 
verbosity=4
 
Nonzero values print debugging information to the screen. This helps a lot in checking whether every network identifier is being resolved to a local location.
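
As a rough illustration of the setting above, a CatalogManager.properties file for the Apache XML Commons resolver might be created like this. It is only a sketch: the config directory and the /etc/xml/catalog path are assumptions, not details given in this thread.

```shell
#!/bin/sh
# Sketch only: "config" and "/etc/xml/catalog" are hypothetical placeholders.
mkdir -p config
cat > config/CatalogManager.properties <<'EOF'
# Semicolon-separated list of catalog files to consult (assumed path)
catalogs=/etc/xml/catalog
# Prefer public identifiers over system identifiers when both are present
prefer=public
# 0 is silent; higher values print more resolver debugging output
verbosity=4
EOF

# Show the setting that was written
grep '^verbosity=' config/CatalogManager.properties
```

The resolver looks for this file on the Java classpath, so the config directory would have to be added to it. Saxon 6.5 also has to be told to use the resolver classes (through its -x, -y and -r options) before the catalog and the verbosity setting take effect.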
 
This all assumes that you are using a catalog ;-) If not, then everything is going out to the net. When you tried xsltproc, did you have the "--nonet" switch on the command line? I don't think Saxon has a similar switch, but it does have the CatalogManager.properties file, which helps with these kinds of issues.

It may be only one unresolved entry that slows things down, so you will have to look really closely at the output.
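
The "--nonet" check mentioned above could be run roughly as follows. This is a sketch under assumptions: the xslt-demo directory and the two tiny files are hypothetical stand-ins for the real master file and chunk.xsl, and the exact wording of the libxml2 messages may differ between versions.

```shell
#!/bin/sh
# Sketch only: tiny stand-in files replace the real master document
# and the DocBook chunk.xsl stylesheet.
workdir=xslt-demo
mkdir -p "$workdir"
cat > "$workdir/doc.xml" <<'EOF'
<book><title>Demo</title></book>
EOF
cat > "$workdir/style.xsl" <<'EOF'
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:value-of select="book/title"/>
  </xsl:template>
</xsl:stylesheet>
EOF

if command -v xsltproc >/dev/null 2>&1; then
  # --nonet refuses to fetch DTDs or entities over the network;
  # --timing reports how long parsing and transforming took (on stderr).
  xsltproc --nonet --timing -o "$workdir/out.txt" \
    "$workdir/style.xsl" "$workdir/doc.xml" 2> "$workdir/build.log"
  # Resources that would have needed the network show up as load errors.
  grep -iE 'network entity|failed to load' "$workdir/build.log" \
    || echo "no blocked or failed loads reported"
else
  echo "xsltproc not found; nothing was run"
fi
```

On the real build, the same two switches would go in front of chunk.xsl and the merged master file. If the grep finds nothing, that suggests no part of the build tried to reach the network.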
 
Also, I use the JEuclid FOP plugin to render my MathML equations during FOP generation. This saves a conversion step at the beginning.
 
Regards,
Dean Nelson
 
 

Re: How to improve the build speed with saxon 6.X / docbook

by Sylvestre Ledru-3

Bad luck, it doesn't seem to be related to the network...
I tried both, and the time is pretty much the same.

I am going to do the same as you for MathML.

Thanks again for your advice!
Sylvestre




Re: How to improve the build speed with saxon 6.X / docbook

by Dave Pawson


Have you tried compiling the stylesheets using the Saxon option?
It's not something I've done, nor something I've heard of being done
on this list, but it should definitely show an improvement.

regards

--
Dave Pawson
XSLT XSL-FO FAQ.
http://www.dpawson.co.uk



Re: How to improve the build speed with saxon 6.X / docbook

by Sylvestre Ledru-3

On Friday, 11 September 2009 at 06:13 +0100, Dave Pawson wrote:

> Have you tried compiling the stylesheets using the saxon option?
> Not something I've done, nor something I've heard being done on this
> list, but definitely should show an improvement
I am going to try. Do you have any pointers or documentation on this?

Thanks
Sylvestre





Re: How to improve the build speed with saxon 6.X / docbook

by Dave Pawson

On 14/09/09 10:13, Sylvestre Ledru wrote:

>> Have you tried compiling the stylesheets using the saxon option?
>> Not something I've done, nor something I've heard being done on this
>> list, but definitely should show an improvement
> I am going to try. Do you have any pointer/documentation on this ?
>
> Thanks
> Sylvestre

I think I owe you an apology. It seems that compiling a stylesheet is a
Saxon 9 option that is not available in Saxon 6.5.5, which is the
XSLT 1.0 engine needed for DocBook.

Can anyone confirm this?
I'm looking at
http://saxon.sourceforge.net/saxon6.5.5/using-xsl.html#Command-line
for the command-line options.

http://www.saxonica.com/documentation/using-xsl/compiling.html documents
using a pre-compiled stylesheet with XSLT 2.0.


regards

--
Dave Pawson
XSLT XSL-FO FAQ.
http://www.dpawson.co.uk
