|
View:
New views
10 Messages
—
Rating Filter:
Alert me
|
|
|
(output) encoding support in doxia-sink-apiHi folks,
For the next 1.0-beta-1 version, 2 methods have been added to SinkFactory interface to improve encoding support: - Sink createSink( File outputDir, String outputName, String encoding ) - Sink createSink( Writer writer ) See [1] for the full interface. I worked with Vincent to implement output encoding in Doxia, and we faced problems that lead us think that forcing a fixed encoding was the right approach to have something simple and reliable: with UTF-8 as this fixed encoding, this didn't limit end-users from any country in the world. But now, I'm convinced it's not the right approach and API: 1. some formats need to output the encoding (like HTML, or XML): we need an encoding parameter, as we can't get it from a Writer instance 2. some formats embed images, like RTF or PDF, then need direct stream access to write binary data Then I think "Sink createSink( Writer writer )" should be removed. Or if we want an API without filename, this method could be transformed into Sink createSink( OutputStream output ) + Sink createSink( OutputStream output, String encoding ). Any objection or idea? Hervé [1] http://svn.apache.org/viewvc/maven/doxia/doxia/trunk/doxia-sink-api/src/main/java/org/apache/maven/doxia/sink/SinkFactory.java?view=markup |
|
|
Re: (output) encoding support in doxia-sink-apiHervé BOUTEMY wrote:
> For the next 1.0-beta-1 version, 2 methods have been added to SinkFactory > interface to improve encoding support: > - Sink createSink( File outputDir, String outputName, String encoding ) Sounds good. This would enable the caller to configure the desired output encoding for text-based sinks like XHTML. > - Sink createSink( Writer writer ) > [...] > 2. some formats embed images, like RTF or PDF, then need direct stream access > to write binary data > > Then I think "Sink createSink( Writer writer )" should be removed. Makes sense, too. One can setup a writer on top of a stream but the reverse is not possible. So it's sensible to not limit the API to writers and support binary-based sinks. It might however be convenient to create an AbstractTextSinkFactory from which all/most text-based sinks could inherit. For instance, XhtmlSinkFactory and XdocSinkFactory look pretty much the same. In more detail, how about public abstract class AbstractTextSinkFactory { /** * @param writer The writer for the sink output, never <code>null</code>. * @param encoding The character encoding used by the writer. */ protected abstract Sink createSink( Writer writer, String encoding ) throws IOException; public Sink createSink( File outputDir, String outputName ) { return createSink( outputDir, outputName, "UTF-8" ); } public Sink createSink( File outputDir, String outputName, String encoding ) { // check args, create out dir, yadayada Writer writer = WriterFactory.newWriter( new File( outputDir, outputName ), encoding ); return createSink( writer, encoding ); } } Based on this, subclasses could be trimmed down to public class XHtmlSinkFactory { protected abstract Sink createSink( Writer writer, String encoding ) { return new XhtmlSink( writer, encoding ); } } The encoding parameter passed in here is apparently only informative. It would merely allow the sinks to determine the encoding, e.g. for the XML declaration. > Or if we want an API without filename, this method could be transformed into > Sink createSink( OutputStream output ) + Sink createSink( OutputStream > output, String encoding ). Unless we actually have need for this (testing with in-memory streams?), I would keep the interface slim to ease its implementation. Another question might be: Does the relation 1 sink -> 1 output file/stream always hold or are there formats that might need to output multiple files? In the later case, createSink( OutputStream ) would be troublesome. Benjamin |
|
|
Re: (output) encoding support in doxia-sink-apiLe vendredi 07 novembre 2008, Benjamin Bentmann a écrit :
> It might however be convenient to create an AbstractTextSinkFactory from > which all/most text-based sinks could inherit. Seems a good idea. This would ensure too that every text-based sinks have the same default encoding: it gives some consistency. Is it too limitating? I don't think. > > Or if we want an API without filename, this method could be transformed > > into Sink createSink( OutputStream output ) + Sink createSink( > > OutputStream output, String encoding ). > > Unless we actually have need for this (testing with in-memory streams?), if we want to generate and zip or transfer on-the-fly, for example. It's only an general idea: I don't really have any plans. I was just guessing that if a method with a Writer was added, some had something in mind. > I would keep the interface slim to ease its implementation. ok with that > Another question might be: Does the relation > 1 sink -> 1 output file/stream > always hold or are there formats that might need to output multiple > files? In the later case, createSink( OutputStream ) would be troublesome. AFAIK, we always have 1 sink -> 1 file/stream but I don't have sufficient experience with Doxia use-cases to affirm that. > > > Benjamin thanks for your comments Hervé |
|
|
Re: (output) encoding support in doxia-sink-apiHi,
2008/11/7 Hervé BOUTEMY <herve.boutemy@...>: > Hi folks, > > For the next 1.0-beta-1 version, 2 methods have been added to SinkFactory > interface to improve encoding support: > - Sink createSink( File outputDir, String outputName, String encoding ) > - Sink createSink( Writer writer ) > See [1] for the full interface. +1 > > I worked with Vincent to implement output encoding in Doxia, and we faced > problems that lead us think that forcing a fixed encoding was the right > approach to have something simple and reliable: with UTF-8 as this fixed > encoding, this didn't limit end-users from any country in the world. > > But now, I'm convinced it's not the right approach and API: > 1. some formats need to output the encoding (like HTML, or XML): we need an > encoding parameter, as we can't get it from a Writer instance see also DOXIA-185 > 2. some formats embed images, like RTF or PDF, then need direct stream access > to write binary data > > Then I think "Sink createSink( Writer writer )" should be removed. I prefer deprecated for backward compatibility issue. > Or if we want an API without filename, this method could be transformed into > Sink createSink( OutputStream output ) + Sink createSink( OutputStream > output, String encoding ). +1 Thanks Hervé for that. Vincent > Any objection or idea? > > Hervé > > [1] > http://svn.apache.org/viewvc/maven/doxia/doxia/trunk/doxia-sink-api/src/main/java/org/apache/maven/doxia/sink/SinkFactory.java?view=markup > |
|
|
Re: (output) encoding support in doxia-sink-apiHi Benjamin
2008/11/7 Benjamin Bentmann <benjamin.bentmann@...>: [SNIP] > It might however be convenient to create an AbstractTextSinkFactory from > which all/most text-based sinks could inherit. For instance, > XhtmlSinkFactory and XdocSinkFactory look pretty much the same. In more > detail, how about +1 too Vincent |
|
|
Re: (output) encoding support in doxia-sink-apiVincent Siveton wrote:
>> Then I think "Sink createSink( Writer writer )" should be removed. > > I prefer deprecated for backward compatibility issue. The source code says "@since 1.0-beta-1", i.e. this method was never part of a released Doxia version, right? Then we should be able to safely removed it. Having both "@since 1.0-beta-1" and "@deprecated since 1.0-beta-1" on a method would be quite strange ;-) Benjamin |
|
|
Re: (output) encoding support in doxia-sink-apiOther comments which maybe related to this thread:
* Parser needs at least 2 parsing: one for macro and another one for processing, and now a third one to validate XML (DOXIA-263) * Sink uses some time StringWriter to play with the writed content and to create a valid content (DOXIA-177) So, maybe it will more easy to have directly String in input/output instead of stream. Cheers, Vincent 2008/11/7 Vincent Siveton <vincent.siveton@...>: > Hi, > > 2008/11/7 Hervé BOUTEMY <herve.boutemy@...>: >> Hi folks, >> >> For the next 1.0-beta-1 version, 2 methods have been added to SinkFactory >> interface to improve encoding support: >> - Sink createSink( File outputDir, String outputName, String encoding ) >> - Sink createSink( Writer writer ) >> See [1] for the full interface. > > +1 > >> >> I worked with Vincent to implement output encoding in Doxia, and we faced >> problems that lead us think that forcing a fixed encoding was the right >> approach to have something simple and reliable: with UTF-8 as this fixed >> encoding, this didn't limit end-users from any country in the world. >> >> But now, I'm convinced it's not the right approach and API: >> 1. some formats need to output the encoding (like HTML, or XML): we need an >> encoding parameter, as we can't get it from a Writer instance > > see also DOXIA-185 > >> 2. some formats embed images, like RTF or PDF, then need direct stream access >> to write binary data >> >> Then I think "Sink createSink( Writer writer )" should be removed. > > I prefer deprecated for backward compatibility issue. > >> Or if we want an API without filename, this method could be transformed into >> Sink createSink( OutputStream output ) + Sink createSink( OutputStream >> output, String encoding ). > > +1 > > Thanks Hervé for that. > > Vincent > >> Any objection or idea? >> >> Hervé >> >> [1] >> http://svn.apache.org/viewvc/maven/doxia/doxia/trunk/doxia-sink-api/src/main/java/org/apache/maven/doxia/sink/SinkFactory.java?view=markup >> > |
|
|
Re: (output) encoding support in doxia-sink-api> > Then I think "Sink createSink( Writer writer )" should be removed.
> > I prefer deprecated for backward compatibility issue. this method was added in 1.0-beta-1-SNAPSHOT: 1.0-beta-1 hasn't been released yet. Is there really any backward compatibility issue here? |
|
|
Re: (output) encoding support in doxia-sink-api2008/11/7 Benjamin Bentmann <benjamin.bentmann@...>:
> Vincent Siveton wrote: > >>> Then I think "Sink createSink( Writer writer )" should be removed. >> >> I prefer deprecated for backward compatibility issue. > > The source code says "@since 1.0-beta-1", i.e. this method was never part of > a released Doxia version, right? Then we should be able to safely removed > it. Having both "@since 1.0-beta-1" and "@deprecated since 1.0-beta-1" on a > method would be quite strange ;-) mmh seems logical :) Vincent > > Benjamin > |
|
|
Re: (output) encoding support in doxia-sink-api> Another question might be: Does the relation
> 1 sink -> 1 output file/stream > always hold or are there formats that might need to output multiple files? > In the later case, createSink( OutputStream ) would be troublesome. The relation should persist. Cheers, Vincent |
| Free embeddable forum powered by Nabble | Forum Help |