Woodstox SAX parser, how to ignore DTD declarations?

Hello,

I need to parse some documents which may or may not have a DOCTYPE referring to a DTD. I'd like to know how to tell the Woodstox SAX parser to ignore these DTDs, because loading them makes my code unusable in production environments that have no internet access.

Note that I am using the SAX parser (not StAX).

    SAXParserFactory spf = new com.ctc.wstx.sax.WstxSAXParserFactory();
    XMLReader reader = spf.newSAXParser().getXMLReader();
    ....

Any help welcome!
|
|
Re: Woodstox SAX parser, how to ignore DTD declarations?

On Mon, Apr 27, 2009 at 3:56 AM, stefcl <stefatwork@...> wrote:
> Hello,
>
> I need to parse some documents which may or may not have a DOCTYPE referring
> to a DTD.

Yes, a fairly common use case.

> I'd like to know how I could tell woodstox SAX parser to ignore these DTD's
> because it makes my code unusable in production environments which don't
> have access to internet.
>
> Remember, I use a SAX parser (not Stax).
>
> SAXParserFactory spf = new com.ctc.wstx.sax.WstxSAXParserFactory();
> XMLReader reader = spf.newSAXParser().getXMLReader();
> ....
>
> Any help welcome!

One way is to use the usual SAX approach: call setEntityResolver() on the SAX parser, with a dummy implementation. That might be best, since then there is nothing Woodstox-specific about it.

Let me know if that won't work -- there should be a way to get hold of the Woodstox-specific pieces, to essentially use Stax property configuration. But I need to check the API and/or sources to remember the details. :-)

-+ Tatu +-

---------------------------------------------------------------------
To unsubscribe from this list, please visit:

    http://xircles.codehaus.org/manage_email
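
For reference, the dummy entity resolver approach can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the JDK's default JAXP SAX parser rather than Woodstox, and the class name DummyResolverExample, the helper rootName and the example.invalid URL are made up for the example. The resolver answers every external entity request with an empty input, so the parser never has to fetch the DTD.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class DummyResolverExample {

    /** Returns the root element's local name without fetching any external DTD. */
    static String rootName(String xml) {
        try {
            // Assumption: JDK default parser; the thread uses WstxSAXParserFactory instead.
            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(true);
            XMLReader reader = spf.newSAXParser().getXMLReader();

            // Dummy resolver: every external entity (including the DTD) resolves to empty content.
            reader.setEntityResolver(
                    (publicId, systemId) -> new InputSource(new StringReader("")));

            final String[] root = new String[1];
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName,
                                         Attributes attributes) {
                    if (root[0] == null) {
                        root[0] = localName; // remember only the first (root) element
                    }
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
            return root[0];
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical document whose DTD URL cannot be reached.
        String xml = "<!DOCTYPE root SYSTEM \"http://example.invalid/missing.dtd\"><root/>";
        System.out.println(rootName(xml));
    }
}
```

Whether a given Woodstox release honours the resolver when used through its SAX wrapper is a separate question; the sketch only shows the generic SAX technique.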
|
|
Re: Woodstox SAX parser, how to ignore DTD declarations?

First of all, thank you very much for your support...

I tried the following:

    SAXParserFactory spf = new com.ctc.wstx.sax.WstxSAXParserFactory();
    SAXParser parser = spf.newSAXParser();
    XMLReader reader = parser.getXMLReader();
    reader.setEntityResolver(new EntityResolver() {
        public InputSource resolveEntity(String publicId, String systemId)
                throws SAXException, IOException {
            throw new UnsupportedOperationException("Not supported yet.");
        }
    });

But resolveEntity is never called; it looks like the entity resolver is completely ignored...

For a different use case, I wrote a piece of code that quickly reads the root element attributes of a given file using the StAX API. There, turning off DTD support is all I need to avoid the UnknownHostException I get when a DOCTYPE declaration is found.

I have looked into the code of the SAX parser, and it seems to use a StAX implementation internally, but unfortunately you have no control over its properties. If only I could set its DTD support to false, that would solve everything.

You might wonder why I insist on using the SAX API instead of StAX, which seems a better choice in most scenarios. The reason is that I use the piece of code above as a SAXSource for the JAXB unmarshaller. The JAXB API is not flexible at all when it comes to missing namespace/prefix declarations, and the SAX API is the only one that lets me override an element's URI quite easily using a filter...

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes)
            throws SAXException {
        if (localName.equals(rootElementName)) {
            // cheat namespace uri...
            super.startElement(desiredNamespaceUri, localName, qName, attributes);
        } else {
            // do things normally...
            super.startElement(uri, localName, qName, attributes);
        }
    }
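
The StAX code mentioned above is not shown in the post, so the following is only a guess at its general shape: a minimal sketch that reads the root element's attributes with DTD support switched off through the standard XMLInputFactory.SUPPORT_DTD property. The class name RootAttributesExample, the helper rootAttributes and the example.invalid URL are hypothetical, and the sketch runs against whichever StAX implementation the JDK provides rather than Woodstox specifically.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

class RootAttributesExample {

    /** Returns the attributes of the root element, stopping at the first start tag. */
    static Map<String, String> rootAttributes(String xml) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Standard StAX properties: do not process DTDs or external entities.
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
            factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);

            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                Map<String, String> attributes = new LinkedHashMap<>();
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        for (int i = 0; i < reader.getAttributeCount(); i++) {
                            attributes.put(reader.getAttributeLocalName(i),
                                           reader.getAttributeValue(i));
                        }
                        break; // the first start tag is the root element
                    }
                }
                return attributes;
            } finally {
                reader.close();
            }
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical document whose DTD URL cannot be reached.
        String xml = "<!DOCTYPE root SYSTEM \"http://example.invalid/missing.dtd\">"
                   + "<root id=\"42\" lang=\"en\"/>";
        System.out.println(rootAttributes(xml));
    }
}
```

With DTD support off, entity references that depend on the DTD can no longer be expanded, which is acceptable when only the root element's attributes are needed.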
|
|
Re: Woodstox SAX parser, how to ignore DTD declarations?

On Tue, Apr 28, 2009 at 12:09 AM, stefcl <stefatwork@...> wrote:
> First of all, thank you very much for your support...
> I tried the following :
>
> SAXParserFactory spf = new com.ctc.wstx.sax.WstxSAXParserFactory();
> SAXParser parser = spf.newSAXParser();
> XMLReader reader = parser.getXMLReader();
> reader.setEntityResolver( new EntityResolver() {
>     public InputSource resolveEntity(String publicId, String systemId)
>             throws SAXException, IOException
>     {
>         throw new UnsupportedOperationException("Not supported yet.");
>     }
> });
>
> But resolveEntity is never called, it looks like the entity resolver is
> completely ignored...

Ok. :-/

Could you file a bug at Jira (http://jira.codehaus.org/browse/WSTX) for this? It definitely should work, but apparently has not been properly tested.

> For a different use case, I wrote a piece of code which allows me to quickly
> read the root element attributes of a given file using the Stax Api, turning
> off DTD support is all I need to avoid the UnknowHostException I'm getting
> when a doctype tag is found.

Ok.

> I have looked into the code of the SaxParser and it seems like it uses
> Stax implementation internally but unfortunately you have no control on its
> properties, if only I could set its DTD support to false, that would solve
> everything.

Right -- let me have a look at that when I get time; there definitely should be a way to access the properties, since there's plenty of configurability that would be useful. WstxSAXParserFactory has a package-access reference to mStaxFactory, so you could sub-class it... hmmh. Actually, that really needs to be changed to 'protected' to allow that. I'll look into how to improve this as well, along with solving the entity resolver problem.

> You might wonder why I insist on using the sax API instead of Stax which
> seems a better choice in most scenarios, well the reason is that I use the
> piece of code above as a SaxSource for the JaxB unmarshaller. The jaxB Api
> is not flexible at all when it comes to missing namespace/prefix
> declarations, the Sax Api is the only one which allows me to override the
> elements URI quite easily using a filter...

Makes sense. Sometimes SAX is useful, both for legacy reasons and for deep pipelining cases. And given how easy it is to add SAX event generation on top of Stax, it's not a big deal to support both APIs. Actually, I think "namespace correction" (or work-around) is a good example of the kind of pipelining for which SAX works quite nicely.

-+ Tatu +-
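
To make the "namespace correction" pipeline concrete, here is a small sketch built on the standard org.xml.sax.helpers.XMLFilterImpl. It is an assumption about how such a filter could be completed, not the poster's actual class: the names RootNamespaceFilter and observedRootUri and the urn:example:orders namespace are invented, the matching endElement override is an addition so that start and end events stay consistent, and the JDK default SAX parser stands in for Woodstox.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/** Rewrites the namespace URI reported for one element name; everything else passes through. */
class RootNamespaceFilter extends XMLFilterImpl {
    private final String rootElementName;
    private final String desiredNamespaceUri;

    RootNamespaceFilter(XMLReader parent, String rootElementName, String desiredNamespaceUri) {
        super(parent);
        this.rootElementName = rootElementName;
        this.desiredNamespaceUri = desiredNamespaceUri;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes)
            throws SAXException {
        String effectiveUri = localName.equals(rootElementName) ? desiredNamespaceUri : uri;
        super.startElement(effectiveUri, localName, qName, attributes);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        // Mirror the start event so downstream handlers see a matching pair.
        String effectiveUri = localName.equals(rootElementName) ? desiredNamespaceUri : uri;
        super.endElement(effectiveUri, localName, qName);
    }

    /** Demo helper: returns the namespace URI a downstream handler sees for the first element. */
    static String observedRootUri(String xml, String rootName, String namespaceUri) {
        try {
            SAXParserFactory spf = SAXParserFactory.newInstance(); // assumption: JDK parser
            spf.setNamespaceAware(true);
            RootNamespaceFilter filter = new RootNamespaceFilter(
                    spf.newSAXParser().getXMLReader(), rootName, namespaceUri);
            final String[] seen = new String[1];
            filter.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName,
                                         Attributes attributes) {
                    if (seen[0] == null) {
                        seen[0] = uri;
                    }
                }
            });
            filter.parse(new InputSource(new StringReader(xml)));
            return seen[0];
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }
}
```

Because the filter is itself an XMLReader, it can be wrapped in a javax.xml.transform.sax.SAXSource together with an InputSource and handed to a JAXB Unmarshaller, which is the arrangement described in the quoted message.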
|
|
Re: Woodstox SAX parser, how to ignore DTD declarations? |
|
|
Re: Woodstox SAX parser, how to ignore DTD declarations?

Thanks. I will look into it.

-+ Tatu +-

On Wed, Apr 29, 2009 at 1:19 AM, stefcl <stefatwork@...> wrote:
> It's done :
> http://jira.codehaus.org/browse/WSTX-204
>
> --
> View this message in context: http://www.nabble.com/Woodstox-SAX-parser%2C-how-to-ignore-DTD-declarations--tp23253867p23292712.html
> Sent from the woodstox - user mailing list archive at Nabble.com.