LexicalHandler for DocumentBuilder?

I want to be able to resolve unknown entities while using DocumentBuilder.parse. I see SAX has a LexicalHandler for this purpose, and I'd assume that there's some way to tell DOM to pass a handler into the underlying SAX parser that builds the DOM, but I haven't found it.

Could someone point me in the right direction?

Thanks.

---------------------------------------------------------------------
To unsubscribe, e-mail: j-users-unsubscribe@...
For additional commands, e-mail: j-users-help@...
|
|
Re: LexicalHandler for DocumentBuilder?

Hi,
|
|
Re: LexicalHandler for DocumentBuilder?

Thanks for the response. If I understand you correctly, I should do this:

    XMLReader r = XMLReaderFactory.createXMLReader();
    MyXMLFilter filter = new MyXMLFilter();
    filter.setParent(r);

    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db = dbf.newDocumentBuilder();
    Document d = db.newDocument();

    TransformerFactory tf = TransformerFactory.newInstance();
    Transformer t = tf.newTransformer();

    t.transform(new SAXSource(filter, new InputSource(srcHtml)), new DOMResult(d));

and I should expect the

    skippedEntity(String entity)

method of my filter to get called when a &nbsp; is found.

However, this method is never called for me. Am I missing something?

Michael Glavassevich wrote:
> Hi,
>
> You're making assumptions about the implementation which aren't
> required and certainly aren't true for Xerces. There is no underlying
> SAX parser. The DOM is built from XNI events. You cannot plug SAX
> handlers into it.
>
> If you want to build a SAX filter for replacing skipped entities [1]
> and then build a DOM from that, you could try using the Transformer
> API instead (i.e. javax.xml.transform.Transformer.transform(SAXSource,
> DOMResult)) where the SAXSource contains your XMLFilter [2] which does
> this resolution.
>
> Thanks.
>
> [1] http://xerces.apache.org/xerces2-j/javadocs/api/org/xml/sax/ContentHandler.html#skippedEntity(java.lang.String)
> [2] http://xerces.apache.org/xerces2-j/javadocs/api/org/xml/sax/XMLFilter.html
>
> Michael Glavassevich
> XML Parser Development
> IBM Toronto Lab
> E-mail: mrglavas@...
|
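For readers following the thread, here is a minimal sketch of the filter-plus-Transformer approach described in the reply quoted above. The class name EntityReplacingFilter, the replacementFor helper and the toDom method are hypothetical, the replacement table covers only nbsp for illustration, and the reader is obtained from SAXParserFactory rather than XMLReaderFactory as in the post. Keep in mind that skippedEntity() is only delivered when the parser actually skips a reference; a reference that violates a well-formedness constraint is reported as a fatal error instead, so this filter never sees it.

```java
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.sax.SAXSource;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: turns skipped entities it knows about into character data.
class EntityReplacingFilter extends XMLFilterImpl {

    EntityReplacingFilter() {
        super();
    }

    EntityReplacingFilter(XMLReader parent) {
        super(parent);
    }

    // Illustrative lookup only (an assumption); a real table would cover
    // whatever entities the input documents are expected to use.
    static String replacementFor(String entityName) {
        return "nbsp".equals(entityName) ? "\u00A0" : null;
    }

    @Override
    public void skippedEntity(String name) throws SAXException {
        String text = replacementFor(name);
        if (text == null) {
            // Unknown entity: forward the event downstream unchanged.
            super.skippedEntity(name);
        } else {
            // Known entity: emit its replacement as ordinary character data.
            char[] chars = text.toCharArray();
            super.characters(chars, 0, chars.length);
        }
    }

    // Builds a DOM by running the filtered SAX stream through an identity transform.
    static Node toDom(InputSource input) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader reader = spf.newSAXParser().getXMLReader();
        EntityReplacingFilter filter = new EntityReplacingFilter(reader);
        DOMResult result = new DOMResult();
        TransformerFactory.newInstance().newTransformer()
                .transform(new SAXSource(filter, input), result);
        return result.getNode();
    }
}
```

Passing the parent reader to the constructor is equivalent to calling setParent() as in the post. Whether a given parser skips an undeclared reference or rejects it depends on the document's DTD situation, so the filter is best tested separately from the parser.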
|
Re: LexicalHandler for DocumentBuilder?

"dbrosius@..." <dbrosius@...> wrote on 11/08/2009 08:21:21 PM:
|
|
Re: LexicalHandler for DocumentBuilder?

Your assumptions are correct. The nbsp entity is not in the DTD. I get

    javax.xml.transform.TransformerException: org.xml.sax.SAXParseException: The entity "nbsp" was referenced, but not declared.
        at org.apache.xalan.transformer.TransformerIdentityImpl.transform(TransformerIdentityImpl.java:502)

I am using Xerces 2.9.1 and Xalan 2.7.1.

When I do

    XMLReader r = XMLReaderFactory.createXMLReader();
    URL u = r.getClass().getClassLoader().getResource(r.getClass().getName().replace('.', '/') + ".class");
    System.out.println(u);

I get

    jar:file:/home/dave/dev/tomailer/war/WEB-INF/lib/xercesImpl.jar!/org/apache/xerces/parsers/SAXParser.class

which is the jar I'm expecting to get the parser from.

The &nbsp; is in PCDATA.

Do I need to specify that the parser is validating in order to get this to work, perhaps?

Thanks again.
dave

Michael Glavassevich wrote:
> "dbrosius@..." <dbrosius@...> wrote on 11/08/2009 08:21:21 PM:
>
> > Thanks for the response. If i understand you correctly, i should do this:
> >
> >     XMLReader r = XMLReaderFactory.createXMLReader();
> >     MyXMLFilter filter = new MyXMLFilter();
> >     filter.setParent(r);
> >
> >     DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
> >     DocumentBuilder db = dbf.newDocumentBuilder();
> >     Document d = db.newDocument();
> >
> >     TransformerFactory tf = TransformerFactory.newInstance();
> >     Transformer t = tf.newTransformer();
> >
> >     t.transform(new SAXSource(filter, new InputSource(srcHtml)), new DOMResult(d));
>
> Looks about right.
>
> > and i should expect the
> >
> >     skippedEntity(String entity)
> >
> > method of my filter to get called when a &nbsp; is found.
> >
> > However this method is never called for me. Am I missing something?
>
> Is "nbsp" declared in your DTD? You said your goal was to "resolve
> unknown entities" so I'm assuming it's not declared.
>
> Is this entity reference part of an attribute value? If it is you're
> out of luck with SAX. skippedEntity() as well as the entity methods on
> LexicalHandler are never called for attributes.
>
> Also can you double check that you're actually using a recent release
> of Xerces-J and not its JDK fork or NekoHTML or something else.
>
> Thanks.
>
> Michael Glavassevich
> XML Parser Development
> IBM Toronto Lab
> E-mail: mrglavas@...
|
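The "referenced, but not declared" error above can be reproduced in isolation. The sketch below is one way to do it, assuming a document with no DOCTYPE at all, which is the simplest case in which XML 1.0 treats an undeclared entity reference as a well-formedness violation. The class and method names are hypothetical, and it uses DocumentBuilder directly rather than the Transformer pipeline from the post.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for checking how a parser reacts to a given document.
class UndeclaredEntityDemo {

    // Returns the parser's fatal-error message, or null if the XML parsed cleanly.
    static String fatalErrorFor(String xml) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors and ignores warnings and
        // recoverable errors, so nothing is printed by a default handler.
        db.setErrorHandler(new DefaultHandler());
        try {
            db.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            return String.valueOf(e.getMessage());
        }
    }

    public static void main(String[] args) throws Exception {
        // No DTD, so &nbsp; is undeclared and the parse fails.
        System.out.println(fatalErrorFor("<p>a&nbsp;b</p>"));
        // A numeric character reference needs no declaration.
        System.out.println(fatalErrorFor("<p>a&#160;b</p>"));
    }
}
```

Because the failure is a fatal error raised by the parser itself, no SAX filter placed downstream gets a chance to handle the reference.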
|
Re: LexicalHandler for DocumentBuilder?

Dave,
|
|
Re: LexicalHandler for DocumentBuilder?

Thanks for the help... I realize now that I didn't understand what was going on. The documents in question had DOCTYPEs for transitional etc., but I was getting 503s because w3 rejects requests for these DTDs. So I am now caching them locally and using a DOM EntityResolver to provide them directly, and now nbsp is defined.

But I learned some new stuff with this email exchange, and I appreciate the support.

Thanks.
dave

Michael Glavassevich wrote:
> Dave,
>
> What I had said only applies when the undeclared entity is not
> violating [1] a well-formedness constraint [2]. Otherwise you'll get a
> fatal error.
>
> Thanks.
>
> [1] http://www.w3.org/TR/2006/REC-xml-20060816/#vc-entdeclared
> [2] http://www.w3.org/TR/2006/REC-xml-20060816/#wf-entdeclared
>
> Michael Glavassevich
> XML Parser Development
> IBM Toronto Lab
> E-mail: mrglavas@...
|
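The fix described above, serving locally cached DTDs through an EntityResolver, might look roughly like the sketch below. It is not the poster's code: the class name, the parse method and the in-memory Map keyed by system ID are assumptions standing in for whatever local cache is actually used, and the example.invalid URL and one-line DTD in main are placeholders.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical example: resolve DTD requests from a local cache instead of the network.
class LocalDtdDemo {

    // dtdCache maps a DOCTYPE system ID to the text of a locally stored DTD (an assumption).
    static Document parse(String xml, Map<String, String> dtdCache) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        EntityResolver resolver = (publicId, systemId) -> {
            String dtd = dtdCache.get(systemId);
            if (dtd == null) {
                return null; // not cached: the parser falls back to its default resolution
            }
            InputSource source = new InputSource(new StringReader(dtd));
            source.setPublicId(publicId);
            source.setSystemId(systemId);
            return source;
        };
        db.setEntityResolver(resolver);
        return db.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> cache = new HashMap<>();
        // Placeholder DTD that declares only nbsp.
        cache.put("http://example.invalid/local.dtd", "<!ENTITY nbsp \"&#160;\">");
        Document d = parse(
                "<!DOCTYPE p SYSTEM \"http://example.invalid/local.dtd\"><p>a&nbsp;b</p>",
                cache);
        // With nbsp declared, the reference expands to U+00A0 in the DOM text.
        System.out.println(d.getDocumentElement().getTextContent().length());
    }
}
```

Returning null for uncached IDs keeps the parser's default behavior; a stricter variant could throw instead, so that no request ever reaches the remote server.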
|
Re: LexicalHandler for DocumentBuilder?

Dave Brosius <dbrosius@...> wrote on 11/10/2009 08:30:37 AM: