Parsing feeds with wrongly defined namespaces

Hi,

Some feeds don't declare all the namespaces they use (perhaps due to an error on their site). Rome should still try to parse those feeds and ignore the offending tags. For example, parsing the feed http://fr.techcrunch.com/2009/07/21/cest-lete-blog-au-ralenti/feed/ fails with the exception below, because the feed contains XML like this:

<title>Par : <fb:name linked="false" useyou="false" uid="630011441">Jonathan Fischer</fb:name></title>

com.sun.syndication.io.ParsingFeedException: Invalid XML: Error on line 113: The prefix "fb" for element "fb:name" is not bound.
    at com.sun.syndication.io.WireFeedInput.build(WireFeedInput.java:198)
    at com.sun.syndication.io.SyndFeedInput.build(SyndFeedInput.java:123)
    at Test101.main(Test101.java:133)
Caused by: org.jdom.input.JDOMParseException: Error on line 113: The prefix "fb" for element "fb:name" is not bound.
    at org.jdom.input.SAXBuilder.build(SAXBuilder.java:533)
    at org.jdom.input.SAXBuilder.build(SAXBuilder.java:946)
    at com.sun.syndication.io.WireFeedInput.build(WireFeedInput.java:194)
    ... 4 more
Caused by: org.xml.sax.SAXParseException: The prefix "fb" for element "fb:name" is not bound.
    at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
    at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at org.jdom.input.SAXBuilder.build(SAXBuilder.java:518)
    ... 6 more

Any ideas on what I can change in the SAXBuilder/Xerces setup code to ignore tags with undefined namespaces?

Thanks,
Thibaut
Re: Parsing feeds with wrongly defined namespaces

Wow, that feed is all kinds of a mess.

Looking at it, if you just need the basic Atom info, you could do:

namespaces=false
namespace-prefixes=true

For stuff like this, and especially *hehe* it being on TechCrunch, emailing the site admin and just telling them to fix it isn't a bad idea either.
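The two settings above read like the standard SAX2 feature flags `http://xml.org/sax/features/namespaces` and `http://xml.org/sax/features/namespace-prefixes`. As a rough sketch of their effect, the snippet below uses only the JDK's built-in SAX parser, not Rome or JDOM; the class `LenientNamespaceParse` and its helper `elementNames` are made-up names for illustration. With namespace processing switched off, the unbound `fb` prefix should no longer be a fatal error and the element is reported under its raw name `fb:name`.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of Rome or JDOM.
class LenientNamespaceParse {
    // Standard SAX2 feature URIs.
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";
    static final String NAMESPACE_PREFIXES = "http://xml.org/sax/features/namespace-prefixes";

    // Parses xml and returns the qualified names of all start tags in document order.
    // Throws IllegalArgumentException if the parser rejects the XML.
    static List<String> elementNames(String xml, boolean namespaces) {
        final List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                names.add(qName);
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(namespaces);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setFeature(NAMESPACES, namespaces);
            reader.setFeature(NAMESPACE_PREFIXES, true);
            reader.setContentHandler(handler);
            // DefaultHandler rethrows fatal errors, so an unbound prefix surfaces as an exception.
            reader.setErrorHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            throw new IllegalArgumentException("XML rejected by parser: " + e.getMessage(), e);
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
        return names;
    }
}
```

Calling `elementNames(xml, false)` on the `<title>` fragment from the first post should succeed, while `elementNames(xml, true)` should fail with the "not bound" error. For Rome itself, one possible route is to set the same two features on a JDOM `SAXBuilder` via `setFeature` and hand the resulting `Document` to `SyndFeedInput.build`; whether Rome's parsers then find the elements they expect is not verified here.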
On Tue, Aug 4, 2009 at 10:57 AM, Thibaut_ <tbritz@...> wrote:
--
:Robert "kebernet" Cooper
::kebernet@...
Alice's cleartext
Charlie is the attacker
Bob signs and encrypts
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x9E8759F8
Re: Parsing feeds with wrongly defined namespaces

Thanks,

I will try your suggestions out. It also happens sometimes on the main site, not just the French subsidiary. But it's better to handle the problem at the root (e.g. modifying Rome to add dummy namespace declarations), because other sites might do this as well.

Thibaut
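The "dummy namespace declarations" idea could be prototyped outside Rome as a text-level pre-processing step before the feed reaches the parser. The sketch below is only a heuristic under stated assumptions: `DummyNamespaceInjector`, `addMissingDeclarations`, and the placeholder URI scheme `urn:x-dummy:` are all invented for illustration, and the regexes ignore prefixed attributes, CDATA sections, comments, and namespace scoping.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Rome: binds undeclared element prefixes to placeholder URIs.
class DummyNamespaceInjector {
    // Prefix of a start tag such as <fb:name ...>; closing tags (</...) do not match.
    private static final Pattern PREFIXED_TAG =
            Pattern.compile("<([A-Za-z_][\\w.-]*):[A-Za-z_][\\w.-]*");
    // Any xmlns:prefix= declaration, anywhere in the document.
    private static final Pattern DECLARED =
            Pattern.compile("xmlns:([A-Za-z_][\\w.-]*)\\s*=");
    // First start tag; skips <?xml ...?>, <!DOCTYPE ...> and <!-- ... --> openers.
    private static final Pattern ROOT_START =
            Pattern.compile("<[A-Za-z_][\\w:.-]*");

    static String addMissingDeclarations(String xml) {
        // Collect every prefix used on an element name.
        Set<String> missing = new LinkedHashSet<>();
        Matcher used = PREFIXED_TAG.matcher(xml);
        while (used.find()) {
            missing.add(used.group(1));
        }
        // Drop the ones that are declared somewhere, plus the built-in "xml" prefix.
        Matcher declared = DECLARED.matcher(xml);
        while (declared.find()) {
            missing.remove(declared.group(1));
        }
        missing.remove("xml");

        Matcher root = ROOT_START.matcher(xml);
        if (missing.isEmpty() || !root.find()) {
            return xml;
        }
        // Append placeholder declarations right after the root element's name.
        StringBuilder decls = new StringBuilder();
        for (String prefix : missing) {
            decls.append(" xmlns:").append(prefix)
                 .append("=\"urn:x-dummy:").append(prefix).append('"');
        }
        return xml.substring(0, root.end()) + decls + xml.substring(root.end());
    }
}
```

The rewritten string can then be passed to the normal namespace-aware parsing path. A fix inside Rome would more likely hook into the parser setup than rewrite the text, so treat this only as a way to try the idea on a problem feed.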