RSS autodetect
Before I look into ROME as a possible solution: does the feed source have to be a .rss web page, or does it autodetect based on the URL given? I need something that can autodetect the RSS feed!
|
|
|
Re: RSS autodetect
Hi,
I don't fully understand your question. ROME doesn't expect any special URL scheme, so the feed you're reading can just as well end in .html or .feed or whatever; as long as it's a (more or less) valid feed (RSS/RDF/Atom), you should be able to read it with ROME. ROME doesn't care about the URL given, only about the content delivered from that URL.
Greetings,
Martin
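To illustrate the point that the content, not the URL, decides what kind of feed you have, here is a minimal sketch using only the JDK's StAX parser. It is not ROME's actual detection logic; the class name `FeedSniffer` and the returned labels are made up for this example, and it only looks at the root element of well-formed XML.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of ROME: guesses the feed family from the
// document's root element instead of from the URL or file extension.
public class FeedSniffer {

    /** Returns "rss", "atom", "rdf" or "unknown" for the given response body. */
    public static String sniff(String content) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Do not process DTDs when sniffing untrusted content.
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(content));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        // The first start element is the root element.
                        String root = reader.getLocalName();
                        if ("rss".equals(root)) return "rss";   // RSS 0.9x / 2.0
                        if ("feed".equals(root)) return "atom"; // Atom
                        if ("RDF".equals(root)) return "rdf";   // RSS 1.0 (rdf:RDF)
                        return "unknown";
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            // Not well-formed XML (e.g. tag-soup HTML): treat as "not a feed".
        }
        return "unknown";
    }

    public static void main(String[] args) {
        String atom = "<?xml version=\"1.0\"?>"
                + "<feed xmlns=\"http://www.w3.org/2005/Atom\"><title>t</title></feed>";
        System.out.println(sniff(atom));
    }
}
```

The same check gives the same answer whether the body came from a URL ending in .rss, .html or anything else, which is the behaviour described above.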
|
|
Re: RSS autodetect
ROME doesn't autodetect RSS feed URLs. Jakarta's feedparser had code to do it, but I haven't been able to find feedparser's code anywhere.
On Fri, Jul 3, 2009 at 1:39 PM, richiebabes <rich.g.morgan@...> wrote:
-- Joseph B. Ottinger http://enigmastation.com |
|
|
RE: RSS autodetect
Here’s some quick & dirty code to do this. You’ll need TagSoup (the code also uses Commons HttpClient 3.x):
import java.io.BufferedInputStream;
import java.io.InputStream;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpStatus;
import org.apache.commons.httpclient.methods.GetMethod;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// LOG is assumed to be a logger field declared in the enclosing class.
public static String findRssFeedOnWebpage(String url, int timeoutMillis) {
    String rssFeedUrl = null;
    GetMethod getMethod = null;
    try {
        // Make a request to the web page, looking for
        // <link rel="alternate" type="application/rss+xml" href="wherever web site is" />
        getMethod = new GetMethod(url);
        getMethod.setRequestHeader("Accept",
                "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
        getMethod.setRequestHeader("Accept-Charset", "ISO-8859-1,utf-8;q=0.7,*;q=0.7");
        HttpClient httpClient = new HttpClient();
        httpClient.getHttpConnectionManager().getParams().setConnectionTimeout(timeoutMillis);
        httpClient.getHttpConnectionManager().getParams().setSoTimeout(timeoutMillis);
        // Only parse the body of a successful (200) response
        if (httpClient.executeMethod(getMethod) == HttpStatus.SC_OK) {
            final SAXException STOP_PARSING =
                    new SAXException("Found rss feed, terminating parsing");
            final String[] hrefBox = new String[1];
            DefaultHandler rssFinder = new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String name,
                        Attributes attributes) throws SAXException {
                    if ("link".equals(localName)
                            && ("application/rss+xml".equals(attributes.getValue("type"))
                                || "application/atom+xml".equals(attributes.getValue("type")))
                            && "alternate".equals(attributes.getValue("rel"))) {
                        hrefBox[0] = attributes.getValue("href");
                        throw STOP_PARSING;
                    }
                }

                // tag has to be in head, so once we're done looking in head stop processing
                @Override
                public void endElement(String uri, String localName, String name)
                        throws SAXException {
                    if ("head".equals(localName)) {
                        throw STOP_PARSING;
                    }
                }
            };
            // Use TagSoup's JAXP factory so real-world (malformed) HTML can be parsed
            SAXParser sp = SAXParserFactory.newInstance(
                    "org.ccil.cowan.tagsoup.jaxp.SAXFactoryImpl",
                    Thread.currentThread().getContextClassLoader()).newSAXParser();
            InputStream is = new BufferedInputStream(getMethod.getResponseBodyAsStream());
            try {
                sp.parse(is, rssFinder);
            } catch (SAXException e) {
                if (e != STOP_PARSING) {
                    // STOP_PARSING is just an optimisation, nothing to worry about
                    throw e;
                }
            } finally {
                is.close();
            }
            if (hrefBox[0] != null) {
                rssFeedUrl = hrefBox[0];
                LOG.debug("Found RSS element in " + url + " of " + rssFeedUrl);
            } else {
                LOG.debug("No RSS element found in " + url);
            }
        }
    } catch (Exception e) {
        LOG.warn("Error when trying to derive RSS feed from " + url, e);
    } finally {
        // Return the connection to the connection manager
        if (getMethod != null) {
            getMethod.releaseConnection();
        }
    }
    return rssFeedUrl;
}
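One thing the code above does not handle: the href in an autodiscovery link is often relative (for example "/feed.xml"), so the returned value may need resolving against the page URL before you hand it to a feed reader. A minimal sketch of that step, using only java.net.URI; the class and method names are made up for this example and are not part of ROME or TagSoup:

```java
import java.net.URI;

// Hypothetical helper: turns the href found on a page into an absolute URL.
public class FeedUrlResolver {

    /**
     * Resolves href against the URL of the page it was found on.
     * Absolute hrefs are returned unchanged.
     */
    public static String resolve(String pageUrl, String href) {
        URI base = URI.create(pageUrl);
        // A base such as "http://example.com" has an empty path; give it "/"
        // first so relative hrefs are not glued directly onto the host name.
        if (base.getPath() == null || base.getPath().isEmpty()) {
            base = base.resolve("/");
        }
        return base.resolve(href.trim()).toString();
    }

    public static void main(String[] args) {
        System.out.println(resolve("http://example.com/blog/index.html", "/feed.xml"));
    }
}
```

Note that URI.create throws IllegalArgumentException on a malformed URL or href, so real code would want to catch that and treat it as "no feed found".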