Setting the User-Agent field for RSS GET requests

by Chris Croome

Hi

Google News blocks wget and LWP from making RSS requests -- try getting
this feed in a browser and with wget:

  http://news.google.com/news?q=sheffield&ie=UTF-8&output=rss

The User-Agent field can be set for LWP::Simple [1] and I guess having an
env var for this would make sense. It could be set to something
sensible by default, e.g. "MKDoc 1.6 RSS fetcher", and then it would also
be easy to change if this doesn't work for some sites...
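
A minimal sketch of the env var idea, assuming a hypothetical variable
name MKDOC_RSS_USER_AGENT (not an existing MKDoc setting) and a
hypothetical helper rss_user_agent(); the default string is the one
suggested above:

```perl
use strict;
use warnings;

# Hypothetical helper: return the User-Agent string for RSS fetches.
# MKDOC_RSS_USER_AGENT is an assumed name, not an existing MKDoc variable.
sub rss_user_agent {
    my $default = 'MKDoc 1.6 RSS fetcher';
    my $from_env = $ENV{MKDOC_RSS_USER_AGENT};

    # Fall back to the default when the variable is unset or empty.
    return $default unless defined $from_env && length $from_env;
    return $from_env;
}
```

The cron scripts could then call $ua->agent(rss_user_agent()) instead of
hard-coding a string, so a site that rejects the default can be handled
by changing the environment rather than the code.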

For now, adding a line like this to tools/cron/031..rss_routine.pl and
tools/cron/030..rss_troubleshooter.pl does the trick:

  $ua->agent('Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)');

Chris

[1] http://search.cpan.org/~gaas/libwww-perl-5.805/lib/LWP/UserAgent.pm

--
Chris Croome                               <chris@...>
web design                             http://www.webarchitects.co.uk/ 
web content management                               http://mkdoc.com/   
_______________________________________________
MKDoc-dev mailing list
MKDoc-dev@...
https://lists.webarch.co.uk/mailman/listinfo/mkdoc-dev