Re: Plugins: when to perform web service requests, on fetch or on index?

by Stefan Dlugolinsky

Hello,

I don't know how v1.0 differs from v0.9, but in v0.9 I would make
those service requests at the indexing stage (extension point
IndexingFilter). There you already have the data prepared by the
previous stages (parsers, etc.), so you can use it in the requests.
It depends on what exactly you want, though: if the requests do not
need parsed data, you can make them earlier, at the parsing stage
(extension point Parser).
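
To make the difference between the two stages concrete, here is a rough
sketch in plain Java. It does not use the Nutch API: ParsedPage,
enrichAtIndexing, enrichAtParsing and the "service_data" field are all
hypothetical names, and the web service is replaced by a local function
so the example runs without a network. It only illustrates which data is
available to the request at each stage.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class StageSketch {

    /** Hypothetical stand-in for a page after parsing: its URL plus extracted fields. */
    static final class ParsedPage {
        final String url;
        final Map<String, String> fields;

        ParsedPage(String url, Map<String, String> fields) {
            this.url = url;
            this.fields = fields;
        }
    }

    /**
     * Indexing-stage variant (assumption: plays the role an IndexingFilter would).
     * The parsed fields already exist, so the request can be built from them.
     */
    static Map<String, String> enrichAtIndexing(ParsedPage page,
                                                Function<String, String> service) {
        Map<String, String> doc = new HashMap<>(page.fields);
        doc.put("url", page.url);
        String title = page.fields.get("title");
        if (title != null) {
            // The request uses data produced by the earlier parsing stage.
            doc.put("service_data", service.apply(title));
        }
        return doc;
    }

    /**
     * Parsing-stage variant (assumption: plays the role of a parse-time plugin).
     * Only the URL is used here, so no parsed data is required.
     */
    static Map<String, String> enrichAtParsing(String url,
                                               Function<String, String> service) {
        Map<String, String> fields = new HashMap<>();
        fields.put("service_data", service.apply(url));
        return fields;
    }

    public static void main(String[] args) {
        // Fake web service: a real plugin would issue an HTTP request here.
        Function<String, String> fakeService = input -> "looked-up:" + input;

        Map<String, String> parsed = new HashMap<>();
        parsed.put("title", "Nutch plugins");
        ParsedPage page = new ParsedPage("http://example.com/", parsed);

        System.out.println(enrichAtIndexing(page, fakeService).get("service_data"));
        System.out.println(enrichAtParsing(page.url, fakeService).get("service_data"));
    }
}
```

In a real plugin the lookup would be a network call, so whichever stage
you pick, it is probably worth caching or batching the requests.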

Here is something about core extension points:
http://wiki.apache.org/nutch/AboutPlugins

Steve

2009/6/18 caezar <caezaris@...>:

>
> Hi All,
>
> I'm writing several Nutch plugins, which will perform requests to some
> web services for pages being indexed and store the retrieved data in the
> index. The question is: at what stage of crawling is it better to perform
> these web service requests, on fetching or on indexing (in HtmlParseFilter
> or in IndexingFilter), in terms of performance, of course?
>
> Nutch version is 1.0, indexer is SolrIndexer.
>
> Thanks.
> --
> View this message in context: http://www.nabble.com/Plugins%3A-when-to-perform-web-service-requests%2C-on-fetch-or-on-index--tp24089858p24089858.html
> Sent from the Nutch - Dev mailing list archive at Nabble.com.
>
>