[Nutch Wiki] Update of "IntranetRecrawl" by susam

by Apache Wiki

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The following page has been changed by susam:
http://wiki.apache.org/nutch/IntranetRecrawl

The comment on the change is:
Link to crawl script for Nutch 1.0

------------------------------------------------------------------------------
  }}}
 
  == Version 1.0 ==
- A crawl script that runs properly with bash and has been tested with Nutch 1.0 can be found here: Crawl
+ A crawl script that runs properly with bash and has been tested with Nutch 1.0 can be found here: Self:Crawl. The script can perform both an initial crawl and a recrawl. However, it has seen little real-world recrawl use, so it may need some tweaking if it does not suit your needs.