
Re: Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.

by beyiwork


Can anyone help? I'm so disappointed.

On Fri, Jul 10, 2009 at 4:29 PM, lei wang <nutchmaillist@...> wrote:

> Yes, I am also running into this problem. Can anyone help?
>
>
> On Sun, Jul 5, 2009 at 11:33 PM, xiao yang <yangxiao9901@...> wrote:
>
>> I often get this error message while crawling the intranet.
>> Is it a network problem? What can I do about it?
>>
>> $ bin/nutch crawl urls -dir crawl -depth 3 -topN 4
>>
>> crawl started in: crawl
>> rootUrlDir = urls
>> threads = 10
>> depth = 3
>> topN = 4
>> Injector: starting
>> Injector: crawlDb: crawl/crawldb
>> Injector: urlDir: urls
>> Injector: Converting injected urls to crawl db entries.
>> Injector: Merging injected urls into crawl db.
>> Injector: done
>> Generator: Selecting best-scoring urls due for fetch.
>> Generator: starting
>> Generator: segment: crawl/segments/20090705212324
>> Generator: filtering: true
>> Generator: topN: 4
>> Generator: Partitioning selected urls by host, for politeness.
>> Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
>> Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
>> Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
>> Exception in thread "main" java.io.IOException: Job failed!
>>    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
>>    at org.apache.nutch.crawl.Generator.generate(Generator.java:524)
>>    at org.apache.nutch.crawl.Generator.generate(Generator.java:409)
>>    at org.apache.nutch.crawl.Crawl.main(Crawl.java:116)
>>
>
>
