Re: Nutch 1.0 on the limits of the data

by Dennis Kubes-2

The simple answer is billions, perhaps tens to hundreds of billions of
records, because Nutch leverages Hadoop.  Yahoo is currently using
Hadoop to create its web index.  But as Otis pointed out, Hadoop is a
parallel processing framework, so the practical limit depends entirely
on the amount of hardware you have.
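
To make the hardware dependence concrete, here is a rough
back-of-envelope sketch in Python.  It only estimates storage, not
crawl or indexing time.  The function name nodes_needed and all the
figures (record count, record size, disk per node) are hypothetical
placeholders, not measurements from any real Nutch cluster.  The
replication factor of 3 is the HDFS default.

```python
import math

def nodes_needed(records, bytes_per_record, usable_bytes_per_node, replication=3):
    """Estimate the number of nodes needed to store `records` records.

    Total storage is multiplied by the replication factor, then divided
    by the usable disk per node and rounded up.
    """
    total_bytes = records * bytes_per_record * replication
    return math.ceil(total_bytes / usable_bytes_per_node)

# Hypothetical figures: 1 billion records, 10 KB each, 1 TB usable per node.
print(nodes_needed(10**9, 10 * 1024, 10**12))
```

Under these assumptions, doubling the record count roughly doubles the
number of nodes, which is the sense in which the limit is set by the
hardware rather than by the software.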

Dennis

Polsnet wrote:
> What is the largest amount of data that Nutch 1.0 can support? (File
> size or number of records)
