what's the relationship between nutch, solr, lucene, and hadoop


what's the relationship between nutch, solr, lucene, and hadoop

by Xiao Yang


Hi, guys,

I'm quite confused about how these projects fit together.
It seems that Nutch contains Solr, Lucene, and Hadoop, but I'm not sure.
What role does each of them play in a search engine?
I know Lucene is for indexing and Hadoop is for storage. What about
Nutch and Solr?

Thanks!
Xiao

Re: what's the relationship between nutch, solr, lucene, and hadoop

by johan.sjoberg


Lucene is the indexing library.
Solr is the interface for using the Lucene index.
Hadoop is used to run tasks in a distributed fashion.
Nutch fetches content such as web pages (using Hadoop to distribute the
work) and can insert it into the Lucene index through Solr.
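
To make the division of roles concrete, here is a toy Python sketch of how the four pieces relate. It does not use any real Lucene, Solr, Nutch, or Hadoop API: every class, function, and URL below (ToyIndex, ToySearchServer, fetch, crawl, FAKE_WEB) is a made-up stand-in, and a local thread pool only loosely plays the part of Hadoop's distributed execution.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


class ToyIndex:
    """Stand-in for Lucene: an in-process inverted index library."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document ids

    def add(self, doc_id, text):
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, term):
        return sorted(self.postings.get(term.lower(), set()))


class ToySearchServer:
    """Stand-in for Solr: the interface other programs use to reach the index."""

    def __init__(self):
        self._index = ToyIndex()

    def update(self, docs):
        for doc in docs:
            self._index.add(doc["id"], doc["text"])

    def select(self, q):
        return self._index.search(q)


# Hypothetical "web": URL -> page text, so the sketch needs no network access.
FAKE_WEB = {
    "http://example.org/a": "Lucene is an index library",
    "http://example.org/b": "Hadoop distributes tasks",
    "http://example.org/c": "Solr serves the Lucene index",
}


def fetch(url):
    """Stand-in for one fetch task of the crawler."""
    return {"id": url, "text": FAKE_WEB[url]}


def crawl(urls, server, workers=2):
    """Stand-in for Nutch: fetch pages in parallel, then hand them to the server.

    The thread pool is only a local analogy for the role Hadoop plays.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        docs = list(pool.map(fetch, urls))
    server.update(docs)
    return len(docs)


if __name__ == "__main__":
    server = ToySearchServer()
    crawl(sorted(FAKE_WEB), server)
    print(server.select("lucene"))
```

The point of the layering is that the crawler never touches the index directly: it only talks to the server, and the server is the only component that calls into the index library.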

I hope this is clear and correct enough.

Regards Johan S.

