Nutch/Solr question

Hi,

I want to build a site search for a few websites (mine and friends'), but without access to their database data. So I will crawl them with Nutch, and then I have two options:
1. index the data into Solr
2. leave it in the Nutch index

I need help understanding the advantages and disadvantages of searching with Solr versus Nutch, because I don't know Solr and it's hard to get the big picture.

Each site is quite small, so Solr can hold it with no problems. In Solr I probably can't use faceted search, range queries, etc., because I don't have the necessary data in the schema?

In Nutch I can run one search server and use site:domain to limit results (like Google site search), or use multiple indexes (mentioned on the mailing list), but what about Solr?

Any input is highly appreciated.

Thanks,
Bartosz
Re: Nutch/Solr question

Hi,

I have the same problem: I am using Nutch but thinking about using it with Solr. I have configured Solr and am now trying to configure Nutch to work with it. Like you, I have no previous experience with Solr, so I used a bunch of tutorials. I run both XP and Ubuntu Linux on my system, and so far I have only configured Nutch/Solr for XP. I also run a server with Ubuntu, so I might want to configure Solr/Nutch for Ubuntu as well. I only crawl about 10 websites (almost like you) and intend to use the results as a search engine for friends and colleagues. Like you, I want to know what works better: just Nutch, or Nutch in combination with Solr.

These links really helped me out:
http://wiki.apache.org/nutch/GettingNutchRunningWithWindows
http://wiki.apache.org/nutch/GettingNutchRunningWithUbuntu
http://wiki.apache.org/nutch/RunningNutchAndSolr

We might be able to help each other out if you have more questions/suggestions.
Re: Nutch/Solr question

Solr is just a search and indexing server. It doesn't do crawling. Nutch does the crawling and page parsing, and can index into Lucene or into a Solr server.

Nutch is a biggish beast, and if you just need to index a site or even a small set of them, you may have an easier time with Droids.

Otis
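On the open question of how to get the equivalent of Nutch's site:domain restriction once the pages are in Solr: one common approach is a Solr filter query (the `fq` parameter) on a field that holds the site name. The sketch below only builds the request URL. It assumes the Solr schema used for the Nutch index has a field named `site` (check your own schema.xml), and the endpoint URL and the helper name `build_site_search_url` are hypothetical, chosen only for illustration.

```python
from urllib.parse import urlencode


def build_site_search_url(solr_select_url, terms, site=None, rows=10):
    """Build a Solr select URL, optionally restricted to one site.

    solr_select_url: URL of the Solr select handler (an assumption here).
    terms: the user's search terms, passed through as the q parameter.
    site: optional domain used to narrow the results.
    """
    params = [("q", terms), ("rows", str(rows)), ("wt", "json")]
    if site:
        # fq narrows the result set without changing relevance scoring.
        # The field name "site" is an assumption about the schema.
        params.append(("fq", 'site:"{}"'.format(site)))
    # urlencode percent-escapes the colon and quotes in the fq value.
    return solr_select_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_site_search_url(
        "http://localhost:8983/solr/select",  # hypothetical local endpoint
        "open source crawler",
        site="example.com",
    ))
```

With this kind of setup a single Solr index could serve all of the small sites, with each site's search box passing its own domain as `site`. The helper does not escape double quotes inside `site`, which is acceptable for plain domain names but would need handling for arbitrary input.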