|
View:
New views
20 Messages
—
Rating Filter:
Alert me
|
|
|
RE: CPU utilization and query time high on Solr slave when snapshot installHi Solr Gurus,
We have solr in 1 master, 2 slave configuration. Snapshot is created post commit, post optimization. We have autocommit after 50 documents or 5 minutes. Snapshot puller runs as a cron every 10 minutes. What we have observed is that whenever snapshot is installed on the slave, we see solrj client used to query slave solr, gets timedout and there is high CPU usage/load avg. on slave server. If we stop snapshot puller, then slaves work with no issues. The system has been running since 2 months and this issue has started to occur only now when load on website is increasing. Following are some details: Solr Details: apache-solr Version: 1.3.0 Lucene - 2.4-dev Master/Slave configurations: Master: - for indexing data HTTPRequests are made on Solr server. - autocommit feature is enabled for 50 docs and 5 minutes - caching params are disable for this server - mergeFactor of 10 is set - we were running optimize script after every 2 hours, but now have reduced the duration to twice a day but issue still persists Slave1/Slave2: - standard requestHandler is being used - default values of caching are set Machine Specifications: Master: - 4GB RAM - 1GB JVM Heap memory is allocated to Solr Slave1/Slave2: - 4GB RAM - 2GB JVM Heap memory is allocated to Solr Master and Slave1 (solr1)are on single box and Slave2(solr2) on different box. We use HAProxy to load balance query requests between 2 slaves. Master is only used for indexing. Please let us know if somebody has ever faced similar kind of issue or has some insight into it as we guys are literally struck at the moment with a very unstable production environment. As a workaround, we have started running optimize on master every 7 minutes. This seems to have reduced the severity of the problem but still issue occurs every 2days now. please suggest what could be the root cause of this. Thanks, Bipul |
|
|
Re: CPU utilization and query time high on Solr slave when snapshot installIf you are going to pull a new index every 10 minutes, try turning off
cache autowarming. Your caches are never more than 10 minutes old, so spending a minute warming each new cache is a waste of CPU. Autowarm submits queries to the new Searcher before putting it in service. This will create a burst of query load on the new Searcher, often keeping one CPU pretty busy for several seconds. In solrconfig.xml, set autowarmCount to 0. Also, if you want the slaves to always have an optimized index, create the snapshot only in post-optimize. If you create snapshots in both post-commit and post-optimize, you are creating a non-optimized index (post-commit), then replacing it with an optimized one a few minutes later. A slave might get a non-optimized index one time, then an optimized one the next. wunder On Nov 2, 2009, at 1:45 AM, bikumar@... wrote: > Hi Solr Gurus, > > We have solr in 1 master, 2 slave configuration. Snapshot is created > post commit, post optimization. We have autocommit after 50 > documents or 5 minutes. Snapshot puller runs as a cron every 10 > minutes. What we have observed is that whenever snapshot is > installed on the slave, we see solrj client used to query slave > solr, gets timedout and there is high CPU usage/load avg. on slave > server. If we stop snapshot puller, then slaves work with no issues. > The system has been running since 2 months and this issue has > started to occur only now when load on website is increasing. > > Following are some details: > > Solr Details: > apache-solr Version: 1.3.0 > Lucene - 2.4-dev > > Master/Slave configurations: > > Master: > - for indexing data HTTPRequests are made on Solr server. > - autocommit feature is enabled for 50 docs and 5 minutes > - caching params are disable for this server > - mergeFactor of 10 is set > - we were running optimize script after every 2 hours, but now have > reduced the duration to twice a day but issue still persists > > Slave1/Slave2: > - standard requestHandler is being used > - default values of caching are set > Machine Specifications: > > Master: > - 4GB RAM > - 1GB JVM Heap memory is allocated to Solr > > Slave1/Slave2: > - 4GB RAM > - 2GB JVM Heap memory is allocated to Solr > > Master and Slave1 (solr1)are on single box and Slave2(solr2) on > different box. We use HAProxy to load balance query requests between > 2 slaves. Master is only used for indexing. > Please let us know if somebody has ever faced similar kind of issue > or has some insight into it as we guys are literally struck at the > moment with a very unstable production environment. > > As a workaround, we have started running optimize on master every 7 > minutes. This seems to have reduced the severity of the problem but > still issue occurs every 2days now. please suggest what could be the > root cause of this. > > Thanks, > Bipul > > > > |
|
|
Re: CPU utilization and query time high on Solr slave when snapshot installHmm...I think you have to setup warming queries yourself and that
autowarm just copies entries from the old cache to the new cache, rather than issuing queries - the value is how many entries it will copy. Though that's still going to take CPU and time. - Mark http://www.lucidimagination.com (mobile) On Nov 2, 2009, at 12:47 PM, Walter Underwood <wunder@...> wrote: > If you are going to pull a new index every 10 minutes, try turning > off cache autowarming. > > Your caches are never more than 10 minutes old, so spending a minute > warming each new cache is a waste of CPU. Autowarm submits queries > to the new Searcher before putting it in service. This will create a > burst of query load on the new Searcher, often keeping one CPU > pretty busy for several seconds. > > In solrconfig.xml, set autowarmCount to 0. > > Also, if you want the slaves to always have an optimized index, > create the snapshot only in post-optimize. If you create snapshots > in both post-commit and post-optimize, you are creating a non- > optimized index (post-commit), then replacing it with an optimized > one a few minutes later. A slave might get a non-optimized index one > time, then an optimized one the next. > > wunder > > On Nov 2, 2009, at 1:45 AM, bikumar@... wrote: > >> Hi Solr Gurus, >> >> We have solr in 1 master, 2 slave configuration. Snapshot is >> created post commit, post optimization. We have autocommit after 50 >> documents or 5 minutes. Snapshot puller runs as a cron every 10 >> minutes. What we have observed is that whenever snapshot is >> installed on the slave, we see solrj client used to query slave >> solr, gets timedout and there is high CPU usage/load avg. on slave >> server. If we stop snapshot puller, then slaves work with no >> issues. The system has been running since 2 months and this issue >> has started to occur only now when load on website is increasing. >> >> Following are some details: >> >> Solr Details: >> apache-solr Version: 1.3.0 >> Lucene - 2.4-dev >> >> Master/Slave configurations: >> >> Master: >> - for indexing data HTTPRequests are made on Solr server. >> - autocommit feature is enabled for 50 docs and 5 minutes >> - caching params are disable for this server >> - mergeFactor of 10 is set >> - we were running optimize script after every 2 hours, but now have >> reduced the duration to twice a day but issue still persists >> >> Slave1/Slave2: >> - standard requestHandler is being used >> - default values of caching are set >> Machine Specifications: >> >> Master: >> - 4GB RAM >> - 1GB JVM Heap memory is allocated to Solr >> >> Slave1/Slave2: >> - 4GB RAM >> - 2GB JVM Heap memory is allocated to Solr >> >> Master and Slave1 (solr1)are on single box and Slave2(solr2) on >> different box. We use HAProxy to load balance query requests >> between 2 slaves. Master is only used for indexing. >> Please let us know if somebody has ever faced similar kind of issue >> or has some insight into it as we guys are literally struck at the >> moment with a very unstable production environment. >> >> As a workaround, we have started running optimize on master every 7 >> minutes. This seems to have reduced the severity of the problem but >> still issue occurs every 2days now. please suggest what could be >> the root cause of this. >> >> Thanks, >> Bipul >> >> >> >> > |
|
|
Re: CPU utilization and query time high on Solr slave when snapshot installSo assuming you set up a few sample sort queries to run in the firstSearcher
config, and had very low query volume during that ten minutes so that there were no evictions before a new Searcher was loaded, would those queries run by the firstSearcher be passed along to the cache for the next Searcher as part of the autowarm? If so, it seems like you might want to load a few sort queries for the firstSearcher, but might not need any included in the newSearcher? -Jay On Mon, Nov 2, 2009 at 4:26 PM, Mark Miller <markrmiller@...> wrote: > Hmm...I think you have to setup warming queries yourself and that autowarm > just copies entries from the old cache to the new cache, rather than issuing > queries - the value is how many entries it will copy. Though that's still > going to take CPU and time. > > - Mark > > http://www.lucidimagination.com (mobile) > > > On Nov 2, 2009, at 12:47 PM, Walter Underwood <wunder@...> > wrote: > > If you are going to pull a new index every 10 minutes, try turning off >> cache autowarming. >> >> Your caches are never more than 10 minutes old, so spending a minute >> warming each new cache is a waste of CPU. Autowarm submits queries to the >> new Searcher before putting it in service. This will create a burst of query >> load on the new Searcher, often keeping one CPU pretty busy for several >> seconds. >> >> In solrconfig.xml, set autowarmCount to 0. >> >> Also, if you want the slaves to always have an optimized index, create the >> snapshot only in post-optimize. If you create snapshots in both post-commit >> and post-optimize, you are creating a non-optimized index (post-commit), >> then replacing it with an optimized one a few minutes later. A slave might >> get a non-optimized index one time, then an optimized one the next. >> >> wunder >> >> On Nov 2, 2009, at 1:45 AM, bikumar@... wrote: >> >> Hi Solr Gurus, >>> >>> We have solr in 1 master, 2 slave configuration. Snapshot is created post >>> commit, post optimization. We have autocommit after 50 documents or 5 >>> minutes. Snapshot puller runs as a cron every 10 minutes. What we have >>> observed is that whenever snapshot is installed on the slave, we see solrj >>> client used to query slave solr, gets timedout and there is high CPU >>> usage/load avg. on slave server. If we stop snapshot puller, then slaves >>> work with no issues. The system has been running since 2 months and this >>> issue has started to occur only now when load on website is increasing. >>> >>> Following are some details: >>> >>> Solr Details: >>> apache-solr Version: 1.3.0 >>> Lucene - 2.4-dev >>> >>> Master/Slave configurations: >>> >>> Master: >>> - for indexing data HTTPRequests are made on Solr server. >>> - autocommit feature is enabled for 50 docs and 5 minutes >>> - caching params are disable for this server >>> - mergeFactor of 10 is set >>> - we were running optimize script after every 2 hours, but now have >>> reduced the duration to twice a day but issue still persists >>> >>> Slave1/Slave2: >>> - standard requestHandler is being used >>> - default values of caching are set >>> Machine Specifications: >>> >>> Master: >>> - 4GB RAM >>> - 1GB JVM Heap memory is allocated to Solr >>> >>> Slave1/Slave2: >>> - 4GB RAM >>> - 2GB JVM Heap memory is allocated to Solr >>> >>> Master and Slave1 (solr1)are on single box and Slave2(solr2) on different >>> box. We use HAProxy to load balance query requests between 2 slaves. Master >>> is only used for indexing. >>> Please let us know if somebody has ever faced similar kind of issue or >>> has some insight into it as we guys are literally struck at the moment with >>> a very unstable production environment. >>> >>> As a workaround, we have started running optimize on master every 7 >>> minutes. This seems to have reduced the severity of the problem but still >>> issue occurs every 2days now. please suggest what could be the root cause of >>> this. >>> >>> Thanks, >>> Bipul >>> >>> >>> >>> >>> >> |
|
|
RE: CPU utilization and query time high on Solr slave when snapshot installHi Walter,
When the issue occurred, we did try to set autowarming off, but it did not solve the problem. The only thing which worked, was optimizing the slave index. But, what you say is logical and I will try it again. But, the basic question I have is, our solr index is not huge by any means. Secondly, I have read in wiki etc. that optmize has adverse impact on performance and hence should be done once a day. Then what is wrong in our case, that is the cause of performance (we serve just 4 req/sec)? Why is optimize fixing the issue contrary to normal belief. What will this workaround impact us as the index size increase? Regds, Bipul -----Original Message----- From: Walter Underwood [mailto:wunder@...] Sent: Monday, November 02, 2009 11:18 PM To: solr-user@... Subject: Re: CPU utilization and query time high on Solr slave when snapshot install If you are going to pull a new index every 10 minutes, try turning off cache autowarming. Your caches are never more than 10 minutes old, so spending a minute warming each new cache is a waste of CPU. Autowarm submits queries to the new Searcher before putting it in service. This will create a burst of query load on the new Searcher, often keeping one CPU pretty busy for several seconds. In solrconfig.xml, set autowarmCount to 0. Also, if you want the slaves to always have an optimized index, create the snapshot only in post-optimize. If you create snapshots in both post-commit and post-optimize, you are creating a non-optimized index (post-commit), then replacing it with an optimized one a few minutes later. A slave might get a non-optimized index one time, then an optimized one the next. wunder On Nov 2, 2009, at 1:45 AM, bikumar@... wrote: > Hi Solr Gurus, > > We have solr in 1 master, 2 slave configuration. Snapshot is created > post commit, post optimization. We have autocommit after 50 > documents or 5 minutes. Snapshot puller runs as a cron every 10 > minutes. What we have observed is that whenever snapshot is > installed on the slave, we see solrj client used to query slave > solr, gets timedout and there is high CPU usage/load avg. on slave > server. If we stop snapshot puller, then slaves work with no issues. > The system has been running since 2 months and this issue has > started to occur only now when load on website is increasing. > > Following are some details: > > Solr Details: > apache-solr Version: 1.3.0 > Lucene - 2.4-dev > > Master/Slave configurations: > > Master: > - for indexing data HTTPRequests are made on Solr server. > - autocommit feature is enabled for 50 docs and 5 minutes > - caching params are disable for this server > - mergeFactor of 10 is set > - we were running optimize script after every 2 hours, but now have > reduced the duration to twice a day but issue still persists > > Slave1/Slave2: > - standard requestHandler is being used > - default values of caching are set > Machine Specifications: > > Master: > - 4GB RAM > - 1GB JVM Heap memory is allocated to Solr > > Slave1/Slave2: > - 4GB RAM > - 2GB JVM Heap memory is allocated to Solr > > Master and Slave1 (solr1)are on single box and Slave2(solr2) on > different box. We use HAProxy to load balance query requests between > 2 slaves. Master is only used for indexing. > Please let us know if somebody has ever faced similar kind of issue > or has some insight into it as we guys are literally struck at the > moment with a very unstable production environment. > > As a workaround, we have started running optimize on master every 7 > minutes. This seems to have reduced the severity of the problem but > still issue occurs every 2days now. please suggest what could be the > root cause of this. > > Thanks, > Bipul > > > > |
|
|
Re: CPU utilization and query time high on Solr slave when snapshot installOptimizing an index takes CPU, but if you are doing it on a machine
dedicated to indexing, that does not matter. It will make queries faster. wunder On Nov 3, 2009, at 2:12 AM, bikumar@... wrote: > Hi Walter, > > When the issue occurred, we did try to set autowarming off, but it > did not solve the problem. The only thing which worked, was > optimizing the slave index. > But, what you say is logical and I will try it again. > > But, the basic question I have is, our solr index is not huge by any > means. Secondly, I have read in wiki etc. that optmize has adverse > impact on performance and hence should be done once a day. Then what > is wrong in our case, that is the cause of performance (we serve > just 4 req/sec)? Why is optimize fixing the issue contrary to normal > belief. What will this workaround impact us as the index size > increase? > > Regds, > Bipul > > -----Original Message----- > From: Walter Underwood [mailto:wunder@...] > Sent: Monday, November 02, 2009 11:18 PM > To: solr-user@... > Subject: Re: CPU utilization and query time high on Solr slave when > snapshot install > > If you are going to pull a new index every 10 minutes, try turning off > cache autowarming. > > Your caches are never more than 10 minutes old, so spending a minute > warming each new cache is a waste of CPU. Autowarm submits queries to > the new Searcher before putting it in service. This will create a > burst of query load on the new Searcher, often keeping one CPU pretty > busy for several seconds. > > In solrconfig.xml, set autowarmCount to 0. > > Also, if you want the slaves to always have an optimized index, create > the snapshot only in post-optimize. If you create snapshots in both > post-commit and post-optimize, you are creating a non-optimized index > (post-commit), then replacing it with an optimized one a few minutes > later. A slave might get a non-optimized index one time, then an > optimized one the next. > > wunder > > On Nov 2, 2009, at 1:45 AM, bikumar@... wrote: > >> Hi Solr Gurus, >> >> We have solr in 1 master, 2 slave configuration. Snapshot is created >> post commit, post optimization. We have autocommit after 50 >> documents or 5 minutes. Snapshot puller runs as a cron every 10 >> minutes. What we have observed is that whenever snapshot is >> installed on the slave, we see solrj client used to query slave >> solr, gets timedout and there is high CPU usage/load avg. on slave >> server. If we stop snapshot puller, then slaves work with no issues. >> The system has been running since 2 months and this issue has >> started to occur only now when load on website is increasing. >> >> Following are some details: >> >> Solr Details: >> apache-solr Version: 1.3.0 >> Lucene - 2.4-dev >> >> Master/Slave configurations: >> >> Master: >> - for indexing data HTTPRequests are made on Solr server. >> - autocommit feature is enabled for 50 docs and 5 minutes >> - caching params are disable for this server >> - mergeFactor of 10 is set >> - we were running optimize script after every 2 hours, but now have >> reduced the duration to twice a day but issue still persists >> >> Slave1/Slave2: >> - standard requestHandler is being used >> - default values of caching are set >> Machine Specifications: >> >> Master: >> - 4GB RAM >> - 1GB JVM Heap memory is allocated to Solr >> >> Slave1/Slave2: >> - 4GB RAM >> - 2GB JVM Heap memory is allocated to Solr >> >> Master and Slave1 (solr1)are on single box and Slave2(solr2) on >> different box. We use HAProxy to load balance query requests between >> 2 slaves. Master is only used for indexing. >> Please let us know if somebody has ever faced similar kind of issue >> or has some insight into it as we guys are literally struck at the >> moment with a very unstable production environment. >> >> As a workaround, we have started running optimize on master every 7 >> minutes. This seems to have reduced the severity of the problem but >> still issue occurs every 2days now. please suggest what could be the >> root cause of this. >> >> Thanks, >> Bipul >> >> >> >> > |
|
|
Re: CPU utilization and query time high on Solr slave when snapshot installWell thats what I get for typing on my iPhone when I'm not sure of my
memory - Lance called me out on this - I put "Hmm...I think" because I wasn't positive if I remembered right, but I had thought auto warming just populates based on the old entries (probably because of the javadoc for the CacheRegenerator) and that it was ignored for the query cache - whoops - ignore my below comment - the query cache does regenerate by reissuing the search. Mark Miller wrote: > Hmm...I think you have to setup warming queries yourself and that > autowarm just copies entries from the old cache to the new cache, > rather than issuing queries - the value is how many entries it will > copy. Though that's still going to take CPU and time. > > - Mark > > http://www.lucidimagination.com (mobile) > > On Nov 2, 2009, at 12:47 PM, Walter Underwood <wunder@...> > wrote: > >> If you are going to pull a new index every 10 minutes, try turning >> off cache autowarming. >> >> Your caches are never more than 10 minutes old, so spending a minute >> warming each new cache is a waste of CPU. Autowarm submits queries to >> the new Searcher before putting it in service. This will create a >> burst of query load on the new Searcher, often keeping one CPU pretty >> busy for several seconds. >> >> In solrconfig.xml, set autowarmCount to 0. >> >> Also, if you want the slaves to always have an optimized index, >> create the snapshot only in post-optimize. If you create snapshots in >> both post-commit and post-optimize, you are creating a non-optimized >> index (post-commit), then replacing it with an optimized one a few >> minutes later. A slave might get a non-optimized index one time, then >> an optimized one the next. >> >> wunder >> >> On Nov 2, 2009, at 1:45 AM, bikumar@... wrote: >> >>> Hi Solr Gurus, >>> >>> We have solr in 1 master, 2 slave configuration. Snapshot is created >>> post commit, post optimization. We have autocommit after 50 >>> documents or 5 minutes. Snapshot puller runs as a cron every 10 >>> minutes. What we have observed is that whenever snapshot is >>> installed on the slave, we see solrj client used to query slave >>> solr, gets timedout and there is high CPU usage/load avg. on slave >>> server. If we stop snapshot puller, then slaves work with no issues. >>> The system has been running since 2 months and this issue has >>> started to occur only now when load on website is increasing. >>> >>> Following are some details: >>> >>> Solr Details: >>> apache-solr Version: 1.3.0 >>> Lucene - 2.4-dev >>> >>> Master/Slave configurations: >>> >>> Master: >>> - for indexing data HTTPRequests are made on Solr server. >>> - autocommit feature is enabled for 50 docs and 5 minutes >>> - caching params are disable for this server >>> - mergeFactor of 10 is set >>> - we were running optimize script after every 2 hours, but now have >>> reduced the duration to twice a day but issue still persists >>> >>> Slave1/Slave2: >>> - standard requestHandler is being used >>> - default values of caching are set >>> Machine Specifications: >>> >>> Master: >>> - 4GB RAM >>> - 1GB JVM Heap memory is allocated to Solr >>> >>> Slave1/Slave2: >>> - 4GB RAM >>> - 2GB JVM Heap memory is allocated to Solr >>> >>> Master and Slave1 (solr1)are on single box and Slave2(solr2) on >>> different box. We use HAProxy to load balance query requests between >>> 2 slaves. Master is only used for indexing. >>> Please let us know if somebody has ever faced similar kind of issue >>> or has some insight into it as we guys are literally struck at the >>> moment with a very unstable production environment. >>> >>> As a workaround, we have started running optimize on master every 7 >>> minutes. This seems to have reduced the severity of the problem but >>> still issue occurs every 2days now. please suggest what could be the >>> root cause of this. >>> >>> Thanks, >>> Bipul >>> >>> >>> >>> >> -- - Mark http://www.lucidimagination.com |
|
|
Re: CPU utilization and query time high on Solr slave when snapshot installBipul,
From what you wrote, I would guess this index is pretty small, so the OS can quickly cache it all over again, even after you optimize it. Are you saying your slaves hit the limit at 4 reqs/second? Are those single CPU core slaves? Otis -- Sematext is hiring -- http://sematext.com/about/jobs.html?mls Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR ----- Original Message ---- > From: "bikumar@..." <bikumar@...> > To: "solr-user@..." <solr-user@...> > Sent: Tue, November 3, 2009 5:12:04 AM > Subject: RE: CPU utilization and query time high on Solr slave when snapshot install > > Hi Walter, > > When the issue occurred, we did try to set autowarming off, but it did not solve > the problem. The only thing which worked, was optimizing the slave index. > But, what you say is logical and I will try it again. > > But, the basic question I have is, our solr index is not huge by any means. > Secondly, I have read in wiki etc. that optmize has adverse impact on > performance and hence should be done once a day. Then what is wrong in our case, > that is the cause of performance (we serve just 4 req/sec)? Why is optimize > fixing the issue contrary to normal belief. What will this workaround impact us > as the index size increase? > > Regds, > Bipul > > -----Original Message----- > From: Walter Underwood [mailto:wunder@...] > Sent: Monday, November 02, 2009 11:18 PM > To: solr-user@... > Subject: Re: CPU utilization and query time high on Solr slave when snapshot > install > > If you are going to pull a new index every 10 minutes, try turning off > cache autowarming. > > Your caches are never more than 10 minutes old, so spending a minute > warming each new cache is a waste of CPU. Autowarm submits queries to > the new Searcher before putting it in service. This will create a > burst of query load on the new Searcher, often keeping one CPU pretty > busy for several seconds. > > In solrconfig.xml, set autowarmCount to 0. > > Also, if you want the slaves to always have an optimized index, create > the snapshot only in post-optimize. If you create snapshots in both > post-commit and post-optimize, you are creating a non-optimized index > (post-commit), then replacing it with an optimized one a few minutes > later. A slave might get a non-optimized index one time, then an > optimized one the next. > > wunder > > On Nov 2, 2009, at 1:45 AM, bikumar@... wrote: > > > Hi Solr Gurus, > > > > We have solr in 1 master, 2 slave configuration. Snapshot is created > > post commit, post optimization. We have autocommit after 50 > > documents or 5 minutes. Snapshot puller runs as a cron every 10 > > minutes. What we have observed is that whenever snapshot is > > installed on the slave, we see solrj client used to query slave > > solr, gets timedout and there is high CPU usage/load avg. on slave > > server. If we stop snapshot puller, then slaves work with no issues. > > The system has been running since 2 months and this issue has > > started to occur only now when load on website is increasing. > > > > Following are some details: > > > > Solr Details: > > apache-solr Version: 1.3.0 > > Lucene - 2.4-dev > > > > Master/Slave configurations: > > > > Master: > > - for indexing data HTTPRequests are made on Solr server. > > - autocommit feature is enabled for 50 docs and 5 minutes > > - caching params are disable for this server > > - mergeFactor of 10 is set > > - we were running optimize script after every 2 hours, but now have > > reduced the duration to twice a day but issue still persists > > > > Slave1/Slave2: > > - standard requestHandler is being used > > - default values of caching are set > > Machine Specifications: > > > > Master: > > - 4GB RAM > > - 1GB JVM Heap memory is allocated to Solr > > > > Slave1/Slave2: > > - 4GB RAM > > - 2GB JVM Heap memory is allocated to Solr > > > > Master and Slave1 (solr1)are on single box and Slave2(solr2) on > > different box. We use HAProxy to load balance query requests between > > 2 slaves. Master is only used for indexing. > > Please let us know if somebody has ever faced similar kind of issue > > or has some insight into it as we guys are literally struck at the > > moment with a very unstable production environment. > > > > As a workaround, we have started running optimize on master every 7 > > minutes. This seems to have reduced the severity of the problem but > > still issue occurs every 2days now. please suggest what could be the > > root cause of this. > > > > Thanks, > > Bipul > > > > > > > > |
|
|
RE: CPU utilization and query time high on Solr slave when snapshot installHi Otis,
There are 2 search servers are in a virtualized VMware environment. Each has 2 instances of Solr running on separates ports in tomcat. Server 1: hosts 1 master(application 1), 1 slave (application 1) Server 2: hosta 1 master (application 2), 1 slave (application 1) Both servers have 4 CPUs and 4 GB RAM. The problem is reported on slaves for application 1. The SolrJ client which queries Solr over HTTP times out (10 sec is the timeout value) though in the Solr tomcat access log we find all requests have 200 response. During the time, requests timeout the load avg. of the server goes extremely high (10-20). The issue gets resolved as soon as we optimize the slave index. In the solr admin, it shows only 4 requests/sec is handled with 400 ms response time. I do not think this is in any way high traffic and Solr theoritically can support much more volume of traffic. Hence, there is some thing seriously wrong in the setup or configuration which I am not able to figure out. Does something clicks in your mind which we should check? Thanks, Bipul -----Original Message----- From: Otis Gospodnetic [mailto:otis_gospodnetic@...] Sent: Thursday, November 05, 2009 10:32 AM To: solr-user@... Subject: Re: CPU utilization and query time high on Solr slave when snapshot install Bipul, From what you wrote, I would guess this index is pretty small, so the OS can quickly cache it all over again, even after you optimize it. Are you saying your slaves hit the limit at 4 reqs/second? Are those single CPU core slaves? Otis -- Sematext is hiring -- http://sematext.com/about/jobs.html?mls Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR ----- Original Message ---- > From: "bikumar@..." <bikumar@...> > To: "solr-user@..." <solr-user@...> > Sent: Tue, November 3, 2009 5:12:04 AM > Subject: RE: CPU utilization and query time high on Solr slave when snapshot install > > Hi Walter, > > When the issue occurred, we did try to set autowarming off, but it did not solve > the problem. The only thing which worked, was optimizing the slave index. > But, what you say is logical and I will try it again. > > But, the basic question I have is, our solr index is not huge by any means. > Secondly, I have read in wiki etc. that optmize has adverse impact on > performance and hence should be done once a day. Then what is wrong in our case, > that is the cause of performance (we serve just 4 req/sec)? Why is optimize > fixing the issue contrary to normal belief. What will this workaround impact us > as the index size increase? > > Regds, > Bipul > > -----Original Message----- > From: Walter Underwood [mailto:wunder@...] > Sent: Monday, November 02, 2009 11:18 PM > To: solr-user@... > Subject: Re: CPU utilization and query time high on Solr slave when snapshot > install > > If you are going to pull a new index every 10 minutes, try turning off > cache autowarming. > > Your caches are never more than 10 minutes old, so spending a minute > warming each new cache is a waste of CPU. Autowarm submits queries to > the new Searcher before putting it in service. This will create a > burst of query load on the new Searcher, often keeping one CPU pretty > busy for several seconds. > > In solrconfig.xml, set autowarmCount to 0. > > Also, if you want the slaves to always have an optimized index, create > the snapshot only in post-optimize. If you create snapshots in both > post-commit and post-optimize, you are creating a non-optimized index > (post-commit), then replacing it with an optimized one a few minutes > later. A slave might get a non-optimized index one time, then an > optimized one the next. > > wunder > > On Nov 2, 2009, at 1:45 AM, bikumar@... wrote: > > > Hi Solr Gurus, > > > > We have solr in 1 master, 2 slave configuration. Snapshot is created > > post commit, post optimization. We have autocommit after 50 > > documents or 5 minutes. Snapshot puller runs as a cron every 10 > > minutes. What we have observed is that whenever snapshot is > > installed on the slave, we see solrj client used to query slave > > solr, gets timedout and there is high CPU usage/load avg. on slave > > server. If we stop snapshot puller, then slaves work with no issues. > > The system has been running since 2 months and this issue has > > started to occur only now when load on website is increasing. > > > > Following are some details: > > > > Solr Details: > > apache-solr Version: 1.3.0 > > Lucene - 2.4-dev > > > > Master/Slave configurations: > > > > Master: > > - for indexing data HTTPRequests are made on Solr server. > > - autocommit feature is enabled for 50 docs and 5 minutes > > - caching params are disable for this server > > - mergeFactor of 10 is set > > - we were running optimize script after every 2 hours, but now have > > reduced the duration to twice a day but issue still persists > > > > Slave1/Slave2: > > - standard requestHandler is being used > > - default values of caching are set > > Machine Specifications: > > > > Master: > > - 4GB RAM > > - 1GB JVM Heap memory is allocated to Solr > > > > Slave1/Slave2: > > - 4GB RAM > > - 2GB JVM Heap memory is allocated to Solr > > > > Master and Slave1 (solr1)are on single box and Slave2(solr2) on > > different box. We use HAProxy to load balance query requests between > > 2 slaves. Master is only used for indexing. > > Please let us know if somebody has ever faced similar kind of issue > > or has some insight into it as we guys are literally struck at the > > moment with a very unstable production environment. > > > > As a workaround, we have started running optimize on master every 7 > > minutes. This seems to have reduced the severity of the problem but > > still issue occurs every 2days now. please suggest what could be the > > root cause of this. > > > > Thanks, > > Bipul > > > > > > > > |
|
|
Re: CPU utilization and query time high on Solr slave when snapshot installBigger heap?
Non-VMWare boxes? Otis -- Sematext is hiring -- http://sematext.com/about/jobs.html?mls Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR ----- Original Message ---- > From: Bipul Kumar II <bikumar@...> > To: "solr-user@..." <solr-user@...> > Sent: Thu, November 5, 2009 12:25:14 AM > Subject: RE: CPU utilization and query time high on Solr slave when snapshot install > > Hi Otis, > > There are 2 search servers are in a virtualized VMware environment. Each has 2 > instances of Solr running on separates ports in tomcat. > Server 1: hosts 1 master(application 1), 1 slave (application 1) > Server 2: hosta 1 master (application 2), 1 slave (application 1) > > Both servers have 4 CPUs and 4 GB RAM. > > The problem is reported on slaves for application 1. The SolrJ client which > queries Solr over HTTP times out (10 sec is the timeout value) though in the > Solr tomcat access log we find all requests have 200 response. During the time, > requests timeout the load avg. of the server goes extremely high (10-20). > The issue gets resolved as soon as we optimize the slave index. > > In the solr admin, it shows only 4 requests/sec is handled with 400 ms response > time. > > I do not think this is in any way high traffic and Solr theoritically can > support much more volume of traffic. Hence, there is some thing seriously wrong > in the setup or configuration which I am not able to figure out. > > Does something clicks in your mind which we should check? > > Thanks, > Bipul > > -----Original Message----- > From: Otis Gospodnetic [mailto:otis_gospodnetic@...] > Sent: Thursday, November 05, 2009 10:32 AM > To: solr-user@... > Subject: Re: CPU utilization and query time high on Solr slave when snapshot > install > > Bipul, > > From what you wrote, I would guess this index is pretty small, so the OS can > quickly cache it all over again, even after you optimize it. > Are you saying your slaves hit the limit at 4 reqs/second? > Are those single CPU core slaves? > > Otis > -- > Sematext is hiring -- http://sematext.com/about/jobs.html?mls > Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR > > > > ----- Original Message ---- > > From: "bikumar@..." > > To: "solr-user@..." > > Sent: Tue, November 3, 2009 5:12:04 AM > > Subject: RE: CPU utilization and query time high on Solr slave when snapshot > install > > > > Hi Walter, > > > > When the issue occurred, we did try to set autowarming off, but it did not > solve > > the problem. The only thing which worked, was optimizing the slave index. > > But, what you say is logical and I will try it again. > > > > But, the basic question I have is, our solr index is not huge by any means. > > Secondly, I have read in wiki etc. that optmize has adverse impact on > > performance and hence should be done once a day. Then what is wrong in our > case, > > that is the cause of performance (we serve just 4 req/sec)? Why is optimize > > fixing the issue contrary to normal belief. What will this workaround impact > us > > as the index size increase? > > > > Regds, > > Bipul > > > > -----Original Message----- > > From: Walter Underwood [mailto:wunder@...] > > Sent: Monday, November 02, 2009 11:18 PM > > To: solr-user@... > > Subject: Re: CPU utilization and query time high on Solr slave when snapshot > > install > > > > If you are going to pull a new index every 10 minutes, try turning off > > cache autowarming. > > > > Your caches are never more than 10 minutes old, so spending a minute > > warming each new cache is a waste of CPU. Autowarm submits queries to > > the new Searcher before putting it in service. This will create a > > burst of query load on the new Searcher, often keeping one CPU pretty > > busy for several seconds. > > > > In solrconfig.xml, set autowarmCount to 0. > > > > Also, if you want the slaves to always have an optimized index, create > > the snapshot only in post-optimize. If you create snapshots in both > > post-commit and post-optimize, you are creating a non-optimized index > > (post-commit), then replacing it with an optimized one a few minutes > > later. A slave might get a non-optimized index one time, then an > > optimized one the next. > > > > wunder > > > > On Nov 2, 2009, at 1:45 AM, bikumar@... wrote: > > > > > Hi Solr Gurus, > > > > > > We have solr in 1 master, 2 slave configuration. Snapshot is created > > > post commit, post optimization. We have autocommit after 50 > > > documents or 5 minutes. Snapshot puller runs as a cron every 10 > > > minutes. What we have observed is that whenever snapshot is > > > installed on the slave, we see solrj client used to query slave > > > solr, gets timedout and there is high CPU usage/load avg. on slave > > > server. If we stop snapshot puller, then slaves work with no issues. > > > The system has been running since 2 months and this issue has > > > started to occur only now when load on website is increasing. > > > > > > Following are some details: > > > > > > Solr Details: > > > apache-solr Version: 1.3.0 > > > Lucene - 2.4-dev > > > > > > Master/Slave configurations: > > > > > > Master: > > > - for indexing data HTTPRequests are made on Solr server. > > > - autocommit feature is enabled for 50 docs and 5 minutes > > > - caching params are disable for this server > > > - mergeFactor of 10 is set > > > - we were running optimize script after every 2 hours, but now have > > > reduced the duration to twice a day but issue still persists > > > > > > Slave1/Slave2: > > > - standard requestHandler is being used > > > - default values of caching are set > > > Machine Specifications: > > > > > > Master: > > > - 4GB RAM > > > - 1GB JVM Heap memory is allocated to Solr > > > > > > Slave1/Slave2: > > > - 4GB RAM > > > - 2GB JVM Heap memory is allocated to Solr > > > > > > Master and Slave1 (solr1)are on single box and Slave2(solr2) on > > > different box. We use HAProxy to load balance query requests between > > > 2 slaves. Master is only used for indexing. > > > Please let us know if somebody has ever faced similar kind of issue > > > or has some insight into it as we guys are literally struck at the > > > moment with a very unstable production environment. > > > > > > As a workaround, we have started running optimize on master every 7 > > > minutes. This seems to have reduced the severity of the problem but > > > still issue occurs every 2days now. please suggest what could be the > > > root cause of this. > > > > > > Thanks, > > > Bipul > > > > > > > > > > > > |
|
|
RE: CPU utilization and query time high on Solr slave when snapshot installThese machines are 32 bit so max we cal allocate is 2GB to the process. Non-VMWare boxes are not a possibility. Are there any known issue with ESXVMware and solr? What kind of h/w will be required to maintain 15-20 mb index and query lets say 10,100, 200 req/sec. we are struggling with 4-5 req/sec.
Thanks, Bipul -----Original Message----- From: Otis Gospodnetic [mailto:otis_gospodnetic@...] Sent: Thursday, November 05, 2009 11:08 AM To: solr-user@... Subject: Re: CPU utilization and query time high on Solr slave when snapshot install Bigger heap? Non-VMWare boxes? Otis -- Sematext is hiring -- http://sematext.com/about/jobs.html?mls Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR ----- Original Message ---- > From: Bipul Kumar II <bikumar@...> > To: "solr-user@..." <solr-user@...> > Sent: Thu, November 5, 2009 12:25:14 AM > Subject: RE: CPU utilization and query time high on Solr slave when snapshot install > > Hi Otis, > > There are 2 search servers are in a virtualized VMware environment. Each has 2 > instances of Solr running on separates ports in tomcat. > Server 1: hosts 1 master(application 1), 1 slave (application 1) > Server 2: hosta 1 master (application 2), 1 slave (application 1) > > Both servers have 4 CPUs and 4 GB RAM. > > The problem is reported on slaves for application 1. The SolrJ client which > queries Solr over HTTP times out (10 sec is the timeout value) though in the > Solr tomcat access log we find all requests have 200 response. During the time, > requests timeout the load avg. of the server goes extremely high (10-20). > The issue gets resolved as soon as we optimize the slave index. > > In the solr admin, it shows only 4 requests/sec is handled with 400 ms response > time. > > I do not think this is in any way high traffic and Solr theoritically can > support much more volume of traffic. Hence, there is some thing seriously wrong > in the setup or configuration which I am not able to figure out. > > Does something clicks in your mind which we should check? > > Thanks, > Bipul > > -----Original Message----- > From: Otis Gospodnetic [mailto:otis_gospodnetic@...] > Sent: Thursday, November 05, 2009 10:32 AM > To: solr-user@... > Subject: Re: CPU utilization and query time high on Solr slave when snapshot > install > > Bipul, > > From what you wrote, I would guess this index is pretty small, so the OS can > quickly cache it all over again, even after you optimize it. > Are you saying your slaves hit the limit at 4 reqs/second? > Are those single CPU core slaves? > > Otis > -- > Sematext is hiring -- http://sematext.com/about/jobs.html?mls > Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR > > > > ----- Original Message ---- > > From: "bikumar@..." > > To: "solr-user@..." > > Sent: Tue, November 3, 2009 5:12:04 AM > > Subject: RE: CPU utilization and query time high on Solr slave when snapshot > install > > > > Hi Walter, > > > > When the issue occurred, we did try to set autowarming off, but it did not > solve > > the problem. The only thing which worked, was optimizing the slave index. > > But, what you say is logical and I will try it again. > > > > But, the basic question I have is, our solr index is not huge by any means. > > Secondly, I have read in wiki etc. that optmize has adverse impact on > > performance and hence should be done once a day. Then what is wrong in our > case, > > that is the cause of performance (we serve just 4 req/sec)? Why is optimize > > fixing the issue contrary to normal belief. What will this workaround impact > us > > as the index size increase? > > > > Regds, > > Bipul > > > > -----Original Message----- > > From: Walter Underwood [mailto:wunder@...] > > Sent: Monday, November 02, 2009 11:18 PM > > To: solr-user@... > > Subject: Re: CPU utilization and query time high on Solr slave when snapshot > > install > > > > If you are going to pull a new index every 10 minutes, try turning off > > cache autowarming. > > > > Your caches are never more than 10 minutes old, so spending a minute > > warming each new cache is a waste of CPU. Autowarm submits queries to > > the new Searcher before putting it in service. This will create a > > burst of query load on the new Searcher, often keeping one CPU pretty > > busy for several seconds. > > > > In solrconfig.xml, set autowarmCount to 0. > > > > Also, if you want the slaves to always have an optimized index, create > > the snapshot only in post-optimize. If you create snapshots in both > > post-commit and post-optimize, you are creating a non-optimized index > > (post-commit), then replacing it with an optimized one a few minutes > > later. A slave might get a non-optimized index one time, then an > > optimized one the next. > > > > wunder > > > > On Nov 2, 2009, at 1:45 AM, bikumar@... wrote: > > > > > Hi Solr Gurus, > > > > > > We have solr in 1 master, 2 slave configuration. Snapshot is created > > > post commit, post optimization. We have autocommit after 50 > > > documents or 5 minutes. Snapshot puller runs as a cron every 10 > > > minutes. What we have observed is that whenever snapshot is > > > installed on the slave, we see solrj client used to query slave > > > solr, gets timedout and there is high CPU usage/load avg. on slave > > > server. If we stop snapshot puller, then slaves work with no issues. > > > The system has been running since 2 months and this issue has > > > started to occur only now when load on website is increasing. > > > > > > Following are some details: > > > > > > Solr Details: > > > apache-solr Version: 1.3.0 > > > Lucene - 2.4-dev > > > > > > Master/Slave configurations: > > > > > > Master: > > > - for indexing data HTTPRequests are made on Solr server. > > > - autocommit feature is enabled for 50 docs and 5 minutes > > > - caching params are disable for this server > > > - mergeFactor of 10 is set > > > - we were running optimize script after every 2 hours, but now have > > > reduced the duration to twice a day but issue still persists > > > > > > Slave1/Slave2: > > > - standard requestHandler is being used > > > - default values of caching are set > > > Machine Specifications: > > > > > > Master: > > > - 4GB RAM > > > - 1GB JVM Heap memory is allocated to Solr > > > > > > Slave1/Slave2: > > > - 4GB RAM > > > - 2GB JVM Heap memory is allocated to Solr > > > > > > Master and Slave1 (solr1)are on single box and Slave2(solr2) on > > > different box. We use HAProxy to load balance query requests between > > > 2 slaves. Master is only used for indexing. > > > Please let us know if somebody has ever faced similar kind of issue > > > or has some insight into it as we guys are literally struck at the > > > moment with a very unstable production environment. > > > > > > As a workaround, we have started running optimize on master every 7 > > > minutes. This seems to have reduced the severity of the problem but > > > still issue occurs every 2days now. please suggest what could be the > > > root cause of this. > > > > > > Thanks, > > > Bipul > > > > > > > > > > > > |
|
|
RE: CPU utilization and query time high on Solr slave when snapshot installHi Bipul,
Is it possible to attach solrconfig.xml present within Master and Slave core's? ~Vikrant
|
|
|
Re: CPU utilization and query time high on Solr slave when snapshot installWhile I don't have anything specific relating to VMWare boxes, we're
able to test with dedicated hardware up to 400 qps pretty easily. We're on dual quad core servers with either 16 or 32Gb of ram (I can't quite remember which). I'd check that you're not having some sort of IO issue on your box that's hosting the VMWare servers, be it disk or network. With a 20 or 30 meg index, it should be very easy to scale up to hundreds of queries per second. Are you using Tomcat? If so, what is your maxThreads setting set to? --Matthew Runo On Fri, Nov 6, 2009 at 9:25 AM, Vicky_Dev <vikrantv_shirbhate@...> wrote: > > Hi Bipul, > > Is it possible to attach solrconfig.xml present within Master and Slave > core's? > > > ~Vikrant > > > > > bikumar@... wrote: >> >> Hi Solr Gurus, >> >> We have solr in 1 master, 2 slave configuration. Snapshot is created post >> commit, post optimization. We have autocommit after 50 documents or 5 >> minutes. Snapshot puller runs as a cron every 10 minutes. What we have >> observed is that whenever snapshot is installed on the slave, we see solrj >> client used to query slave solr, gets timedout and there is high CPU >> usage/load avg. on slave server. If we stop snapshot puller, then slaves >> work with no issues. The system has been running since 2 months and this >> issue has started to occur only now when load on website is increasing. >> >> Following are some details: >> >> Solr Details: >> apache-solr Version: 1.3.0 >> Lucene - 2.4-dev >> >> Master/Slave configurations: >> >> Master: >> - for indexing data HTTPRequests are made on Solr server. >> - autocommit feature is enabled for 50 docs and 5 minutes >> - caching params are disable for this server >> - mergeFactor of 10 is set >> - we were running optimize script after every 2 hours, but now have >> reduced the duration to twice a day but issue still persists >> >> Slave1/Slave2: >> - standard requestHandler is being used >> - default values of caching are set >> Machine Specifications: >> >> Master: >> - 4GB RAM >> - 1GB JVM Heap memory is allocated to Solr >> >> Slave1/Slave2: >> - 4GB RAM >> - 2GB JVM Heap memory is allocated to Solr >> >> Master and Slave1 (solr1)are on single box and Slave2(solr2) on different >> box. We use HAProxy to load balance query requests between 2 slaves. >> Master is only used for indexing. >> Please let us know if somebody has ever faced similar kind of issue or has >> some insight into it as we guys are literally struck at the moment with a >> very unstable production environment. >> >> As a workaround, we have started running optimize on master every 7 >> minutes. This seems to have reduced the severity of the problem but still >> issue occurs every 2days now. please suggest what could be the root cause >> of this. >> >> Thanks, >> Bipul >> >> >> >> >> >> > > -- > View this message in context: http://old.nabble.com/RE%3A-CPU-utilization-and-query-time-high-on-Solr-slave-when-snapshot-install-tp26161174p26230840.html > Sent from the Solr - User mailing list archive at Nabble.com. > > |
|
|
Solr Replication: How to restore data from last snapshotHi,
I have followed Solr set up ReplicationHandler for index replication to slave. Do anyone know how to restore corrupted index from snapshot in master, and force replication of the restored index to slave? Thanks, Osborn |
|
|
Re: Solr Replication: How to restore data from last snapshotIf your master index is corrupt and it hasn't been replicated out, you
should be able to shut down the server and remove the corrupted index files. Then copy the replicated index back onto the master and start everything back up. As far as I know, the indexes on the replicated slaves are exactly what you'd have on the master, so this method should work. --Matthew Runo On Fri, Nov 6, 2009 at 11:41 AM, Osborn Chan <ochan@...> wrote: > Hi, > > I have followed Solr set up ReplicationHandler for index replication to slave. > Do anyone know how to restore corrupted index from snapshot in master, and force replication of the restored index to slave? > > > Thanks, > > Osborn > |
|
|
RE: Solr Replication: How to restore data from last snapshotThanks. But I have following use cases:
1) Master index is corrupted, but it didn't replicate to slave servers. - In this case, I only need to restore to last snapshot. 2) Master index is corrupted, and it has replicated to slave servers. - In this case, I need to restore to last snapshot, and make sure slave servers replicate the restored index from index server as well. Assuming both cases are in production environment, and I cannot shutdown the master and slave servers. Is there any rest API call or something else I can do without manually using linux command and restart? Thanks, Osborn -----Original Message----- From: Matthew Runo [mailto:matthew.runo@...] Sent: Friday, November 06, 2009 12:20 PM To: solr-user@... Subject: Re: Solr Replication: How to restore data from last snapshot If your master index is corrupt and it hasn't been replicated out, you should be able to shut down the server and remove the corrupted index files. Then copy the replicated index back onto the master and start everything back up. As far as I know, the indexes on the replicated slaves are exactly what you'd have on the master, so this method should work. --Matthew Runo On Fri, Nov 6, 2009 at 11:41 AM, Osborn Chan <ochan@...> wrote: > Hi, > > I have followed Solr set up ReplicationHandler for index replication to slave. > Do anyone know how to restore corrupted index from snapshot in master, and force replication of the restored index to slave? > > > Thanks, > > Osborn > |
|
|
Re: Solr Replication: How to restore data from last snapshotif it is a single core you will have to restart the master
On Sat, Nov 7, 2009 at 1:55 AM, Osborn Chan <ochan@...> wrote: > Thanks. But I have following use cases: > > 1) Master index is corrupted, but it didn't replicate to slave servers. > - In this case, I only need to restore to last snapshot. > 2) Master index is corrupted, and it has replicated to slave servers. > - In this case, I need to restore to last snapshot, and make sure slave servers replicate the restored index from index server as well. > > Assuming both cases are in production environment, and I cannot shutdown the master and slave servers. > Is there any rest API call or something else I can do without manually using linux command and restart? > > Thanks, > > Osborn > > -----Original Message----- > From: Matthew Runo [mailto:matthew.runo@...] > Sent: Friday, November 06, 2009 12:20 PM > To: solr-user@... > Subject: Re: Solr Replication: How to restore data from last snapshot > > If your master index is corrupt and it hasn't been replicated out, you > should be able to shut down the server and remove the corrupted index > files. Then copy the replicated index back onto the master and start > everything back up. > > As far as I know, the indexes on the replicated slaves are exactly > what you'd have on the master, so this method should work. > > --Matthew Runo > > On Fri, Nov 6, 2009 at 11:41 AM, Osborn Chan <ochan@...> wrote: >> Hi, >> >> I have followed Solr set up ReplicationHandler for index replication to slave. >> Do anyone know how to restore corrupted index from snapshot in master, and force replication of the restored index to slave? >> >> >> Thanks, >> >> Osborn >> > -- ----------------------------------------------------- Noble Paul | Principal Engineer| AOL | http://aol.com |
|
|
Re: Solr Replication: How to restore data from last snapshot: Subject: Solr Replication: How to restore data from last snapshot : References: <8950E934DB69A040A1783438E67293D813DA3F6267@...> : <26230840.post@...> : <f4d979d0911061116l14fa3a49sc5a71d855857ca13@...> : In-Reply-To: <f4d979d0911061116l14fa3a49sc5a71d855857ca13@...> http://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists When starting a new discussion on a mailing list, please do not reply to an existing message, instead start a fresh email. Even if you change the subject line of your email, other mail headers still track which thread you replied to and your question is "hidden" in that thread and gets less attention. It makes following discussions in the mailing list archives particularly difficult. See Also: http://en.wikipedia.org/wiki/Thread_hijacking -Hoss |
|
|
RE: Solr Replication: How to restore data from last snapshotWhat happen if it is multiple core?
Thanks -----Original Message----- From: noble.paul@... [mailto:noble.paul@...] On Behalf Of Noble Paul ??????? ?????? Sent: Friday, November 06, 2009 10:49 PM To: solr-user@... Subject: Re: Solr Replication: How to restore data from last snapshot if it is a single core you will have to restart the master On Sat, Nov 7, 2009 at 1:55 AM, Osborn Chan <ochan@...> wrote: > Thanks. But I have following use cases: > > 1) Master index is corrupted, but it didn't replicate to slave servers. > - In this case, I only need to restore to last snapshot. > 2) Master index is corrupted, and it has replicated to slave servers. > - In this case, I need to restore to last snapshot, and make sure slave servers replicate the restored index from index server as well. > > Assuming both cases are in production environment, and I cannot shutdown the master and slave servers. > Is there any rest API call or something else I can do without manually using linux command and restart? > > Thanks, > > Osborn > > -----Original Message----- > From: Matthew Runo [mailto:matthew.runo@...] > Sent: Friday, November 06, 2009 12:20 PM > To: solr-user@... > Subject: Re: Solr Replication: How to restore data from last snapshot > > If your master index is corrupt and it hasn't been replicated out, you > should be able to shut down the server and remove the corrupted index > files. Then copy the replicated index back onto the master and start > everything back up. > > As far as I know, the indexes on the replicated slaves are exactly > what you'd have on the master, so this method should work. > > --Matthew Runo > > On Fri, Nov 6, 2009 at 11:41 AM, Osborn Chan <ochan@...> wrote: >> Hi, >> >> I have followed Solr set up ReplicationHandler for index replication to slave. >> Do anyone know how to restore corrupted index from snapshot in master, and force replication of the restored index to slave? >> >> >> Thanks, >> >> Osborn >> > -- ----------------------------------------------------- Noble Paul | Principal Engineer| AOL | http://aol.com |
|
|
RE: CPU utilization and query time high on Solr slave when snapshot installHi,
Matthew Runo: tomcat connection setting are : maxThreads="150" minSpareThreads="25" maxSpareThreads="75" acceptCount="150". You also mentioned about checking if there is any IO issue. Can you give me some pointers how can I test that and what should be the benchmark to verify it against? Vicky_Dev: Please find attached the solrconfig for master and slave attached. Let me know in case you find something alarming. Thanks, Bipul -----Original Message----- From: Vicky_Dev [mailto:vikrantv_shirbhate@...] Sent: Friday, November 06, 2009 10:55 PM To: solr-user@... Subject: RE: CPU utilization and query time high on Solr slave when snapshot install Hi Bipul, Is it possible to attach solrconfig.xml present within Master and Slave core's? ~Vikrant bikumar@... wrote: > > Hi Solr Gurus, > > We have solr in 1 master, 2 slave configuration. Snapshot is created post > commit, post optimization. We have autocommit after 50 documents or 5 > minutes. Snapshot puller runs as a cron every 10 minutes. What we have > observed is that whenever snapshot is installed on the slave, we see solrj > client used to query slave solr, gets timedout and there is high CPU > usage/load avg. on slave server. If we stop snapshot puller, then slaves > work with no issues. The system has been running since 2 months and this > issue has started to occur only now when load on website is increasing. > > Following are some details: > > Solr Details: > apache-solr Version: 1.3.0 > Lucene - 2.4-dev > > Master/Slave configurations: > > Master: > - for indexing data HTTPRequests are made on Solr server. > - autocommit feature is enabled for 50 docs and 5 minutes > - caching params are disable for this server > - mergeFactor of 10 is set > - we were running optimize script after every 2 hours, but now have > reduced the duration to twice a day but issue still persists > > Slave1/Slave2: > - standard requestHandler is being used > - default values of caching are set > Machine Specifications: > > Master: > - 4GB RAM > - 1GB JVM Heap memory is allocated to Solr > > Slave1/Slave2: > - 4GB RAM > - 2GB JVM Heap memory is allocated to Solr > > Master and Slave1 (solr1)are on single box and Slave2(solr2) on different > box. We use HAProxy to load balance query requests between 2 slaves. > Master is only used for indexing. > Please let us know if somebody has ever faced similar kind of issue or has > some insight into it as we guys are literally struck at the moment with a > very unstable production environment. > > As a workaround, we have started running optimize on master every 7 > minutes. This seems to have reduced the severity of the problem but still > issue occurs every 2days now. please suggest what could be the root cause > of this. > > Thanks, > Bipul > > > > > > View this message in context: http://old.nabble.com/RE%3A-CPU-utilization-and-query-time-high-on-Solr-slave-when-snapshot-install-tp26161174p26230840.html Sent from the Solr - User mailing list archive at Nabble.com. <?xml version="1.0" encoding="UTF-8" ?> <!-- Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. --> <config> <!-- Set this to 'false' if you want solr to continue working after it has encountered an severe configuration error. In a production environment, you may want solr to keep working even if one handler is mis-configured. You may also set this to false using by setting the system property: -Dsolr.abortOnConfigurationError=false --> <abortOnConfigurationError>${solr.abortOnConfigurationError:true}</abortOnConfigurationError> <!-- Used to specify an alternate directory to hold all index data other than the default ./data under the Solr home. If replication is in use, this should match the replication configuration. --> <dataDir>${solr.data.dir:/opt/solr/solr_slave/solr/data}</dataDir> <indexDefaults> <!-- Values here affect all index writers and act as a default unless overridden. --> <useCompoundFile>false</useCompoundFile> <!-- <mergeFactor>10</mergeFactor> --> <!-- If both ramBufferSizeMB and maxBufferedDocs is set, then Lucene will flush based on whichever limit is hit first. --> <!--<maxBufferedDocs>1000</maxBufferedDocs>--> <!-- Tell Lucene when to flush documents to disk. Giving Lucene more memory for indexing means faster indexing at the cost of more RAM If both ramBufferSizeMB and maxBufferedDocs is set, then Lucene will flush based on whichever limit is hit first. --> <ramBufferSizeMB>256</ramBufferSizeMB> <maxMergeDocs>2147483647</maxMergeDocs> <maxFieldLength>10000</maxFieldLength> <writeLockTimeout>1000</writeLockTimeout> <commitLockTimeout>10000</commitLockTimeout> <!-- Expert: Turn on Lucene's auto commit capability. This causes intermediate segment flushes to write a new lucene index descriptor, enabling it to be opened by an external IndexReader. NOTE: Despite the name, this value does not have any relation to Solr's autoCommit functionality --> <!--<luceneAutoCommit>false</luceneAutoCommit>--> <!-- Expert: The Merge Policy in Lucene controls how merging is handled by Lucene. The default in 2.3 is the LogByteSizeMergePolicy, previous versions used LogDocMergePolicy. LogByteSizeMergePolicy chooses segments to merge based on their size. The Lucene 2.2 default, LogDocMergePolicy chose when to merge based on number of documents Other implementations of MergePolicy must have a no-argument constructor --> <!--<mergePolicy>org.apache.lucene.index.LogByteSizeMergePolicy</mergePolicy>--> <!-- Expert: The Merge Scheduler in Lucene controls how merges are performed. The ConcurrentMergeScheduler (Lucene 2.3 default) can perform merges in the background using separate threads. The SerialMergeScheduler (Lucene 2.2 default) does not. --> <!--<mergeScheduler>org.apache.lucene.index.ConcurrentMergeScheduler</mergeScheduler>--> <!-- This option specifies which Lucene LockFactory implementation to use. single = SingleInstanceLockFactory - suggested for a read-only index or when there is no possibility of another process trying to modify the index. native = NativeFSLockFactory simple = SimpleFSLockFactory (For backwards compatibility with Solr 1.2, 'simple' is the default if not specified.) --> <lockType>single</lockType> </indexDefaults> <mainIndex> <!-- options specific to the main on-disk lucene index --> <useCompoundFile>false</useCompoundFile> <ramBufferSizeMB>256</ramBufferSizeMB> <mergeFactor>10</mergeFactor> <!-- Deprecated --> <!--<maxBufferedDocs>1000</maxBufferedDocs>--> <maxMergeDocs>2147483647</maxMergeDocs> <maxFieldLength>10000</maxFieldLength> <!-- If true, unlock any held write or commit locks on startup. This defeats the locking mechanism that allows multiple processes to safely access a lucene index, and should be used with care. This is not needed if lock type is 'none' or 'single' --> <unlockOnStartup>false</unlockOnStartup> </mainIndex> <!-- Enables JMX if and only if an existing MBeanServer is found, use this if you want to configure JMX through JVM parameters. Remove this to disable exposing Solr configuration and statistics to JMX. If you want to connect to a particular server, specify the agentId e.g. <jmx agentId="myAgent" /> If you want to start a new MBeanServer, specify the serviceUrl e.g <jmx serviceurl="service:jmx:rmi:///jndi/rmi://localhost:9999/solr" /> For more details see http://wiki.apache.org/solr/SolrJmx --> <jmx /> <!-- the default high-performance update handler --> <updateHandler class="solr.DirectUpdateHandler2"> <!-- A prefix of "solr." for class names is an alias that causes solr to search appropriate packages, including org.apache.solr.(search|update|request|core|analysis) --> <!-- Perform a <commit/> automatically under certain conditions: maxDocs - number of updates since last commit is greater than this maxTime - oldest uncommited update (in ms) is this long ago --> <!-- <autoCommit> <maxDocs>1000</maxDocs> </autoCommit> --> <!-- The RunExecutableListener executes an external command. exe - the name of the executable to run dir - dir to use as the current working directory. default="." wait - the calling thread waits until the executable returns. default="true" args - the arguments to pass to the program. default=nothing env - environment variables to set. default=nothing --> <!-- A postCommit event is fired after every commit or optimize command <listener event="postCommit" class="solr.RunExecutableListener"> <str name="exe">solr/bin/snapshooter</str> <str name="dir">.</str> <bool name="wait">true</bool> <arr name="args"> <str>arg1</str> <str>arg2</str> </arr> <arr name="env"> <str>MYVAR=val1</str> </arr> </listener> --> <!-- A postOptimize event is fired only after every optimize command, useful in conjunction with index distribution to only distribute optimized indicies <listener event="postOptimize" class="solr.RunExecutableListener"> <str name="exe">snapshooter</str> <str name="dir">solr/bin</str> <bool name="wait">true</bool> </listener> --> </updateHandler> <query> <!-- Maximum number of clauses in a boolean query... can affect range or prefix queries that expand to big boolean queries. An exception is thrown if exceeded. --> <maxBooleanClauses>1024</maxBooleanClauses> <!-- Cache used by SolrIndexSearcher for filters (DocSets), unordered sets of *all* documents that match a query. When a new searcher is opened, its caches may be prepopulated or "autowarmed" using data from caches in the old searcher. autowarmCount is the number of items to prepopulate. For LRUCache, the autowarmed items will be the most recently accessed items. Parameters: class - the SolrCache implementation (currently only LRUCache) size - the maximum number of entries in the cache initialSize - the initial capacity (number of entries) of the cache. (seel java.util.HashMap) autowarmCount - the number of entries to prepopulate from and old cache. --> <filterCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/> <!-- queryResultCache caches results of searches - ordered lists of document ids (DocList) based on a query, a sort, and the range of documents requested. --> <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/> <!-- documentCache caches Lucene Document objects (the stored fields for each document). Since Lucene internal document ids are transient, this cache will not be autowarmed. --> <documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/> <!-- If true, stored fields that are not requested will be loaded lazily. This can result in a significant speed improvement if the usual case is to not load all stored fields, especially if the skipped fields are large compressed text fields. --> <enableLazyFieldLoading>false</enableLazyFieldLoading> <!-- Example of a generic cache. These caches may be accessed by name through SolrIndexSearcher.getCache(),cacheLookup(), and cacheInsert(). The purpose is to enable easy caching of user/application level data. The regenerator argument should be specified as an implementation of solr.search.CacheRegenerator if autowarming is desired. --> <!-- <cache name="myUserCache" class="solr.LRUCache" size="4096" initialSize="1024" autowarmCount="1024" regenerator="org.mycompany.mypackage.MyRegenerator" /> --> <!-- An optimization that attempts to use a filter to satisfy a search. If the requested sort does not include score, then the filterCache will be checked for a filter matching the query. If found, the filter will be used as the source of document ids, and then the sort will be applied to that. <useFilterForSortedQuery>true</useFilterForSortedQuery> --> <!-- An optimization for use with the queryResultCache. When a search is requested, a superset of the requested number of document ids are collected. For example, if a search for a particular query requests matching documents 10 through 19, and queryWindowSize is 50, then documents 0 through 49 will be collected and cached. Any further requests in that range can be satisfied via the cache. --> <queryResultWindowSize>50</queryResultWindowSize> <!-- Maximum number of documents to cache for any entry in the queryResultCache. --> <queryResultMaxDocsCached>200</queryResultMaxDocsCached> <!-- This entry enables an int hash representation for filters (DocSets) when the number of items in the set is less than maxSize. For smaller sets, this representation is more memory efficient, more efficient to iterate over, and faster to take intersections. --> <HashDocSet maxSize="3000" loadFactor="0.75"/> <!-- a newSearcher event is fired whenever a new searcher is being prepared and there is a current searcher handling requests (aka registered). --> <!-- QuerySenderListener takes an array of NamedList and executes a local query request for each NamedList in sequence. --> <listener event="newSearcher" class="solr.QuerySenderListener"> <arr name="queries"> <lst> <str name="q">solr</str> <str name="start">0</str> <str name="rows">10</str> </lst> <lst> <str name="q">rocks</str> <str name="start">0</str> <str name="rows">10</str> </lst> <lst><str name="q">static newSearcher warming query from solrconfig.xml</str></lst> </arr> </listener> <!-- a firstSearcher event is fired whenever a new searcher is being prepared but there is no current registered searcher to handle requests or to gain autowarming data from. --> <listener event="firstSearcher" class="solr.QuerySenderListener"> <arr name="queries"> <lst> <str name="q">fast_warm</str> <str name="start">0</str> <str name="rows">10</str> </lst> <lst><str name="q">static firstSearcher warming query from solrconfig.xml</str></lst> </arr> </listener> <!-- If a search request comes in and there is no current registered searcher, then immediately register the still warming searcher and use it. If "false" then all requests will block until the first searcher is done warming. --> <useColdSearcher>true</useColdSearcher> <!-- Maximum number of searchers that may be warming in the background concurrently. An error is returned if this limit is exceeded. Recommend 1-2 for read-only slaves, higher for masters w/o cache warming. --> <maxWarmingSearchers>2</maxWarmingSearchers> </query> <!-- Let the dispatch filter handler /select?qt=XXX handleSelect=true will use consistent error handling for /select and /update handleSelect=false will use solr1.1 style error formatting --> <requestDispatcher handleSelect="true" > <!--Make sure your system has some authentication before enabling remote streaming! --> <requestParsers enableRemoteStreaming="false" multipartUploadLimitInKB="2048" /> <!-- Set HTTP caching related parameters (for proxy caches and clients). To get the behaviour of Solr 1.2 (ie: no caching related headers) use the never304="true" option and do not specify a value for <cacheControl> --> <!-- <httpCaching never304="true"> --> <httpCaching lastModifiedFrom="openTime" etagSeed="Solr"> <!-- lastModFrom="openTime" is the default, the Last-Modified value (and validation against If-Modified-Since requests) will all be relative to when the current Searcher was opened. You can change it to lastModFrom="dirLastMod" if you want the value to exactly corrispond to when the physical index was last modified. etagSeed="..." is an option you can change to force the ETag header (and validation against If-None-Match requests) to be differnet even if the index has not changed (ie: when making significant changes to your config file) lastModifiedFrom and etagSeed are both ignored if you use the never304="true" option. --> <!-- If you include a <cacheControl> directive, it will be used to generate a Cache-Control header, as well as an Expires header if the value contains "max-age=" By default, no Cache-Control header is generated. You can use the <cacheControl> option even if you have set never304="true" --> <!-- <cacheControl>max-age=30, public</cacheControl> --> </httpCaching> </requestDispatcher> <!-- requestHandler plugins... incoming queries will be dispatched to the correct handler based on the path or the qt (query type) param. Names starting with a '/' are accessed with the a path equal to the registered name. Names without a leading '/' are accessed with: http://host/app/select?qt=name If no qt is defined, the requestHandler that declares default="true" will be used. --> <requestHandler name="standard" class="solr.SearchHandler" default="true"> <!-- default values for query parameters --> <lst name="defaults"> <str name="echoParams">explicit</str> <!-- <int name="rows">10</int> <str name="fl">*</str> <str name="version">2.1</str> --> </lst> </requestHandler> <!-- DisMaxRequestHandler allows easy searching across multiple fields for simple user-entered phrases. It's implementation is now just the standard SearchHandler with a default query type of "dismax". see http://wiki.apache.org/solr/DisMaxRequestHandler --> <requestHandler name="dismax" class="solr.SearchHandler" > <lst name="defaults"> <str name="defType">dismax</str> <str name="echoParams">explicit</str> <float name="tie">0.01</float> <str name="qf"> text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4 </str> <str name="pf"> text^0.2 features^1.1 name^1.5 manu^1.4 manu_exact^1.9 </str> <str name="bf"> ord(popularity)^0.5 recip(rord(price),1,1000,1000)^0.3 </str> <str name="fl"> id,name,price,score </str> <str name="mm"> 2<-1 5<-2 6<90% </str> <int name="ps">100</int> <str name="q.alt">*:*</str> <!-- example highlighter config, enable per-query with hl=true --> <str name="hl.fl">text features name</str> <!-- for this field, we want no fragmenting, just highlighting --> <str name="f.name.hl.fragsize">0</str> <!-- instructs Solr to return the field itself if no query terms are found --> <str name="f.name.hl.alternateField">name</str> <str name="f.text.hl.fragmenter">regex</str> <!-- defined below --> </lst> </requestHandler> <!-- Note how you can register the same handler multiple times with different names (and different init parameters) --> <requestHandler name="partitioned" class="solr.SearchHandler" > <lst name="defaults"> <str name="defType">dismax</str> <str name="echoParams">explicit</str> <str name="qf">text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0</str> <str name="mm">2<-1 5<-2 6<90%</str> <!-- This is an example of using Date Math to specify a constantly moving date range in a config... --> <str name="bq">incubationdate_dt:[* TO NOW/DAY-1MONTH]^2.2</str> </lst> <!-- In addition to defaults, "appends" params can be specified to identify values which should be appended to the list of multi-val params from the query (or the existing "defaults"). In this example, the param "fq=instock:true" will be appended to any query time fq params the user may specify, as a mechanism for partitioning the index, independent of any user selected filtering that may also be desired (perhaps as a result of faceted searching). NOTE: there is *absolutely* nothing a client can do to prevent these "appends" values from being used, so don't use this mechanism unless you are sure you always want it. --> <lst name="appends"> <str name="fq">inStock:true</str> </lst> <!-- "invariants" are a way of letting the Solr maintainer lock down the options available to Solr clients. Any params values specified here are used regardless of what values may be specified in either the query, the "defaults", or the "appends" params. In this example, the facet.field and facet.query params are fixed, limiting the facets clients can use. Faceting is not turned on by default - but if the client does specify facet=true in the request, these are the only facets they will be able to see counts for; regardless of what other facet.field or facet.query params they may specify. NOTE: there is *absolutely* nothing a client can do to prevent these "invariants" values from being used, so don't use this mechanism unless you are sure you always want it. --> <lst name="invariants"> <str name="facet.field">cat</str> <str name="facet.field">manu_exact</str> <str name="facet.query">price:[* TO 500]</str> <str name="facet.query">price:[500 TO *]</str> </lst> </requestHandler> <!-- Search components are registered to SolrCore and used by Search Handlers By default, the following components are avaliable: <searchComponent name="query" class="org.apache.solr.handler.component.QueryComponent" /> <searchComponent name="facet" class="org.apache.solr.handler.component.FacetComponent" /> <searchComponent name="mlt" class="org.apache.solr.handler.component.MoreLikeThisComponent" /> <searchComponent name="highlight" class="org.apache.solr.handler.component.HighlightComponent" /> <searchComponent name="debug" class="org.apache.solr.handler.component.DebugComponent" /> Default configuration in a requestHandler would look like: <arr name="components"> <str>query</str> <str>facet</str> <str>mlt</str> <str>highlight</str> <str>debug</str> </arr> If you register a searchComponent to one of the standard names, that will be used instead. To insert handlers before or after the 'standard' components, use: <arr name="first-components"> <str>myFirstComponentName</str> </arr> <arr name="last-components"> <str>myLastComponentName</str> </arr> --> <!-- The spell check component can return a list of alternative spelling suggestions. --> <searchComponent name="spellcheck" class="solr.SpellCheckComponent"> <str name="queryAnalyzerFieldType">textSpell</str> <lst name="spellchecker"> <str name="name">default</str> <str name="field">spell</str> <str name="spellcheckIndexDir">./spellchecker1</str> </lst> <lst name="spellchecker"> <str name="name">jarowinkler</str> <str name="field">spell</str> <!-- Use a different Distance Measure --> <str name="distanceMeasure">org.apache.lucene.search.spell.JaroWinklerDistance</str> <str name="spellcheckIndexDir">./spellchecker2</str> </lst> <lst name="spellchecker"> <str name="classname">solr.FileBasedSpellChecker</str> <str name="name">file</str> <str name="sourceLocation">spellings.txt</str> <str name="characterEncoding">UTF-8</str> <str name="spellcheckIndexDir">./spellcheckerFile</str> </lst> </searchComponent> <!-- a request handler utilizing the spellcheck component --> <requestHandler name="/spellCheckCompRH" class="solr.SearchHandler"> <lst name="defaults"> <!-- omp = Only More Popular --> <str name="spellcheck.onlyMorePopular">false</str> <!-- exr = Extended Results --> <str name="spellcheck.extendedResults">false</str> <!-- The number of suggestions to return --> <str name="spellcheck.count">1</str> </lst> <arr name="last-components"> <str>spellcheck</str> </arr> </requestHandler> <!-- a search component that enables you to configure the top results for a given query regardless of the normal lucene scoring.--> <searchComponent name="elevator" class="solr.QueryElevationComponent" > <!-- pick a fieldType to analyze queries --> <str name="queryFieldType">string</str> <str name="config-file">elevate.xml</str> </searchComponent> <!-- a request handler utilizing the elevator component --> <requestHandler name="/elevate" class="solr.SearchHandler" startup="lazy"> <lst name="defaults"> <str name="echoParams">explicit</str> </lst> <arr name="last-components"> <str>elevator</str> </arr> </requestHandler> <!-- Update request handler. Note: Since solr1.1 requestHandlers requires a valid content type header if posted in the body. For example, curl now requires: -H 'Content-type:text/xml; charset=utf-8' The response format differs from solr1.1 formatting and returns a standard error code. To enable solr1.1 behavior, remove the /update handler or change its path --> <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" /> <!-- Analysis request handler. Since Solr 1.3. Use to returnhow a document is analyzed. Useful for debugging and as a token server for other types of applications --> <requestHandler name="/analysis" class="solr.AnalysisRequestHandler" /> <!-- CSV update handler, loaded on demand --> <requestHandler name="/update/csv" class="solr.CSVRequestHandler" startup="lazy" /> <!-- Admin Handlers - This will register all the standard admin RequestHandlers. Adding this single handler is equivolent to registering: <requestHandler name="/admin/luke" class="org.apache.solr.handler.admin.LukeRequestHandler" /> <requestHandler name="/admin/system" class="org.apache.solr.handler.admin.SystemInfoHandler" /> <requestHandler name="/admin/plugins" class="org.apache.solr.handler.admin.PluginInfoHandler" /> <requestHandler name="/admin/threads" class="org.apache.solr.handler.admin.ThreadDumpHandler" /> <requestHandler name="/admin/properties" class="org.apache.solr.handler.admin.PropertiesRequestHandler" /> <requestHandler name="/admin/file" class="org.apache.solr.handler.admin.ShowFileRequestHandler" > If you wish to hide files under ${solr.home}/conf, explicitly register the ShowFileRequestHandler using: <requestHandler name="/admin/file" class="org.apache.solr.handler.admin.ShowFileRequestHandler" > <lst name="invariants"> <str name="hidden">synonyms.txt</str> <str name="hidden">anotherfile.txt</str> </lst> </requestHandler> --> <requestHandler name="/admin/" class="org.apache.solr.handler.admin.AdminHandlers" /> <!-- ping/healthcheck --> <requestHandler name="/admin/ping" class="PingRequestHandler"> <lst name="defaults"> <str name="qt">standard</str> <str name="q">solrpingquery</str> <str name="echoParams">all</str> </lst> </requestHandler> <!-- Echo the request contents back to the client --> <requestHandler name="/debug/dump" class="solr.DumpRequestHandler" > <lst name="defaults"> <str name="echoParams">explicit</str> <!-- for all params (including the default etc) use: 'all' --> <str name="echoHandler">true</str> </lst> </requestHandler> <highlighting> <!-- Configure the standard fragmenter --> <!-- This could most likely be commented out in the "default" case --> <fragmenter name="gap" class="org.apache.solr.highlight.GapFragmenter" default="true"> <lst name="defaults"> <int name="hl.fragsize">100</int> </lst> </fragmenter> <!-- A regular-expression-based fragmenter (f.i., for sentence extraction) --> <fragmenter name="regex" class="org.apache.solr.highlight.RegexFragmenter"> <lst name="defaults"> <!-- slightly smaller fragsizes work better because of slop --> <int name="hl.fragsize">70</int> <!-- allow 50% slop on fragment sizes --> <float name="hl.regex.slop">0.5</float> <!-- a basic sentence pattern --> <str name="hl.regex.pattern">[-\w ,/\n\"']{20,200}</str> </lst> </fragmenter> <!-- Configure the standard formatter --> <formatter name="html" class="org.apache.solr.highlight.HtmlFormatter" default="true"> <lst name="defaults"> <str name="hl.simple.pre"><![CDATA[<em>]]></str> <str name="hl.simple.post"><![CDATA[</em>]]></str> </lst> </formatter> </highlighting> <!-- queryResponseWriter plugins... query responses will be written using the writer specified by the 'wt' request parameter matching the name of a registered writer. The "default" writer is the default and will be used if 'wt' is not specified in the request. XMLResponseWriter will be used if nothing is specified here. The json, python, and ruby writers are also available by default. <queryResponseWriter name="xml" class="org.apache.solr.request.XMLResponseWriter" default="true"/> <queryResponseWriter name="json" class="org.apache.solr.request.JSONResponseWriter"/> <queryResponseWriter name="python" class="org.apache.solr.request.PythonResponseWriter"/> <queryResponseWriter name="ruby" class="org.apache.solr.request.RubyResponseWriter"/> <queryResponseWriter name="php" class="org.apache.solr.request.PHPResponseWriter"/> <queryResponseWriter name="phps" class="org.apache.solr.request.PHPSerializedResponseWriter"/> <queryResponseWriter name="custom" class="com.example.MyResponseWriter"/> --> <!-- XSLT response writer transforms the XML output by any xslt file found in Solr's conf/xslt directory. Changes to xslt files are checked for every xsltCacheLifetimeSeconds. --> <queryResponseWriter name="xslt" class="org.apache.solr.request.XSLTResponseWriter"> <int name="xsltCacheLifetimeSeconds">5</int> </queryResponseWriter> <!-- example of registering a query parser <queryParser name="lucene" class="org.apache.solr.search.LuceneQParserPlugin"/> --> <!-- example of registering a custom function parser <valueSourceParser name="myfunc" class="com.mycompany.MyValueSourceParser" /> --> <!-- config for the admin interface --> <admin> <defaultQuery>solr</defaultQuery> <!-- configure a healthcheck file for servers behind a loadbalancer <healthcheck type="file">server-enabled</healthcheck> --> </admin> </config> <?xml version="1.0" encoding="UTF-8" ?> <!-- Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. --> <config> <!-- Set this to 'false' if you want solr to continue working after it has encountered an severe configuration error. In a production environment, you may want solr to keep working even if one handler is mis-configured. You may also set this to false using by setting the system property: -Dsolr.abortOnConfigurationError=false --> <abortOnConfigurationError>${solr.abortOnConfigurationError:true}</abortOnConfigurationError> <!-- Used to specify an alternate directory to hold all index data other than the default ./data under the Solr home. If replication is in use, this should match the replication configuration. --> <dataDir>${solr.data.dir:/opt/solr/solr_slave/solr/data}</dataDir> <indexDefaults> <!-- Values here affect all index writers and act as a default unless overridden. --> <useCompoundFile>false</useCompoundFile> <!-- <mergeFactor>10</mergeFactor> --> <!-- If both ramBufferSizeMB and maxBufferedDocs is set, then Lucene will flush based on whichever limit is hit first. --> <!--<maxBufferedDocs>1000</maxBufferedDocs>--> <!-- Tell Lucene when to flush documents to disk. Giving Lucene more memory for indexing means faster indexing at the cost of more RAM If both ramBufferSizeMB and maxBufferedDocs is set, then Lucene will flush based on whichever limit is hit first. --> <ramBufferSizeMB>256</ramBufferSizeMB> <maxMergeDocs>2147483647</maxMergeDocs> <maxFieldLength>10000</maxFieldLength> <writeLockTimeout>1000</writeLockTimeout> <commitLockTimeout>10000</commitLockTimeout> <!-- Expert: Turn on Lucene's auto commit capability. This causes intermediate segment flushes to write a new lucene index descriptor, enabling it to be opened by an external IndexReader. NOTE: Despite the name, this value does not have any relation to Solr's autoCommit functionality --> <!--<luceneAutoCommit>false</luceneAutoCommit>--> <!-- Expert: The Merge Policy in Lucene controls how merging is handled by Lucene. The default in 2.3 is the LogByteSizeMergePolicy, previous versions used LogDocMergePolicy. LogByteSizeMergePolicy chooses segments to merge based on their size. The Lucene 2.2 default, LogDocMergePolicy chose when to merge based on number of documents Other implementations of MergePolicy must have a no-argument constructor --> <!--<mergePolicy>org.apache.lucene.index.LogByteSizeMergePolicy</mergePolicy>--> <!-- Expert: The Merge Scheduler in Lucene controls how merges are performed. The ConcurrentMergeScheduler (Lucene 2.3 default) can perform merges in the background using separate threads. The SerialMergeScheduler (Lucene 2.2 default) does not. --> <!--<mergeScheduler>org.apache.lucene.index.ConcurrentMergeScheduler</mergeScheduler>--> <!-- This option specifies which Lucene LockFactory implementation to use. single = SingleInstanceLockFactory - suggested for a read-only index or when there is no possibility of another process trying to modify the index. native = NativeFSLockFactory simple = SimpleFSLockFactory (For backwards compatibility with Solr 1.2, 'simple' is the default if not specified.) --> <lockType>single</lockType> </indexDefaults> <mainIndex> <!-- options specific to the main on-disk lucene index --> <useCompoundFile>false</useCompoundFile> <ramBufferSizeMB>256</ramBufferSizeMB> <mergeFactor>10</mergeFactor> <!-- Deprecated --> <!--<maxBufferedDocs>1000</maxBufferedDocs>--> <maxMergeDocs>2147483647</maxMergeDocs> <maxFieldLength>10000</maxFieldLength> <!-- If true, unlock any held write or commit locks on startup. This defeats the locking mechanism that allows multiple processes to safely access a lucene index, and should be used with care. This is not needed if lock type is 'none' or 'single' --> <unlockOnStartup>false</unlockOnStartup> </mainIndex> <!-- Enables JMX if and only if an existing MBeanServer is found, use this if you want to configure JMX through JVM parameters. Remove this to disable exposing Solr configuration and statistics to JMX. If you want to connect to a particular server, specify the agentId e.g. <jmx agentId="myAgent" /> If you want to start a new MBeanServer, specify the serviceUrl e.g <jmx serviceurl="service:jmx:rmi:///jndi/rmi://localhost:9999/solr" /> For more details see http://wiki.apache.org/solr/SolrJmx --> <jmx /> <!-- the default high-performance update handler --> <updateHandler class="solr.DirectUpdateHandler2"> <!-- A prefix of "solr." for class names is an alias that causes solr to search appropriate packages, including org.apache.solr.(search|update|request|core|analysis) --> <!-- Perform a <commit/> automatically under certain conditions: maxDocs - number of updates since last commit is greater than this maxTime - oldest uncommited update (in ms) is this long ago --> <!-- <autoCommit> <maxDocs>1000</maxDocs> </autoCommit> --> <!-- The RunExecutableListener executes an external command. exe - the name of the executable to run dir - dir to use as the current working directory. default="." wait - the calling thread waits until the executable returns. default="true" args - the arguments to pass to the program. default=nothing env - environment variables to set. default=nothing --> <!-- A postCommit event is fired after every commit or optimize command <listener event="postCommit" class="solr.RunExecutableListener"> <str name="exe">solr/bin/snapshooter</str> <str name="dir">.</str> <bool name="wait">true</bool> <arr name="args"> <str>arg1</str> <str>arg2</str> </arr> <arr name="env"> <str>MYVAR=val1</str> </arr> </listener> --> <!-- A postOptimize event is fired only after every optimize command, useful in conjunction with index distribution to only distribute optimized indicies <listener event="postOptimize" class="solr.RunExecutableListener"> <str name="exe">snapshooter</str> <str name="dir">solr/bin</str> <bool name="wait">true</bool> </listener> --> </updateHandler> <query> <!-- Maximum number of clauses in a boolean query... can affect range or prefix queries that expand to big boolean queries. An exception is thrown if exceeded. --> <maxBooleanClauses>1024</maxBooleanClauses> <!-- Cache used by SolrIndexSearcher for filters (DocSets), unordered sets of *all* documents that match a query. When a new searcher is opened, its caches may be prepopulated or "autowarmed" using data from caches in the old searcher. autowarmCount is the number of items to prepopulate. For LRUCache, the autowarmed items will be the most recently accessed items. Parameters: class - the SolrCache implementation (currently only LRUCache) size - the maximum number of entries in the cache initialSize - the initial capacity (number of entries) of the cache. (seel java.util.HashMap) autowarmCount - the number of entries to prepopulate from and old cache. --> <filterCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/> <!-- queryResultCache caches results of searches - ordered lists of document ids (DocList) based on a query, a sort, and the range of documents requested. --> <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/> <!-- documentCache caches Lucene Document objects (the stored fields for each document). Since Lucene internal document ids are transient, this cache will not be autowarmed. --> <documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/> <!-- If true, stored fields that are not requested will be loaded lazily. This can result in a significant speed improvement if the usual case is to not load all stored fields, especially if the skipped fields are large compressed text fields. --> <enableLazyFieldLoading>false</enableLazyFieldLoading> <!-- Example of a generic cache. These caches may be accessed by name through SolrIndexSearcher.getCache(),cacheLookup(), and cacheInsert(). The purpose is to enable easy caching of user/application level data. The regenerator argument should be specified as an implementation of solr.search.CacheRegenerator if autowarming is desired. --> <!-- <cache name="myUserCache" class="solr.LRUCache" size="4096" initialSize="1024" autowarmCount="1024" regenerator="org.mycompany.mypackage.MyRegenerator" /> --> <!-- An optimization that attempts to use a filter to satisfy a search. If the requested sort does not include score, then the filterCache will be checked for a filter matching the query. If found, the filter will be used as the source of document ids, and then the sort will be applied to that. <useFilterForSortedQuery>true</useFilterForSortedQuery> --> <!-- An optimization for use with the queryResultCache. When a search is requested, a superset of the requested number of document ids are collected. For example, if a search for a particular query requests matching documents 10 through 19, and queryWindowSize is 50, then documents 0 through 49 will be collected and cached. Any further requests in that range can be satisfied via the cache. --> <queryResultWindowSize>50</queryResultWindowSize> <!-- Maximum number of documents to cache for any entry in the queryResultCache. --> <queryResultMaxDocsCached>200</queryResultMaxDocsCached> <!-- This entry enables an int hash representation for filters (DocSets) when the number of items in the set is less than maxSize. For smaller sets, this representation is more memory efficient, more efficient to iterate over, and faster to take intersections. --> <HashDocSet maxSize="3000" loadFactor="0.75"/> <!-- a newSearcher event is fired whenever a new searcher is being prepared and there is a current searcher handling requests (aka registered). --> <!-- QuerySenderListener takes an array of NamedList and executes a local query request for each NamedList in sequence. --> <listener event="newSearcher" class="solr.QuerySenderListener"> <arr name="queries"> <lst> <str name="q">solr</str> <str name="start">0</str> <str name="rows">10</str> </lst> <lst> <str name="q">rocks</str> <str name="start">0</str> <str name="rows">10</str> </lst> <lst><str name="q">static newSearcher warming query from solrconfig.xml</str></lst> </arr> </listener> <!-- a firstSearcher event is fired whenever a new searcher is being prepared but there is no current registered searcher to handle requests or to gain autowarming data from. --> <listener event="firstSearcher" class="solr.QuerySenderListener"> <arr name="queries"> <lst> <str name="q">fast_warm</str> <str name="start">0</str> <str name="rows">10</str> </lst> <lst><str name="q">static firstSearcher warming query from solrconfig.xml</str></lst> </arr> </listener> <!-- If a search request comes in and there is no current registered searcher, then immediately register the still warming searcher and use it. If "false" then all requests will block until the first searcher is done warming. --> <useColdSearcher>true</useColdSearcher> <!-- Maximum number of searchers that may be warming in the background concurrently. An error is returned if this limit is exceeded. Recommend 1-2 for read-only slaves, higher for masters w/o cache warming. --> <maxWarmingSearchers>2</maxWarmingSearchers> </query> <!-- Let the dispatch filter handler /select?qt=XXX handleSelect=true will use consistent error handling for /select and /update handleSelect=false will use solr1.1 style error formatting --> <requestDispatcher handleSelect="true" > <!--Make sure your system has some authentication before enabling remote streaming! --> <requestParsers enableRemoteStreaming="false" multipartUploadLimitInKB="2048" /> <!-- Set HTTP caching related parameters (for proxy caches and clients). To get the behaviour of Solr 1.2 (ie: no caching related headers) use the never304="true" option and do not specify a value for <cacheControl> --> <!-- <httpCaching never304="true"> --> <httpCaching lastModifiedFrom="openTime" etagSeed="Solr"> <!-- lastModFrom="openTime" is the default, the Last-Modified value (and validation against If-Modified-Since requests) will all be relative to when the current Searcher was opened. You can change it to lastModFrom="dirLastMod" if you want the value to exactly corrispond to when the physical index was last modified. etagSeed="..." is an option you can change to force the ETag header (and validation against If-None-Match requests) to be differnet even if the index has not changed (ie: when making significant changes to your config file) lastModifiedFrom and etagSeed are both ignored if you use the never304="true" option. --> <!-- If you include a <cacheControl> directive, it will be used to generate a Cache-Control header, as well as an Expires header if the value contains "max-age=" By default, no Cache-Control header is generated. You can use the <cacheControl> option even if you have set never304="true" --> <!-- <cacheControl>max-age=30, public</cacheControl> --> </httpCaching> </requestDispatcher> <!-- requestHandler plugins... incoming queries will be dispatched to the correct handler based on the path or the qt (query type) param. Names starting with a '/' are accessed with the a path equal to the registered name. Names without a leading '/' are accessed with: http://host/app/select?qt=name If no qt is defined, the requestHandler that declares default="true" will be used. --> <requestHandler name="standard" class="solr.SearchHandler" default="true"> <!-- default values for query parameters --> <lst name="defaults"> <str name="echoParams">explicit</str> <!-- <int name="rows">10</int> <str name="fl">*</str> <str name="version">2.1</str> --> </lst> </requestHandler> <!-- DisMaxRequestHandler allows easy searching across multiple fields for simple user-entered phrases. It's implementation is now just the standard SearchHandler with a default query type of "dismax". see http://wiki.apache.org/solr/DisMaxRequestHandler --> <requestHandler name="dismax" class="solr.SearchHandler" > <lst name="defaults"> <str name="defType">dismax</str> <str name="echoParams">explicit</str> <float name="tie">0.01</float> <str name="qf"> text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4 </str> <str name="pf"> text^0.2 features^1.1 name^1.5 manu^1.4 manu_exact^1.9 </str> <str name="bf"> ord(popularity)^0.5 recip(rord(price),1,1000,1000)^0.3 </str> <str name="fl"> id,name,price,score </str> <str name="mm"> 2<-1 5<-2 6<90% </str> <int name="ps">100</int> <str name="q.alt">*:*</str> <!-- example highlighter config, enable per-query with hl=true --> <str name="hl.fl">text features name</str> <!-- for this field, we want no fragmenting, just highlighting --> <str name="f.name.hl.fragsize">0</str> <!-- instructs Solr to return the field itself if no query terms are found --> <str name="f.name.hl.alternateField">name</str> <str name="f.text.hl.fragmenter">regex</str> <!-- defined below --> </lst> </requestHandler> <!-- Note how you can register the same handler multiple times with different names (and different init parameters) --> <requestHandler name="partitioned" class="solr.SearchHandler" > <lst name="defaults"> <str name="defType">dismax</str> <str name="echoParams">explicit</str> <str name="qf">text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0</str> <str name="mm">2<-1 5<-2 6<90%</str> <!-- This is an example of using Date Math to specify a constantly moving date range in a config... --> <str name="bq">incubationdate_dt:[* TO NOW/DAY-1MONTH]^2.2</str> </lst> <!-- In addition to defaults, "appends" params can be specified to identify values which should be appended to the list of multi-val params from the query (or the existing "defaults"). In this example, the param "fq=instock:true" will be appended to any query time fq params the user may specify, as a mechanism for partitioning the index, independent of any user selected filtering that may also be desired (perhaps as a result of faceted searching). NOTE: there is *absolutely* nothing a client can do to prevent these "appends" values from being used, so don't use this mechanism unless you are sure you always want it. --> <lst name="appends"> <str name="fq">inStock:true</str> </lst> <!-- "invariants" are a way of letting the Solr maintainer lock down the options available to Solr clients. Any params values specified here are used regardless of what values may be specified in either the query, the "defaults", or the "appends" params. In this example, the facet.field and facet.query params are fixed, limiting the facets clients can use. Faceting is not turned on by default - but if the client does specify facet=true in the request, these are the only facets they will be able to see counts for; regardless of what other facet.field or facet.query params they may specify. NOTE: there is *absolutely* nothing a client can do to prevent these "invariants" values from being used, so don't use this mechanism unless you are sure you always want it. --> <lst name="invariants"> <str name="facet.field">cat</str> <str name="facet.field">manu_exact</str> <str name="facet.query">price:[* TO 500]</str> <str name="facet.query">price:[500 TO *]</str> </lst> </requestHandler> <!-- Search components are registered to SolrCore and used by Search Handlers By default, the following components are avaliable: <searchComponent name="query" class="org.apache.solr.handler.component.QueryComponent" /> <searchComponent name="facet" class="org.apache.solr.handler.component.FacetComponent" /> <searchComponent name="mlt" class="org.apache.solr.handler.component.MoreLikeThisComponent" /> <searchComponent name="highlight" class="org.apache.solr.handler.component.HighlightComponent" /> <searchComponent name="debug" class="org.apache.solr.handler.component.DebugComponent" /> Default configuration in a requestHandler would look like: <arr name="components"> <str>query</str> <str>facet</str> <str>mlt</str> <str>highlight</str> <str>debug</str> </arr> If you register a searchComponent to one of the standard names, that will be used instead. To insert handlers before or after the 'standard' components, use: <arr name="first-components"> <str>myFirstComponentName</str> </arr> <arr name="last-components"> <str>myLastComponentName</str> </arr> --> <!-- The spell check component can return a list of alternative spelling suggestions. --> <searchComponent name="spellcheck" class="solr.SpellCheckComponent"> <str name="queryAnalyzerFieldType">textSpell</str> <lst name="spellchecker"> <str name="name">default</str> <str name="field">spell</str> <str name="spellcheckIndexDir">./spellchecker1</str> </lst> <lst name="spellchecker"> <str name="name">jarowinkler</str> <str name="field">spell</str> <!-- Use a different Distance Measure --> <str name="distanceMeasure">org.apache.lucene.search.spell.JaroWinklerDistance</str> <str name="spellcheckIndexDir">./spellchecker2</str> </lst> <lst name="spellchecker"> <str name="classname">solr.FileBasedSpellChecker</str> <str name="name">file</str> <str name="sourceLocation">spellings.txt</str> <str name="characterEncoding">UTF-8</str> <str name="spellcheckIndexDir">./spellcheckerFile</str> </lst> </searchComponent> <!-- a request handler utilizing the spellcheck component --> <requestHandler name="/spellCheckCompRH" class="solr.SearchHandler"> <lst name="defaults"> <!-- omp = Only More Popular --> <str name="spellcheck.onlyMorePopular">false</str> <!-- exr = Extended Results --> <str name="spellcheck.extendedResults">false</str> <!-- The number of suggestions to return --> <str name="spellcheck.count">1</str> </lst> <arr name="last-components"> <str>spellcheck</str> </arr> </requestHandler> <!-- a search component that enables you to configure the top results for a given query regardless of the normal lucene scoring.--> <searchComponent name="elevator" class="solr.QueryElevationComponent" > <!-- pick a fieldType to analyze queries --> <str name="queryFieldType">string</str> <str name="config-file">elevate.xml</str> </searchComponent> <!-- a request handler utilizing the elevator component --> <requestHandler name="/elevate" class="solr.SearchHandler" startup="lazy"> <lst name="defaults"> <str name="echoParams">explicit</str> </lst> <arr name="last-components"> <str>elevator</str> </arr> </requestHandler> <!-- Update request handler. Note: Since solr1.1 requestHandlers requires a valid content type header if posted in the body. For example, curl now requires: -H 'Content-type:text/xml; charset=utf-8' The response format differs from solr1.1 formatting and returns a standard error code. To enable solr1.1 behavior, remove the /update handler or change its path --> <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" /> <!-- Analysis request handler. Since Solr 1.3. Use to returnhow a document is analyzed. Useful for debugging and as a token server for other types of applications --> <requestHandler name="/analysis" class="solr.AnalysisRequestHandler" /> <!-- CSV update handler, loaded on demand --> <requestHandler name="/update/csv" class="solr.CSVRequestHandler" startup="lazy" /> <!-- Admin Handlers - This will register all the standard admin RequestHandlers. Adding this single handler is equivolent to registering: <requestHandler name="/admin/luke" class="org.apache.solr.handler.admin.LukeRequestHandler" /> <requestHandler name="/admin/system" class="org.apache.solr.handler.admin.SystemInfoHandler" /> <requestHandler name="/admin/plugins" class="org.apache.solr.handler.admin.PluginInfoHandler" /> <requestHandler name="/admin/threads" class="org.apache.solr.handler.admin.ThreadDumpHandler" /> <requestHandler name="/admin/properties" class="org.apache.solr.handler.admin.PropertiesRequestHandler" /> <requestHandler name="/admin/file" class="org.apache.solr.handler.admin.ShowFileRequestHandler" > If you wish to hide files under ${solr.home}/conf, explicitly register the ShowFileRequestHandler using: <requestHandler name="/admin/file" class="org.apache.solr.handler.admin.ShowFileRequestHandler" > <lst name="invariants"> <str name="hidden">synonyms.txt</str> <str name="hidden">anotherfile.txt</str> </lst> </requestHandler> --> <requestHandler name="/admin/" class="org.apache.solr.handler.admin.AdminHandlers" /> <!-- ping/healthcheck --> <requestHandler name="/admin/ping" class="PingRequestHandler"> <lst name="defaults"> <str name="qt">standard</str> <str name="q">solrpingquery</str> <str name="echoParams">all</str> </lst> </requestHandler> <!-- Echo the request contents back to the client --> <requestHandler name="/debug/dump" class="solr.DumpRequestHandler" > <lst name="defaults"> <str name="echoParams">explicit</str> <!-- for all params (including the default etc) use: 'all' --> <str name="echoHandler">true</str> </lst> </requestHandler> <highlighting> <!-- Configure the standard fragmenter --> <!-- This could most likely be commented out in the "default" case --> <fragmenter name="gap" class="org.apache.solr.highlight.GapFragmenter" default="true"> <lst name="defaults"> <int name="hl.fragsize">100</int> </lst> </fragmenter> <!-- A regular-expression-based fragmenter (f.i., for sentence extraction) --> <fragmenter name="regex" class="org.apache.solr.highlight.RegexFragmenter"> <lst name="defaults"> <!-- slightly smaller fragsizes work better because of slop --> <int name="hl.fragsize">70</int> <!-- allow 50% slop on fragment sizes --> <float name="hl.regex.slop">0.5</float> <!-- a basic sentence pattern --> <str name="hl.regex.pattern">[-\w ,/\n\"']{20,200}</str> </lst> </fragmenter> <!-- Configure the standard formatter --> <formatter name="html" class="org.apache.solr.highlight.HtmlFormatter" default="true"> <lst name="defaults"> <str name="hl.simple.pre"><![CDATA[<em>]]></str> <str name="hl.simple.post"><![CDATA[</em>]]></str> </lst> </formatter> </highlighting> <!-- queryResponseWriter plugins... query responses will be written using the writer specified by the 'wt' request parameter matching the name of a registered writer. The "default" writer is the default and will be used if 'wt' is not specified in the request. XMLResponseWriter will be used if nothing is specified here. The json, python, and ruby writers are also available by default. <queryResponseWriter name="xml" class="org.apache.solr.request.XMLResponseWriter" default="true"/> <queryResponseWriter name="json" class="org.apache.solr.request.JSONResponseWriter"/> <queryResponseWriter name="python" class="org.apache.solr.request.PythonResponseWriter"/> <queryResponseWriter name="ruby" class="org.apache.solr.request.RubyResponseWriter"/> <queryResponseWriter name="php" class="org.apache.solr.request.PHPResponseWriter"/> <queryResponseWriter name="phps" class="org.apache.solr.request.PHPSerializedResponseWriter"/> <queryResponseWriter name="custom" class="com.example.MyResponseWriter"/> --> <!-- XSLT response writer transforms the XML output by any xslt file found in Solr's conf/xslt directory. Changes to xslt files are checked for every xsltCacheLifetimeSeconds. --> <queryResponseWriter name="xslt" class="org.apache.solr.request.XSLTResponseWriter"> <int name="xsltCacheLifetimeSeconds">5</int> </queryResponseWriter> <!-- example of registering a query parser <queryParser name="lucene" class="org.apache.solr.search.LuceneQParserPlugin"/> --> <!-- example of registering a custom function parser <valueSourceParser name="myfunc" class="com.mycompany.MyValueSourceParser" /> --> <!-- config for the admin interface --> <admin> <defaultQuery>solr</defaultQuery> <!-- configure a healthcheck file for servers behind a loadbalancer <healthcheck type="file">server-enabled</healthcheck> --> </admin> </config> <?xml version="1.0" encoding="UTF-8" ?> <!-- Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. --> <config> <!-- Set this to 'false' if you want solr to continue working after it has encountered an severe configuration error. In a production environment, you may want solr to keep working even if one handler is mis-configured. You may also set this to false using by setting the system property: -Dsolr.abortOnConfigurationError=false --> <abortOnConfigurationError>${solr.abortOnConfigurationError:true}</abortOnConfigurationError> <!-- Used to specify an alternate directory to hold all index data other than the default ./data under the Solr home. If replication is in use, this should match the replication configuration. --> <dataDir>${solr.data.dir:/opt/solr/solr_master/solr/data}</dataDir> <indexDefaults> <!-- Values here affect all index writers and act as a default unless overridden. --> <useCompoundFile>false</useCompoundFile> <mergeFactor>10</mergeFactor> <!-- If both ramBufferSizeMB and maxBufferedDocs is set, then Lucene will flush based on whichever limit is hit first. --> <!--<maxBufferedDocs>1000</maxBufferedDocs>--> <!-- Tell Lucene when to flush documents to disk. Giving Lucene more memory for indexing means faster indexing at the cost of more RAM If both ramBufferSizeMB and maxBufferedDocs is set, then Lucene will flush based on whichever limit is hit first. --> <ramBufferSizeMB>32</ramBufferSizeMB> <maxMergeDocs>2147483647</maxMergeDocs> <maxFieldLength>10000</maxFieldLength> <writeLockTimeout>1000</writeLockTimeout> <commitLockTimeout>10000</commitLockTimeout> <!-- Expert: Turn on Lucene's auto commit capability. This causes intermediate segment flushes to write a new lucene index descriptor, enabling it to be opened by an external IndexReader. NOTE: Despite the name, this value does not have any relation to Solr's autoCommit functionality --> <!--<luceneAutoCommit>false</luceneAutoCommit>--> <!-- Expert: The Merge Policy in Lucene controls how merging is handled by Lucene. The default in 2.3 is the LogByteSizeMergePolicy, previous versions used LogDocMergePolicy. LogByteSizeMergePolicy chooses segments to merge based on their size. The Lucene 2.2 default, LogDocMergePolicy chose when to merge based on number of documents Other implementations of MergePolicy must have a no-argument constructor --> <!--<mergePolicy>org.apache.lucene.index.LogByteSizeMergePolicy</mergePolicy>--> <!-- Expert: The Merge Scheduler in Lucene controls how merges are performed. The ConcurrentMergeScheduler (Lucene 2.3 default) can perform merges in the background using separate threads. The SerialMergeScheduler (Lucene 2.2 default) does not. --> <!--<mergeScheduler>org.apache.lucene.index.ConcurrentMergeScheduler</mergeScheduler>--> <!-- This option specifies which Lucene LockFactory implementation to use. single = SingleInstanceLockFactory - suggested for a read-only index or when there is no possibility of another process trying to modify the index. native = NativeFSLockFactory simple = SimpleFSLockFactory (For backwards compatibility with Solr 1.2, 'simple' is the default if not specified.) --> <lockType>single</lockType> </indexDefaults> <mainIndex> <!-- options specific to the main on-disk lucene index --> <useCompoundFile>false</useCompoundFile> <ramBufferSizeMB>32</ramBufferSizeMB> <mergeFactor>10</mergeFactor> <!-- Deprecated --> <!--<maxBufferedDocs>1000</maxBufferedDocs>--> <maxMergeDocs>2147483647</maxMergeDocs> <maxFieldLength>10000</maxFieldLength> <!-- If true, unlock any held write or commit locks on startup. This defeats the locking mechanism that allows multiple processes to safely access a lucene index, and should be used with care. This is not needed if lock type is 'none' or 'single' --> <unlockOnStartup>false</unlockOnStartup> </mainIndex> <!-- Enables JMX if and only if an existing MBeanServer is found, use this if you want to configure JMX through JVM parameters. Remove this to disable exposing Solr configuration and statistics to JMX. If you want to connect to a particular server, specify the agentId e.g. <jmx agentId="myAgent" /> If you want to start a new MBeanServer, specify the serviceUrl e.g <jmx serviceurl="service:jmx:rmi:///jndi/rmi://localhost:9999/solr" /> For more details see http://wiki.apache.org/solr/SolrJmx --> <jmx /> <!-- the default high-performance update handler --> <updateHandler class="solr.DirectUpdateHandler2"> <!-- A prefix of "solr." for class names is an alias that causes solr to search appropriate packages, including org.apache.solr.(search|update|request|core|analysis) --> <!-- Perform a <commit/> automatically under certain conditions: maxDocs - number of updates since last commit is greater than this maxTime - oldest uncommited update (in ms) is this long ago --> <autoCommit> <maxDocs>50</maxDocs> <maxTime>300000</maxTime> </autoCommit> <!-- The RunExecutableListener executes an external command. exe - the name of the executable to run dir - dir to use as the current working directory. default="." wait - the calling thread waits until the executable returns. default="true" args - the arguments to pass to the program. default=nothing env - environment variables to set. default=nothing --> <!-- A postCommit event is fired after every commit or optimize command--> <!-- <listener event="postCommit" class="solr.RunExecutableListener"> <str name="exe">/opt/solr/solr_master/solr/solr/bin/snapshooter</str> <str name="dir">.</str> <bool name="wait">true</bool> <arr name="args"> <str>arg1</str> <str>arg2</str> </arr> <arr name="env"> <str>MYVAR=val1</str> </arr> </listener> --> <!-- A postOptimize event is fired only after every optimize command, useful in conjunction with index distribution to only distribute optimized indicies --> <listener event="postOptimize" class="solr.RunExecutableListener"> <str name="exe">/opt/solr/solr_master/solr/solr/bin/snapshooter</str> <str name="dir">.</str> <bool name="wait">true</bool> </listener> </updateHandler> <query> <!-- Maximum number of clauses in a boolean query... can affect range or prefix queries that expand to big boolean queries. An exception is thrown if exceeded. --> <maxBooleanClauses>1024</maxBooleanClauses> <!-- Cache used by SolrIndexSearcher for filters (DocSets), unordered sets of *all* documents that match a query. When a new searcher is opened, its caches may be prepopulated or "autowarmed" using data from caches in the old searcher. autowarmCount is the number of items to prepopulate. For LRUCache, the autowarmed items will be the most recently accessed items. Parameters: class - the SolrCache implementation (currently only LRUCache) size - the maximum number of entries in the cache initialSize - the initial capacity (number of entries) of the cache. (seel java.util.HashMap) autowarmCount - the number of entries to prepopulate from and old cache. --> <!--<filterCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="128"/> --> <!-- queryResultCache caches results of searches - ordered lists of document ids (DocList) based on a query, a sort, and the range of documents requested. --> <!-- <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="32"/> --> <!-- documentCache caches Lucene Document objects (the stored fields for each document). Since Lucene internal document ids are transient, this cache will not be autowarmed. --> <!-- <documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/> --> <!-- If true, stored fields that are not requested will be loaded lazily. This can result in a significant speed improvement if the usual case is to not load all stored fields, especially if the skipped fields are large compressed text fields. --> <enableLazyFieldLoading>true</enableLazyFieldLoading> <!-- Example of a generic cache. These caches may be accessed by name through SolrIndexSearcher.getCache(),cacheLookup(), and cacheInsert(). The purpose is to enable easy caching of user/application level data. The regenerator argument should be specified as an implementation of solr.search.CacheRegenerator if autowarming is desired. --> <!-- <cache name="myUserCache" class="solr.LRUCache" size="4096" initialSize="1024" autowarmCount="1024" regenerator="org.mycompany.mypackage.MyRegenerator" /> --> <!-- An optimization that attempts to use a filter to satisfy a search. If the requested sort does not include score, then the filterCache will be checked for a filter matching the query. If found, the filter will be used as the source of document ids, and then the sort will be applied to that. <useFilterForSortedQuery>true</useFilterForSortedQuery> --> <!-- An optimization for use with the queryResultCache. When a search is requested, a superset of the requested number of document ids are collected. For example, if a search for a particular query requests matching documents 10 through 19, and queryWindowSize is 50, then documents 0 through 49 will be collected and cached. Any further requests in that range can be satisfied via the cache. --> <queryResultWindowSize>50</queryResultWindowSize> <!-- Maximum number of documents to cache for any entry in the queryResultCache. --> <queryResultMaxDocsCached>200</queryResultMaxDocsCached> <!-- This entry enables an int hash representation for filters (DocSets) when the number of items in the set is less than maxSize. For smaller sets, this representation is more memory efficient, more efficient to iterate over, and faster to take intersections. --> <HashDocSet maxSize="3000" loadFactor="0.75"/> <!-- a newSearcher event is fired whenever a new searcher is being prepared and there is a current searcher handling requests (aka registered). --> <!-- QuerySenderListener takes an array of NamedList and executes a local query request for each NamedList in sequence. --> <listener event="newSearcher" class="solr.QuerySenderListener"> <arr name="queries"> <lst> <str name="q">solr</str> <str name="start">0</str> <str name="rows">10</str> </lst> <lst> <str name="q">rocks</str> <str name="start">0</str> <str name="rows">10</str> </lst> <lst><str name="q">static newSearcher warming query from solrconfig.xml</str></lst> </arr> </listener> <!-- a firstSearcher event is fired whenever a new searcher is being prepared but there is no current registered searcher to handle requests or to gain autowarming data from. --> <listener event="firstSearcher" class="solr.QuerySenderListener"> <arr name="queries"> <lst> <str name="q">fast_warm</str> <str name="start">0</str> <str name="rows">10</str> </lst> <lst><str name="q">static firstSearcher warming query from solrconfig.xml</str></lst> </arr> </listener> <!-- If a search request comes in and there is no current registered searcher, then immediately register the still warming searcher and use it. If "false" then all requests will block until the first searcher is done warming. --> <useColdSearcher>false</useColdSearcher> <!-- Maximum number of searchers that may be warming in the background concurrently. An error is returned if this limit is exceeded. Recommend 1-2 for read-only slaves, higher for masters w/o cache warming. --> <maxWarmingSearchers>2</maxWarmingSearchers> </query> <!-- Let the dispatch filter handler /select?qt=XXX handleSelect=true will use consistent error handling for /select and /update handleSelect=false will use solr1.1 style error formatting --> <requestDispatcher handleSelect="true" > <!--Make sure your system has some authentication before enabling remote streaming! --> <requestParsers enableRemoteStreaming="false" multipartUploadLimitInKB="2048" /> <!-- Set HTTP caching related parameters (for proxy caches and clients). To get the behaviour of Solr 1.2 (ie: no caching related headers) use the never304="true" option and do not specify a value for <cacheControl> --> <!-- <httpCaching never304="true"> --> <httpCaching lastModifiedFrom="openTime" etagSeed="Solr"> <!-- lastModFrom="openTime" is the default, the Last-Modified value (and validation against If-Modified-Since requests) will all be relative to when the current Searcher was opened. You can change it to lastModFrom="dirLastMod" if you want the value to exactly corrispond to when the physical index was last modified. etagSeed="..." is an option you can change to force the ETag header (and validation against If-None-Match requests) to be differnet even if the index has not changed (ie: when making significant changes to your config file) lastModifiedFrom and etagSeed are both ignored if you use the never304="true" option. --> <!-- If you include a <cacheControl> directive, it will be used to generate a Cache-Control header, as well as an Expires header if the value contains "max-age=" By default, no Cache-Control header is generated. You can use the <cacheControl> option even if you have set never304="true" --> <!-- <cacheControl>max-age=30, public</cacheControl> --> </httpCaching> </requestDispatcher> <!-- requestHandler plugins... incoming queries will be dispatched to the correct handler based on the path or the qt (query type) param. Names starting with a '/' are accessed with the a path equal to the registered name. Names without a leading '/' are accessed with: http://host/app/select?qt=name If no qt is defined, the requestHandler that declares default="true" will be used. --> <requestHandler name="standard" class="solr.SearchHandler" default="true"> <!-- default values for query parameters --> <lst name="defaults"> <str name="echoParams">explicit</str> <!-- <int name="rows">10</int> <str name="fl">*</str> <str name="version">2.1</str> --> </lst> </requestHandler> <!-- DisMaxRequestHandler allows easy searching across multiple fields for simple user-entered phrases. It's implementation is now just the standard SearchHandler with a default query type of "dismax". see http://wiki.apache.org/solr/DisMaxRequestHandler --> <requestHandler name="dismax" class="solr.SearchHandler" > <lst name="defaults"> <str name="defType">dismax</str> <str name="echoParams">explicit</str> <float name="tie">0.01</float> <str name="qf"> text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4 </str> <str name="pf"> text^0.2 features^1.1 name^1.5 manu^1.4 manu_exact^1.9 </str> <str name="bf"> ord(popularity)^0.5 recip(rord(price),1,1000,1000)^0.3 </str> <str name="fl"> id,name,price,score </str> <str name="mm"> 2<-1 5<-2 6<90% </str> <int name="ps">100</int> <str name="q.alt">*:*</str> <!-- example highlighter config, enable per-query with hl=true --> <str name="hl.fl">text features name</str> <!-- for this field, we want no fragmenting, just highlighting --> <str name="f.name.hl.fragsize">0</str> <!-- instructs Solr to return the field itself if no query terms are found --> <str name="f.name.hl.alternateField">name</str> <str name="f.text.hl.fragmenter">regex</str> <!-- defined below --> </lst> </requestHandler> <!-- Note how you can register the same handler multiple times with different names (and different init parameters) --> <requestHandler name="partitioned" class="solr.SearchHandler" > <lst name="defaults"> <str name="defType">dismax</str> <str name="echoParams">explicit</str> <str name="qf">text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0</str> <str name="mm">2<-1 5<-2 6<90%</str> <!-- This is an example of using Date Math to specify a constantly moving date range in a config... --> <str name="bq">incubationdate_dt:[* TO NOW/DAY-1MONTH]^2.2</str> </lst> <!-- In addition to defaults, "appends" params can be specified to identify values which should be appended to the list of multi-val params from the query (or the existing "defaults"). In this example, the param "fq=instock:true" will be appended to any query time fq params the user may specify, as a mechanism for partitioning the index, independent of any user selected filtering that may also be desired (perhaps as a result of faceted searching). NOTE: there is *absolutely* nothing a client can do to prevent these "appends" values from being used, so don't use this mechanism unless you are sure you always want it. --> <lst name="appends"> <str name="fq">inStock:true</str> </lst> <!-- "invariants" are a way of letting the Solr maintainer lock down the options available to Solr clients. Any params values specified here are used regardless of what values may be specified in either the query, the "defaults", or the "appends" params. In this example, the facet.field and facet.query params are fixed, limiting the facets clients can use. Faceting is not turned on by default - but if the client does specify facet=true in the request, these are the only facets they will be able to see counts for; regardless of what other facet.field or facet.query params they may specify. NOTE: there is *absolutely* nothing a client can do to prevent these "invariants" values from being used, so don't use this mechanism unless you are sure you always want it. --> <lst name="invariants"> <str name="facet.field">cat</str> <str name="facet.field">manu_exact</str> <str name="facet.query">price:[* TO 500]</str> <str name="facet.query">price:[500 TO *]</str> </lst> </requestHandler> <!-- Search components are registered to SolrCore and used by Search Handlers By default, the following components are avaliable: <searchComponent name="query" class="org.apache.solr.handler.component.QueryComponent" /> <searchComponent name="facet" class="org.apache.solr.handler.component.FacetComponent" /> <searchComponent name="mlt" class="org.apache.solr.handler.component.MoreLikeThisComponent" /> <searchComponent name="highlight" class="org.apache.solr.handler.component.HighlightComponent" /> <searchComponent name="debug" class="org.apache.solr.handler.component.DebugComponent" /> Default configuration in a requestHandler would look like: <arr name="components"> <str>query</str> <str>facet</str> <str>mlt</str> <str>highlight</str> <str>debug</str> </arr> If you register a searchComponent to one of the standard names, that will be used instead. To insert handlers before or after the 'standard' components, use: <arr name="first-components"> <str>myFirstComponentName</str> </arr> <arr name="last-components"> <str>myLastComponentName</str> </arr> --> <!-- The spell check component can return a list of alternative spelling suggestions. --> <searchComponent name="spellcheck" class="solr.SpellCheckComponent"> <str name="queryAnalyzerFieldType">textSpell</str> <lst name="spellchecker"> <str name="name">default</str> <str name="field">spell</str> <str name="spellcheckIndexDir">./spellchecker1</str> </lst> <lst name="spellchecker"> <str name="name">jarowinkler</str> <str name="field">spell</str> <!-- Use a different Distance Measure --> <str name="distanceMeasure">org.apache.lucene.search.spell.JaroWinklerDistance</str> <str name="spellcheckIndexDir">./spellchecker2</str> </lst> <lst name="spellchecker"> <str name="classname">solr.FileBasedSpellChecker</str> <str name="name">file</str> <str name="sourceLocation">spellings.txt</str> <str name="characterEncoding">UTF-8</str> <str name="spellcheckIndexDir">./spellcheckerFile</str> </lst> </searchComponent> <!-- a request handler utilizing the spellcheck component --> <requestHandler name="/spellCheckCompRH" class="solr.SearchHandler"> <lst name="defaults"> <!-- omp = Only More Popular --> <str name="spellcheck.onlyMorePopular">false</str> <!-- exr = Extended Results --> <str name="spellcheck.extendedResults">false</str> <!-- The number of suggestions to return --> <str name="spellcheck.count">1</str> </lst> <arr name="last-components"> <str>spellcheck</str> </arr> </requestHandler> <!-- a search component that enables you to configure the top results for a given query regardless of the normal lucene scoring.--> <searchComponent name="elevator" class="solr.QueryElevationComponent" > <!-- pick a fieldType to analyze queries --> <str name="queryFieldType">string</str> <str name="config-file">elevate.xml</str> </searchComponent> <!-- a request handler utilizing the elevator component --> <requestHandler name="/elevate" class="solr.SearchHandler" startup="lazy"> <lst name="defaults"> <str name="echoParams">explicit</str> </lst> <arr name="last-components"> <str>elevator</str> </arr> </requestHandler> <!-- Update request handler. Note: Since solr1.1 requestHandlers requires a valid content type header if posted in the body. For example, curl now requires: -H 'Content-type:text/xml; charset=utf-8' The response format differs from solr1.1 formatting and returns a standard error code. To enable solr1.1 behavior, remove the /update handler or change its path --> <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" /> <!-- Analysis request handler. Since Solr 1.3. Use to returnhow a document is analyzed. Useful for debugging and as a token server for other types of applications --> <requestHandler name="/analysis" class="solr.AnalysisRequestHandler" /> <!-- CSV update handler, loaded on demand --> <requestHandler name="/update/csv" class="solr.CSVRequestHandler" startup="lazy" /> <!-- Admin Handlers - This will register all the standard admin RequestHandlers. Adding this single handler is equivolent to registering: <requestHandler name="/admin/luke" class="org.apache.solr.handler.admin.LukeRequestHandler" /> <requestHandler name="/admin/system" class="org.apache.solr.handler.admin.SystemInfoHandler" /> <requestHandler name="/admin/plugins" class="org.apache.solr.handler.admin.PluginInfoHandler" /> <requestHandler name="/admin/threads" class="org.apache.solr.handler.admin.ThreadDumpHandler" /> <requestHandler name="/admin/properties" class="org.apache.solr.handler.admin.PropertiesRequestHandler" /> <requestHandler name="/admin/file" class="org.apache.solr.handler.admin.ShowFileRequestHandler" > If you wish to hide files under ${solr.home}/conf, explicitly register the ShowFileRequestHandler using: <requestHandler name="/admin/file" class="org.apache.solr.handler.admin.ShowFileRequestHandler" > <lst name="invariants"> <str name="hidden">synonyms.txt</str> <str name="hidden">anotherfile.txt</str> </lst> </requestHandler> --> <requestHandler name="/admin/" class="org.apache.solr.handler.admin.AdminHandlers" /> <!-- ping/healthcheck --> <requestHandler name="/admin/ping" class="PingRequestHandler"> <lst name="defaults"> <str name="qt">standard</str> <str name="q">solrpingquery</str> <str name="echoParams">all</str> </lst> </requestHandler> <!-- Echo the request contents back to the client --> <requestHandler name="/debug/dump" class="solr.DumpRequestHandler" > <lst name="defaults"> <str name="echoParams">explicit</str> <!-- for all params (including the default etc) use: 'all' --> <str name="echoHandler">true</str> </lst> </requestHandler> <highlighting> <!-- Configure the standard fragmenter --> <!-- This could most likely be commented out in the "default" case --> <fragmenter name="gap" class="org.apache.solr.highlight.GapFragmenter" default="true"> <lst name="defaults"> <int name="hl.fragsize">100</int> </lst> </fragmenter> <!-- A regular-expression-based fragmenter (f.i., for sentence extraction) --> <fragmenter name="regex" class="org.apache.solr.highlight.RegexFragmenter"> <lst name="defaults"> <!-- slightly smaller fragsizes work better because of slop --> <int name="hl.fragsize">70</int> <!-- allow 50% slop on fragment sizes --> <float name="hl.regex.slop">0.5</float> <!-- a basic sentence pattern --> <str name="hl.regex.pattern">[-\w ,/\n\"']{20,200}</str> </lst> </fragmenter> <!-- Configure the standard formatter --> <formatter name="html" class="org.apache.solr.highlight.HtmlFormatter" default="true"> <lst name="defaults"> <str name="hl.simple.pre"><![CDATA[<em>]]></str> <str name="hl.simple.post"><![CDATA[</em>]]></str> </lst> </formatter> </highlighting> <!-- queryResponseWriter plugins... query responses will be written using the writer specified by the 'wt' request parameter matching the name of a registered writer. The "default" writer is the default and will be used if 'wt' is not specified in the request. XMLResponseWriter will be used if nothing is specified here. The json, python, and ruby writers are also available by default. <queryResponseWriter name="xml" class="org.apache.solr.request.XMLResponseWriter" default="true"/> <queryResponseWriter name="json" class="org.apache.solr.request.JSONResponseWriter"/> <queryResponseWriter name="python" class="org.apache.solr.request.PythonResponseWriter"/> <queryResponseWriter name="ruby" class="org.apache.solr.request.RubyResponseWriter"/> <queryResponseWriter name="php" class="org.apache.solr.request.PHPResponseWriter"/> <queryResponseWriter name="phps" class="org.apache.solr.request.PHPSerializedResponseWriter"/> <queryResponseWriter name="custom" class="com.example.MyResponseWriter"/> --> <!-- XSLT response writer transforms the XML output by any xslt file found in Solr's conf/xslt directory. Changes to xslt files are checked for every xsltCacheLifetimeSeconds. --> <queryResponseWriter name="xslt" class="org.apache.solr.request.XSLTResponseWriter"> <int name="xsltCacheLifetimeSeconds">5</int> </queryResponseWriter> <!-- example of registering a query parser <queryParser name="lucene" class="org.apache.solr.search.LuceneQParserPlugin"/> --> <!-- example of registering a custom function parser <valueSourceParser name="myfunc" class="com.mycompany.MyValueSourceParser" /> --> <!-- config for the admin interface --> <admin> <defaultQuery>solr</defaultQuery> <!-- configure a healthcheck file for servers behind a loadbalancer <healthcheck type="file">server-enabled</healthcheck> --> </admin> </config> |
| Free embeddable forum powered by Nabble | Forum Help |