field queries seem slow
I took a look through my Solr logs this weekend and noticed that the longest queries were on particular fields, like "author:albert einstein". Is this result consistent with other setups out there? If not, is there a trick to make these go faster? I've read up on filter queries and use those when applicable, but they don't really solve all my problems.

If anybody wants to take a shot at it but needs to see my solrconfig, etc., just let me know.

Cheers,
Mike
|
|
Re: field queries seem slow
Hmmmm, are you sorting? And have your readers been reopened? Is the second query of that sort also slow? If the answer to this last question is "no", have you tried some autowarming queries?

Best
Erick

On Mon, Nov 2, 2009 at 4:34 PM, mike anderson <saidtherobot@...> wrote:
> I took a look through my Solr logs this weekend and noticed that the longest queries were on particular fields, like "author:albert einstein".
|
|
Re: field queries seem slow
This searches author:albert and (default text field): einstein. This may not be what you expect?

On Mon, Nov 2, 2009 at 2:30 PM, Erick Erickson <erickerickson@...> wrote:
> On Mon, Nov 2, 2009 at 4:34 PM, mike anderson <saidtherobot@...> wrote:
>> I took a look through my Solr logs this weekend and noticed that the longest queries were on particular fields, like "author:albert einstein".

--
Lance Norskog
goksron@...
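To illustrate the parsing point above: quoting the value as a phrase keeps both words in the named field. The following is a minimal sketch, not taken from the thread; the helper names `field_phrase_query` and `select_params` are hypothetical, and it only builds the query string (it does not contact a Solr server).

```python
from urllib.parse import urlencode


def field_phrase_query(field: str, phrase: str) -> str:
    """Return field:"phrase" so that every word is searched in `field`.

    Without the quotes, only the first word is tied to the field and the
    remaining words go to the default search field.
    """
    # Escape backslashes and double quotes so the phrase stays one quoted unit.
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def select_params(field: str, phrase: str, rows: int = 10) -> str:
    """URL-encode the query for use after .../solr/select? (hypothetical helper)."""
    return urlencode({"q": field_phrase_query(field, phrase), "rows": rows})


print(field_phrase_query("author", "albert einstein"))
print(select_params("author", "albert einstein"))
```

The encoded form has the same shape as the `q=abbrev_authors%3A%22Gallinger+S%22` parameter that appears later in the thread.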
|
|
Re: field queries seem slow
Erick, we are doing a sort by date first, and then by score. I'm not sure what you mean by readers.

Since we have nearly 6M authors attached to our 20M documents, I'm not sure that autowarming would help that much (especially since we have very little overlap in what users are searching for). But maybe it would?

Lance, I was just being a bit lazy. Thanks though.

-mike

On Mon, Nov 2, 2009 at 10:27 PM, Lance Norskog <goksron@...> wrote:
> This searches author:albert and (default text field): einstein. This may not be what you expect?
|
|
Re: field queries seem slow
By readers, I meant your searchers. Perhaps you were shutting down your servers?

The warming isn't to pre-load authors; it's to pre-populate, in particular, the sort fields, which are then kept in caches. There is considerable overhead in loading the sort field the first time you sort by it. So my question was really based on the chance that "over the weekend" corresponded to "the first queries after the server restarted" or "the first query after the underlying index searchers were (re)opened".

The real question comes down to whether the same form of query (i.e. searching for different values on the same fields with the same kind of sort) is slow all the time or just when things start up.

How fine is the resolution for your dates? Assuming that the sorting is the issue, if you are storing dates in the millisecond range, that's probably 20M dates that have to be loaded to sort. You might want to think about a coarser resolution if this has any relevance.

HTH
Erick

On Wed, Nov 4, 2009 at 1:54 PM, mike anderson <saidtherobot@...> wrote:
> I'm not sure what you mean by readers.
|
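As a rough illustration of the date-resolution point in the message above, the sketch below counts how many distinct sort values remain when timestamps are truncated to a coarser resolution. The data is made up (one hypothetical document every 90 minutes), and the function name `distinct_values` is an assumption for this example; it says nothing about how much memory Solr or Lucene would actually use.

```python
from datetime import datetime, timedelta

# strftime patterns used to truncate a timestamp to a given resolution
FORMATS = {
    "second": "%Y-%m-%dT%H:%M:%S",
    "day": "%Y-%m-%d",
    "month": "%Y-%m",
}


def distinct_values(timestamps, resolution):
    """Count distinct sort keys after truncating to `resolution`."""
    pattern = FORMATS[resolution]
    return len({ts.strftime(pattern) for ts in timestamps})


# Hypothetical data: 1000 documents, one every 90 minutes.
start = datetime(2009, 11, 2, 16, 34, 0)
stamps = [start + timedelta(minutes=90 * i) for i in range(1000)]

for resolution in FORMATS:
    print(resolution, distinct_values(stamps, resolution))
```

The fewer distinct values the sort field holds, the less there is to load the first time that field is sorted on, which is the motivation for the coarser resolution suggested above.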
|
Re: field queries seem slow
On production our servers are restarted very rarely (once a month). But this raises a question: what does it take to clear the cache? On my benchmarking platform I've been simply restarting the server as a method of starting fresh. Is there a cache file I could delete to make sure I'm getting unbiased results? Second, is there an internal cache for sort fields, separate from the caches for queries and filters whose settings are found in the solrconfig.xml file?

I did a test as you suggested to determine whether that type of query is always slow or just when it starts up; it seems that it is only slow when it starts up. However, it seems to be slow when it starts up both with and without sorting. (I'm still trying to figure out how to do good benchmarking with one independent variable, so it's possible that this result is inconsistent.)

For reference, my query looks like this (+/- sort field):

http://10.0.20.174:8986/solr/select?mlt=false&rows=10&shards=localhost:8986/solr,localhost:8986/solr,localhost:8986/solr&q=abbrev_authors%3A%22Gallinger+S%22

I like the suggestion on date resolution. We definitely don't need second accuracy (which it is now), and in fact I think we'll just start stamping documents with year/week and then sort by that.

Thanks for all your help!

Cheers,
Mike

On Wed, Nov 4, 2009 at 2:07 PM, Erick Erickson <erickerickson@...> wrote:
> How fine is the resolution for your dates?
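One way to separate "slow only at startup" from "always slow", as discussed in the message above, is to time the first run of a query apart from the runs that follow it. This is a minimal sketch under stated assumptions: `time_cold_and_warm` is a hypothetical helper, and `FakeSearcher` is a stand-in with a one-time loading cost, not a real Solr client. In practice `run_query` would issue the actual request.

```python
import time
from statistics import median


def time_cold_and_warm(run_query, warm_runs=5):
    """Return (first-call time, median time of the following calls) in seconds."""
    t0 = time.perf_counter()
    run_query()
    cold = time.perf_counter() - t0

    warm = []
    for _ in range(warm_runs):
        t0 = time.perf_counter()
        run_query()
        warm.append(time.perf_counter() - t0)
    return cold, median(warm)


class FakeSearcher:
    """Stand-in for a query: pays a one-time cost, like a cache being filled."""

    def __init__(self):
        self.loaded = False

    def __call__(self):
        if not self.loaded:
            time.sleep(0.05)  # simulated first-use loading cost
            self.loaded = True


cold, warm = time_cold_and_warm(FakeSearcher())
print(f"cold={cold:.4f}s warm={warm:.6f}s")
```

Running the same harness once with the sort parameter and once without, each after a fresh restart, keeps the sort as the only variable that changes between the two measurements.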
|
|
Re: field queries seem slow
Hi,

There is no way that I know of to clear Solr's caches (query, document, filter caches). FieldCache is a Lucene thing, and it's also something you can't clear, as far as I know.

Slowness on start could be due to:

* OS has not cached the index yet (would be the case if your Solr was down for a while and its index got displaced from the OS buffers)
* sort query run for the first time, FieldCache not populated yet
* expensive query run for the first time, its results and hits not cached in Solr caches
* ...

Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR

----- Original Message ----
> From: mike anderson <saidtherobot@...>
> To: solr-user@...
> Sent: Thu, November 5, 2009 11:34:59 AM
> Subject: Re: field queries seem slow
>
> On production our servers are restarted very rarely (once a month). But this raises a question, what does it take to clear the cache?
|
|
Re: field queries seem slow
Restarting Solr clears out all caching. Doing a commit used to drop all of the caches for new requests, but it no longer does this.

On Linux you can clear the kernel's disk buffer cache with a special hook. You echo '1' into a /proc/something and this tells the kernel to drop its caches. Sorry, don't remember the exact command.

On Thu, Nov 5, 2009 at 10:09 AM, Otis Gospodnetic <otis_gospodnetic@...> wrote:
> There is no way that I know to clear Solr's caches (query, document, filter caches).

--
Lance Norskog
goksron@...
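For the kernel hook mentioned in the last message, the Linux interface that matches the description is `/proc/sys/vm/drop_caches` (writing `1` drops the page cache, `2` drops dentries and inodes, `3` drops both). The thread itself does not name the path, so treat this as an editorial sketch: the function name `drop_page_cache` is hypothetical, the write requires root, and the `path` parameter exists only so the function can be tried against an ordinary file.

```python
import os

# Linux kernel interface for dropping clean caches (writing needs root).
DROP_CACHES = "/proc/sys/vm/drop_caches"


def drop_page_cache(path=DROP_CACHES):
    """Flush dirty pages, then ask the kernel to drop the page cache."""
    # Dirty pages cannot be dropped, so write them out to disk first.
    os.sync()
    with open(path, "w") as f:
        f.write("1\n")  # 1 = page cache only
```

Calling `drop_page_cache()` as root before restarting Solr on a benchmarking machine gives a cold OS cache as well as cold Solr caches, so the first query is measured against an index that really has to be read from disk.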