quick jruby + solr benchmarks

I'm starting to experiment with benchmarks/jruby + solr and just wanted to get this out there -- getting ready for a week vacation :)

In my solr-ruby 'refactoring' progress, I'm finding some interesting results and will try to post in the next few weeks.

This is jruby 1.1.4 and solr 1.3 (empty index) -- using the standard Ruby "Benchmark" library.

The script:

#

require 'java'
require 'benchmark'

solr_dist_root = File.expand_path(File.join(File.dirname(__FILE__), '..', 'apache-solr-1.3.0'))
solr_home = File.join(solr_dist_root, 'example', 'solr')

def require_jars(dir)
  jar_pattern = File.join(dir,"**", "*.jar")
  jar_files = Dir.glob(jar_pattern)
  jar_files.each {|jar_file| require jar_file}
end

def hash_to_params(hash_params)
  import org.apache.solr.common.params.ModifiableSolrParams
  query = ModifiableSolrParams.new
  query.instance_eval do
    alias _add add
    def add(field, values)
      _add(field.to_s, (values.is_a?(Array) ? values : [values]).to_java(:string))
    end
  end
  hash_params.each_pair do |k,v|
    query.add k, v
  end
  query
end

require_jars(File.join(solr_dist_root, "lib"))
require_jars(File.join(solr_dist_root, "dist"))

# HttpCommons
def http_commons
  @http_commons ||= (
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer
    import org.apache.solr.common.params.MapSolrParams
    solr = CommonsHttpSolrServer.new("http://localhost:8983/solr")
  )
end

# EmbeddedSolrServer
def embedded(solr_home)
  @embedded ||= (
    import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer
    import org.apache.solr.core.CoreContainer
    import org.apache.solr.core.CoreDescriptor
    import org.apache.solr.client.solrj.SolrQuery
    core_name = 'main-core'
    container = CoreContainer.new
    descriptor = CoreDescriptor.new(container, core_name, solr_home)
    core = container.create(descriptor)
    container.register(core_name, core, false)
    solr = EmbeddedSolrServer.new(container, core_name)
  )
end

query = {'qt' => 'standard', 'q'=>'ipod', 'facet.field' => 'cat'}
params = hash_to_params(query)

max = 1000

Benchmark.bm do |x|
  x.report 'http commons' do
    max.times do
      http_commons.query(params)
    end
  end
  x.report 'embedded' do
    max.times do
      embedded(solr_home).query(params)
    end
  end
end

# THE RESULTS

# http commons
#   4.634000   0.000000   4.634000 (  4.633849)
#   4.454000   0.000000   4.454000 (  4.453764)
#   3.908000   0.000000   3.908000 (  3.907367)

# embedded
#   2.152000   0.000000   2.152000 (  2.152226)
#   2.191000   0.000000   2.191000 (  2.191359)
#   2.083000   0.000000   2.083000 (  2.082696)
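One thing that may affect numbers like these on JRuby is JVM warm-up, since the first iterations can run before the hot paths are compiled. The thread does not measure this, so treat it as an assumption. A minimal sketch of one way to account for it is `Benchmark.bmbm`, from the same standard library, which runs a rehearsal pass before the timed pass. `fake_query` is a hypothetical stand-in for a Solr call so the snippet runs without Solr.

```ruby
require 'benchmark'

# Hypothetical stand-in for a Solr request; in the real script this would be
# http_commons.query(params) or embedded(solr_home).query(params).
def fake_query(term)
  term.reverse.upcase
end

max = 1000

# bmbm runs every report twice: a rehearsal pass, then the timed pass.
# It returns an array of Benchmark::Tms objects from the timed pass.
results = Benchmark.bmbm do |x|
  x.report('stand-in a') { max.times { fake_query('ipod') } }
  x.report('stand-in b') { max.times { fake_query('ipod nano') } }
end

results.each { |tms| puts format('%.6f s real', tms.real) }
```

Only the timed pass is returned, so the rehearsal numbers are printed but not included in `results`.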
|
|
Re: quick jruby + solr benchmarks

So about 2x? Not bad. I wonder what running httperf against a simple app would show.
|
|
Re: quick jruby + solr benchmarks

Looks like jruby + DirectSolrConnection is on top. I'll try to get some update queries next.

1,000 iterations VS 10,000 iterations
Added Ruby MRI 1.8.6, using open-uri / http
Added jruby using open-uri / http

"Benchmark" standard lib
solr 1.3
empty index
query = ipod

# jruby + CommonsHttpSolrServer
#        user     system      total        real
# 1000 iterations
#    4.335000   0.000000   4.335000 (  4.334744)
#    4.335000   0.000000   4.335000 (  4.334730)
# 10000 iterations
#   32.355000   0.000000  32.355000 ( 32.354999)
#   32.303000   0.000000  32.303000 ( 32.302859)
#   32.323000   0.000000  32.323000 ( 32.323368)

# jruby + EmbeddedSolrServer
#        user     system      total        real
# 1000 iterations
#    2.268000   0.000000   2.268000 (  2.267976)
#    2.357000   0.000000   2.357000 (  2.356588)
# 10000 iterations
#   10.650000   0.000000  10.650000 ( 10.649839)
#    8.099000   0.000000   8.099000 (  8.099088)
#    8.119000   0.000000   8.119000 (  8.118807)

# jruby + DirectSolrConnection
#        user     system      total        real
# 1000 iterations
#    1.593000   0.000000   1.593000 (  1.592349)
#    1.595000   0.000000   1.595000 (  1.594842)
# 10000 iterations
#   10.708000   0.000000  10.708000 ( 10.707790)
#    6.952000   0.000000   6.952000 (  6.951736)
#    7.939000   0.000000   7.939000 (  7.939191)

# ruby mri + http / open-uri
#        user     system      total        real
# 1000 iterations
#    0.760000   0.310000   1.070000 (  1.607703)
#    0.730000   0.300000   1.030000 (  1.619739)
#    0.760000   0.330000   1.090000 (  1.907517)
#    0.740000   0.300000   1.040000 (  1.543832)
# 10000 iterations
#    7.300000   2.970000  10.270000 ( 15.452759)
#    7.290000   2.960000  10.250000 ( 15.585011)
#    7.330000   2.980000  10.310000 ( 15.781377)

# jruby + http / open-uri
#        user     system      total        real
# 10000 iterations
#   27.583000   0.000000  27.583000 ( 27.582765)
#   25.620000   0.000000  25.620000 ( 25.620403)
#   25.474000   0.000000  25.474000 ( 25.473653)
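To make the 10,000-iteration rows easier to compare, here is a small sketch that turns the "real" column into milliseconds per request. The figures are copied from the results above. Taking the median of the runs is my own choice, made so that a slow first run does not dominate; it is not something the original post did.

```ruby
# Wall-clock ("real") seconds for the 10,000-iteration runs, copied from
# the results above.
real_seconds = {
  'jruby + CommonsHttpSolrServer' => [32.354999, 32.302859, 32.323368],
  'jruby + EmbeddedSolrServer'    => [10.649839, 8.099088, 8.118807],
  'jruby + DirectSolrConnection'  => [10.707790, 6.951736, 7.939191],
  'ruby mri + http / open-uri'    => [15.452759, 15.585011, 15.781377],
  'jruby + http / open-uri'       => [27.582765, 25.620403, 25.473653]
}
iterations = 10_000

# Median of a list of numbers.
def median(values)
  sorted = values.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Median run time divided by iteration count, converted to milliseconds.
per_request_ms = real_seconds.transform_values do |runs|
  median(runs) / iterations * 1000.0
end

per_request_ms.sort_by { |_, ms| ms }.each do |label, ms|
  puts format('%-32s %.2f ms per request', label, ms)
end
```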
|
|
Re: quick jruby + solr benchmarks

Yeah that type of benchmark would probably be a lot more useful. I'll see if I can get something like that going. I've never really done benchmarking before. Any general tips?

matt

On Tue, Nov 25, 2008 at 7:13 PM, Jamie Orchard-Hays <jamie@...> wrote:
> So about 2x? Not bad. I wonder what running httperf against a simple app
> would show.
|
|
Re: quick jruby + solr benchmarks

On Nov 25, 2008, at 7:13 PM, Jamie Orchard-Hays wrote:
> So about 2x? Not bad. I wonder what running httperf against a simple
> app would show.

Keep in mind these points:

* Solr's query cache. Repeating a query 1000 times is really only executing the query one time and pulling the document set the rest of the time, except...

* Solr supports HTTP cache headers. Thus a "smart" HTTP client that is HTTP cache savvy will get 304's for 999 of those queries without Solr doing anything but checking the HTTP request headers and the current state of the index. Note that Matt's benchmark code is not HTTP cache savvy at the moment (not a flaw per se, just worth noting).

Erik
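The first point can be illustrated with a toy cache. This is only a sketch of the general idea: `ToyQueryCache` and `expensive_search` are hypothetical names, and a plain hash stands in for the cache. It makes no claim about how Solr implements its caches.

```ruby
# Toy illustration only: a hash-backed cache standing in for a query cache.
class ToyQueryCache
  attr_reader :hits, :misses

  def initialize
    @store = {}
    @hits = 0
    @misses = 0
  end

  # Returns the cached result for +query+, calling the block to compute it
  # only the first time that query is seen.
  def fetch(query)
    if @store.key?(query)
      @hits += 1
    else
      @misses += 1
      @store[query] = yield(query)
    end
    @store[query]
  end
end

# Hypothetical stand-in for the real search work.
def expensive_search(query)
  "results for #{query}"
end

same = ToyQueryCache.new
1000.times { same.fetch('ipod') { |q| expensive_search(q) } }
puts "repeated query: #{same.misses} miss, #{same.hits} hits"

unique = ToyQueryCache.new
1000.times { |id| unique.fetch("id:#{id}") { |q| expensive_search(q) } }
puts "unique queries: #{unique.misses} misses, #{unique.hits} hits"
```

With the same query repeated, the search work runs once. With distinct queries it runs every time, which is a different measurement.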
|
|
Re: quick jruby + solr benchmarks

just a couple of quick code comments...

On Nov 25, 2008, at 6:04 PM, Matt Mitchell wrote:
> # EmbeddedSolrServer
> def embedded(solr_home)
>   @embedded ||= (
>     import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer
>     import org.apache.solr.core.CoreContainer
>     import org.apache.solr.core.CoreDescriptor
>     import org.apache.solr.client.solrj.SolrQuery
>     core_name = 'main-core'
>     container = CoreContainer.new
>     descriptor = CoreDescriptor.new(container, core_name, solr_home)
>     core = container.create(descriptor)

You'll want to close that core, otherwise the JVM doesn't exit. I changed this to:

@core = ....

>     container.register(core_name, core, false)

and used @core there.

> query = {'qt' => 'standard', 'q'=>'ipod', 'facet.field' => 'cat'}

Note that faceting is not enabled unless there is also a &facet=on

> params = hash_to_params(query)
>
> max = 1000
>
> Benchmark.bm do |x|
>   x.report 'http commons' do
>     max.times do
>       http_commons.query(params)
>     end
>   end
>   x.report 'embedded' do
>     max.times do
>       embedded(solr_home).query(params)
>     end
>   end
> end

And I added an:

@core.close

at the end.

Erik
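A sketch of both comments together, under stated assumptions. The query hash gains `'facet' => 'on'`, and the close call goes in an `ensure` clause so it still runs if the benchmark raises. Putting it in `ensure` is my suggestion, not what Erik describes, since he simply added the call at the end. `StandInCore` is a hypothetical placeholder for the real core object so the snippet runs without Solr.

```ruby
require 'uri'

# The query hash from the script, with 'facet' => 'on' added so that
# 'facet.field' takes effect.
query = { 'qt' => 'standard', 'q' => 'ipod', 'facet' => 'on', 'facet.field' => 'cat' }
query_string = URI.encode_www_form(query)
puts query_string

# Hypothetical stand-in for the Solr core object; only #close is modelled.
class StandInCore
  attr_reader :closed

  def initialize
    @closed = false
  end

  def close
    @closed = true
  end
end

core = StandInCore.new
begin
  # ... run the benchmark against the core here ...
ensure
  # Runs whether or not the benchmark raised.
  core.close
end
puts "core closed: #{core.closed}"
```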
|
|
Re: quick jruby + solr benchmarks

Yeah I overlooked all of that. Thanks Erik. So could a better query test be an incremental one based on id like:

100.times do |id|
  q = "id:#{id}"
  # query request here...
end

?

Would you happen to know why the solr home and data dir never really change? Anytime I use commons http or embedded, a "solr" directory is created in the same directory as my script. Even though I'm setting the home and data dir in my code?

Matt
|
|
Re: quick jruby + solr benchmarks

On Nov 26, 2008, at 9:54 AM, Matt Mitchell wrote:
> Yeah I overlooked all of that. Thanks Erik. So could a better query
> test be an incremental one based on id like:
>
> 100.times do |id|
>   q = "id:#{id}"
>   # query request here...
> end
>
> ?

Testing is an art form. Depends on what you are testing. Issuing entirely unique queries is not very real-world either, but at least it will bypass the query and HTTP caching shortcuts. Many organizations mine their query logs to get a set of representative queries to test with, for example.

I think your point is proven - EmbeddedSolrServer itself is faster than CommonsHttpSolrServer. But would you deploy that way? Is your front-end going to be merged with Solr itself? That may or may not be very viable, depending on the resources the front-end and Solr need and how many system resources you have. What about doing load balancing? You're then stuck with load balancing your front-end in tandem with Solr itself.

Again, it all boils down to what you're after with the benchmarks. And I'm not a benchmarking performance savvy person myself, so I'm not sure where to take it from here. It's an interesting test, for sure, and I'd like to have it reviewed by others who really know their stuff in this realm and with Solr itself, and who can elaborate on why there is such a huge difference in speed. Is it just HTTP and serialize/unserialize overhead? (I tend to doubt that, but don't know)

> Would you happen to know why the solr home and data dir never really
> change? Anytime I use commons http or embedded, a "solr" directory is
> created in the same directory as my script. Even though I'm setting the
> home and data dir in my code?

I don't know at the moment, I'd have to dig deeper.

Erik
|
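A rough sketch of running a list of distinct queries and timing each one separately, using the standard `Benchmark.realtime`. The query list is generated from ids as in the loop quoted above; a list mined from a query log could be dropped in instead. `run_query` is a hypothetical stand-in so the snippet runs without Solr.

```ruby
require 'benchmark'

# Hypothetical stand-in for a Solr request so the snippet runs on its own.
def run_query(q)
  q.length
end

# Distinct queries, one per id. A list of representative queries taken from
# a query log could be substituted here.
queries = Array.new(100) { |id| "id:#{id}" }

# Time each query separately (wall-clock seconds).
timings = queries.map { |q| Benchmark.realtime { run_query(q) } }

sorted = timings.sort
middle = sorted[sorted.size / 2] # upper median for an even-sized list
slowest = sorted.last
puts format('queries: %d  median: %.6f s  slowest: %.6f s',
            timings.size, middle, slowest)
```

Keeping per-query timings, rather than one total, makes it possible to see whether a few slow requests dominate the run.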
|
Re: quick jruby + solr benchmarks

I just had a brief conversation with Yonik on this to get his way more expert opinion, and it really boils down to this in this particular test... the query itself is incredibly fast (Solr reports a QTime of 1 millisecond or less) since there are no documents. So what these differences are showing is merely the difference between HTTP and a method call - with nothing else (of note) going on.

In a more real-world scenario, the HTTP overhead makes less difference as the work being done in the query/faceting overshadows the communication overhead.

There's lies, damned lies, and benchmarks :)

Erik
|
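The arithmetic behind that point can be sketched with the first run of the original 1,000-iteration benchmark (4.633849 s over HTTP, 2.152226 s embedded). The extra per-request work values below are illustrative assumptions, not measurements.

```ruby
iterations = 1000

# Milliseconds per request, from the first run of each report in the
# original post.
http_ms     = 4.633849 / iterations * 1000.0
embedded_ms = 2.152226 / iterations * 1000.0

# Extra cost per request of going over HTTP in that test.
http_overhead_ms = http_ms - embedded_ms
puts format('HTTP overhead: %.2f ms per request', http_overhead_ms)

# Ratio of HTTP time to embedded time when each request also does
# work_ms of query/faceting work on both paths.
ratio_for = lambda do |work_ms|
  (http_ms + work_ms) / (embedded_ms + work_ms)
end

# Hypothetical amounts of additional work per request, in milliseconds.
[0, 10, 100].each do |work_ms|
  puts format('extra work %3d ms -> HTTP/embedded ratio %.2f',
              work_ms, ratio_for.call(work_ms))
end
```

The fixed overhead stays at roughly 2.5 ms, so the ratio moves toward 1 as the per-request work grows.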
|
Re: quick jruby + solr benchmarks

Interesting. My main goal was to get a feel for how jruby and the direct/embedded stuff compared to mri ruby and straight up http. But obviously, the data and these tests are not realistic at all. Thanks for your feedback guys.

Matt
|
|
Re: quick jruby + solr benchmarks

Here's something to note when using net/http in Ruby (which open-uri wraps). Even though it's about as fast as other options, it uses a huge cpu load when compared to others (on ruby 1.8.6):

http://apocryph.org/more_indepth_analysis_ruby_http_client_performance
|
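The distinction between "about as fast" and "huge cpu load" is the difference between wall-clock time and CPU time, and the standard `Benchmark.measure` reports both. A minimal sketch with two made-up workloads, not an HTTP client comparison: one mostly waits, the other keeps the CPU busy.

```ruby
require 'benchmark'

# Waiting workload: mostly idle, similar in spirit to blocking on a socket.
waiting = Benchmark.measure { sleep 0.2 }

# CPU-bound workload: a busy loop in the interpreter.
busy = Benchmark.measure { 200_000.times { |i| Math.sqrt(i) } }

# Tms#total is user + system CPU time (including child processes);
# Tms#real is elapsed wall-clock time.
[['waiting', waiting], ['busy', busy]].each do |label, tms|
  puts format('%-8s cpu %.3f s  real %.3f s', label, tms.total, tms.real)
end
```

Two clients with similar `real` times can differ a lot in `total`, which is the kind of gap the linked article is about.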
|
Re: quick jruby + solr benchmarks

Thanks Jamie. That's kind of shocking actually. What client library do you use?
|
|
Re: quick jruby + solr benchmarks

The other night I spent a few hours messing with EventMachine, Curb (libcurl ruby lib) and RFuzz.

EventMachine's HTTP2 is just missing some of the POST features I need, and I didn't want to figure out how to build what I needed from EventMachine's low-level features.

RFuzz works, but then would crap out completely or go from well under a second to 20+ seconds to complete a request. I suspect it's not designed for the large POSTs I need.

Curb (which is used with "require 'curl'"--why do some gem authors not name the gem and the library the same dang thing???) works great. It's not any faster than net/http, but judging from those tests, I should be saving a lot of CPU.

Jamie