ZEO and relstorage performance

by Jim Fulton

I've been working on a project to speed up ZEO.  The speedup mainly
involves getting ZEO to use more threads by giving each client its
own thread, and changing FileStorage to allow multiple simultaneous
readers.  This is especially valuable for us (ZC) for large databases
(~1TB) running on multi-spindle storage systems on which multiple
reads of the same file can take place in parallel.  I'll have more to
say about this work in later posts.
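
A minimal sketch of the general idea of one thread per client with
several simultaneous readers of one file.  This is not ZEO's or
FileStorage's actual code; the record layout and all names here are
hypothetical.  Each read opens a private file handle, so readers never
share a seek position and need no lock around the read:

```python
import os
import tempfile
import threading

RECORD_SIZE = 16  # hypothetical fixed record size, for illustration only


def write_records(path, count):
    """Write `count` fixed-size records so each can be located by offset."""
    with open(path, "wb") as f:
        for i in range(count):
            f.write(str(i).encode("ascii").rjust(RECORD_SIZE, b"0"))


def read_record(path, index):
    """Read one record through a private file handle.

    Each call opens its own handle, so concurrent readers never share a
    seek position and need no lock around seek() + read().
    """
    with open(path, "rb") as f:
        f.seek(index * RECORD_SIZE)
        return int(f.read(RECORD_SIZE))


def serve_clients(path, requests_per_client):
    """Run one thread per client; each thread reads its own record list."""
    results = [None] * len(requests_per_client)

    def worker(slot, indexes):
        results[slot] = [read_record(path, i) for i in indexes]

    threads = [
        threading.Thread(target=worker, args=(slot, indexes))
        for slot, indexes in enumerate(requests_per_client)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def demo():
    """Create a temporary data file and read it from three client threads."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_records(path, 100)
        return serve_clients(path, [[0, 1, 2], [50, 51], [99]])
    finally:
        os.remove(path)
```

Whether the reads really proceed in parallel depends on the storage
hardware and the operating system, which is why the benefit described
above is tied to multi-spindle systems.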

In the course of working on this, I decided to play with Shane's
relstorage benchmark, speedtest.  After playing with it a bit, I have
a few observations.

- Up to a point, it does a good job of isolating just the networking
  aspects of the mysql and ZEO protocols:

  - It uses a small enough data set to fit in ram, so the read portion
    of the tests does no disk IO.

  - It doesn't leverage ZODB or ZEO caches at all. (Although ZEO read
    times are penalized by the time taken to write to the ZEO cache
    locally.)

- The tests run clients and servers on the same machine using Unix
  Domain Sockets for communication (at least for ZEO and MySQL).
  Generally, at least in deployments we do, the clients and servers
  run on different machines.

- When running at high concurrency levels, the clients and server can
  compete for CPU resources, distorting results.  This wouldn't happen
  if the clients ran on separate machines.

- Minor nit: the test's notion of objects per transaction is off. The
  actual number is on the order of 1/30 of the numbers reported by the
  tests.
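
As a worked check of the "1/30" remark: dividing the configured value
by 30 and rounding up happens to reproduce the actual counts given in
the result tables (4 for 100, 334 for 10000).  The divisor of exactly
30 and the rounding rule are assumptions chosen to fit those numbers,
not a description of how speedtest computes them:

```python
import math

# Divisor taken from the "on the order of 1/30" remark. Treating it as
# exactly 30 and rounding up is an assumption, not speedtest's logic.
ASSUMED_DIVISOR = 30


def estimated_actual_objects(configured):
    """Estimate the objects really touched for a configured objects_per_txn."""
    return math.ceil(configured / ASSUMED_DIVISOR)


def estimates(configured_values=(1, 100, 10000)):
    """Map each configured value to its estimated actual object count."""
    return {n: estimated_actual_objects(n) for n in configured_values}
```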

I decided to explore this a bit.  I modified Shane's speedtest script
on a branch:

- Added command line options to control a number of factors, like
  object sizes and concurrency levels.

- Added options to specify mysql connection parameters.  Among other
  things, this lets me run the test in a "remote" configuration, in
  which the client and server are on different machines.

- Added an option to specify a ZEO TCP address and to manage a ZEO
  server externally.

- Replaced the single read measurement with "cold", "hot" and "steamin"
  measurements. The "cold" number is what Shane's test originally called
  "read".  It reads data from the server without benefit of the ZODB
  or ZEO caches.

  The "hot" number provides timings for a second round of reads
  after minimizing the object cache.

  The "steamin" number is the timing of a 3rd round of reads without
  clearing the ZODB cache. I upped the size of the ZODB cache to make
  sure the objects would fit.
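
The three read phases can be illustrated with a toy model.  This is a
sketch under stated assumptions, not the speedtest code: ToyClient and
its method names are hypothetical, and it counts which layer served
each read instead of timing anything.  The object cache stands in for
the ZODB object cache and the client cache for the ZEO cache:

```python
class ToyClient:
    """Toy client with two cache layers in front of a server (hypothetical)."""

    def __init__(self, server_data):
        self.server_data = server_data
        self.client_cache = {}   # plays the role of the ZEO cache
        self.object_cache = {}   # plays the role of the ZODB object cache
        self.served_by = {"server": 0, "client_cache": 0, "object_cache": 0}

    def load(self, oid):
        """Return the object, recording which layer satisfied the read."""
        if oid in self.object_cache:
            self.served_by["object_cache"] += 1
            return self.object_cache[oid]
        if oid in self.client_cache:
            self.served_by["client_cache"] += 1
            value = self.client_cache[oid]
        else:
            self.served_by["server"] += 1
            value = self.server_data[oid]
            self.client_cache[oid] = value
        self.object_cache[oid] = value
        return value

    def minimize_object_cache(self):
        """Drop everything from the object cache, keeping the client cache."""
        self.object_cache.clear()

    def take_counts(self):
        """Return the counters collected so far and reset them."""
        counts = dict(self.served_by)
        for key in self.served_by:
            self.served_by[key] = 0
        return counts


def run_phases(oids, server_data):
    """Run cold, hot and steamin read rounds and report who served them."""
    client = ToyClient(server_data)
    report = {}

    # cold: both caches are empty, so every read goes to the server
    for oid in oids:
        client.load(oid)
    report["cold"] = client.take_counts()

    # hot: object cache minimized, client cache still populated
    client.minimize_object_cache()
    for oid in oids:
        client.load(oid)
    report["hot"] = client.take_counts()

    # steamin: nothing cleared, so reads come from the object cache
    for oid in oids:
        client.load(oid)
    report["steamin"] = client.take_counts()
    return report
```

In a real benchmark each round would be wrapped in a timer; the counts
here only show which layer each phase is meant to exercise.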

Here are some results.  I'm going to provide them in tabular form, as
I actually find this easier than charts for this data and also because
it's less work. :) The results below are basically as output by his
script with my modifications.

First, here are results from running clients and server on the same
machine using Unix domain sockets.  The results are grouped into 3
tables based on objects per transaction.  Note that for the second and
third tables I've added the actual object counts. The machine these
were run on was a 2.2GHz Intel Core 2 Duo (two core) desktop with a
SATA disk and 4GB of RAM, running Ubuntu 9.04.  They used
relstorage trunk as of October 5, when I made my branch, and ZODB
3.9.1.  The results also reflect the default relstorage poll interval
of 0.  More on that later.  The results also reflect mysql
configured to improve write performance as described here:
http://shane.willowrise.com/archives/how-to-fix-the-mysql-write-speed/.

The first column is the concurrency level, which is the number of
simultaneous clients.  The remaining columns are in 2 groups of 4, for
ZEO and for MySQLAdapter (relstorage+mysql).  Each group has a write
time, a cold read time, a hot read time (second set of reads after
clearing the ZODB object cache) and a steamin time based on a 3rd set
of reads without clearing the object cache.


Columns:
"Concurrency",
 ZEO + FileStorage - write,
 ZEO + FileStorage - cold,
 ZEO + FileStorage - hot,
 ZEO + FileStorage - steamin,
 MySQLAdapter - write,
 MySQLAdapter - cold,
 MySQLAdapter - hot,
 MySQLAdapter - steamin
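
For anyone who wants to load the rows below into a script, here is a
minimal parsing sketch.  The Row type and its field names are
hypothetical labels for the nine columns listed above; the sample line
is the first row of the first table:

```python
from typing import NamedTuple


class Row(NamedTuple):
    """One result row; field names are hypothetical labels for the columns."""
    concurrency: int
    zeo_write: float
    zeo_cold: float
    zeo_hot: float
    zeo_steamin: float
    mysql_write: float
    mysql_cold: float
    mysql_hot: float
    mysql_steamin: float


def parse_row(line):
    """Parse one comma-separated result line into a Row."""
    fields = [part.strip() for part in line.split(",")]
    if len(fields) != len(Row._fields):
        raise ValueError("expected %d fields, got %d"
                         % (len(Row._fields), len(fields)))
    return Row(int(fields[0]), *(float(x) for x in fields[1:]))


SAMPLE = ("1, 0.00992, 0.00108, 0.00015, 0.00007, "
          "0.00405, 0.00129, 0.00076, 0.00043")
row = parse_row(SAMPLE)
```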


Local clients, poll interval 0
==============================

** Results with objects_per_txn=1 **
   ZEO+FS --------------------------   MySQL-----------------------------
   write    cold     hot      steamin  write    cold     hot      steamin
1, 0.00992, 0.00108, 0.00015, 0.00007, 0.00405, 0.00129, 0.00076, 0.00043
2, 0.01359, 0.00177, 0.00024, 0.00011, 0.00635, 0.00083, 0.00043, 0.00024
4, 0.02322, 0.00226, 0.00025, 0.00011, 0.00836, 0.00128, 0.00047, 0.00025
8, 0.07687, 0.00183, 0.00020, 0.00009, 0.01236, 0.00121, 0.00055, 0.00036
16, 0.25414, 0.00259, 0.00018, 0.00007, 0.02846, 0.00130, 0.00056, 0.00032

** Results with objects_per_txn=100 (REALLY 4) **
   ZEO+FS --------------------------   MySQL-----------------------------
   write    cold     hot      steamin  write    cold     hot      steamin
1, 0.01352, 0.00574, 0.00062, 0.00017, 0.00841, 0.00273, 0.00159, 0.00043
2, 0.02414, 0.00539, 0.00035, 0.00008, 0.00678, 0.00292, 0.00202, 0.00045
4, 0.03136, 0.00789, 0.00035, 0.00007, 0.01343, 0.00198, 0.00108, 0.00025
8, 0.09697, 0.00694, 0.00036, 0.00008, 0.01910, 0.00253, 0.00111, 0.00025
16, 0.24361, 0.01369, 0.00037, 0.00008, 0.03413, 0.00363, 0.00158, 0.00036

** Results with objects_per_txn=10000 (REALLY 334) **
   ZEO+FS --------------------------   MySQL-----------------------------
   write    cold     hot      steamin  write    cold     hot      steamin
1, 0.13877, 0.40306, 0.02324, 0.00042, 0.11370, 0.09461, 0.05026, 0.00063
2, 0.18004, 0.39529, 0.02051, 0.00045, 0.12573, 0.10313, 0.07746, 0.00072
4, 0.36065, 0.38792, 0.02192, 0.00050, 0.25860, 0.21972, 0.14529, 0.00150
8, 0.68353, 1.57573, 0.02679, 0.00110, 0.51280, 0.44516, 0.45004, 0.00126
16, 1.46470, 3.40687, 0.03225, 0.00057, 1.00606, 1.03924, 1.29605, 0.00102

As you can see, write and cold read times are quite a bit higher for
ZEO, although write times get closer together as transaction size and
concurrency increase.

Also note that the hot times are much lower for ZEO than with MySQLAdapter.
Our ZEO cache hit rates are typically around 90%.  With a cache hit
rate of only 75% I'd expect ZEO+FS to generally outperform MySQLAdapter.
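
One way to put numbers on the hit-rate argument is a simple blended
average: a fraction of reads costs the hot time and the rest costs the
cold time.  This weighting model is an assumption made for
illustration, as is the choice of which MySQLAdapter figure to compare
against.  The values below are the concurrency 1 row of the
objects_per_txn=1 table:

```python
def blended_read_time(cold, hot, hit_rate):
    """Average read time when `hit_rate` of reads are cache hits."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hot + (1.0 - hit_rate) * cold


def break_even_hit_rate(cold, hot, other_time):
    """Hit rate at which the blended time equals `other_time`."""
    if cold == hot:
        raise ValueError("cold and hot times must differ")
    return (cold - other_time) / (cold - hot)


# Concurrency 1, objects_per_txn=1, local clients, poll interval 0.
ZEO_COLD, ZEO_HOT = 0.00108, 0.00015
MYSQL_COLD, MYSQL_HOT = 0.00129, 0.00076

zeo_at_75 = blended_read_time(ZEO_COLD, ZEO_HOT, 0.75)
needed = break_even_hit_rate(ZEO_COLD, ZEO_HOT, MYSQL_HOT)
```

Other rows can be plugged in the same way.  By the formula, rows where
the ZEO cold time is much larger than the MySQLAdapter time need a
higher hit rate to break even.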

The steamin times are also quite a bit lower for ZEO+FS than for
mysql.  This is a bit surprising since data are simply being read from
the ZODB object cache, but the overhead of polling for changes slows
down these accesses.  Ideally, ZODB object cache hit rates are high, so
the steamin times are highly relevant to actual application
performance.

I shared this data with Shane who suggested running with a poll
interval of 2.  Here are the results with a poll interval of 2.

Local clients, poll interval 2
==============================

** Results with objects_per_txn=1 **
1, 0.00920, 0.00163, 0.00024, 0.00011, 0.00419, 0.00102, 0.00050, 0.00015
2, 0.01381, 0.00143, 0.00021, 0.00010, 0.00425, 0.00110, 0.00057, 0.00015
4, 0.03010, 0.00153, 0.00015, 0.00007, 0.00505, 0.00123, 0.00051, 0.00013
8, 0.06913, 0.00145, 0.00017, 0.00008, 0.01171, 0.00127, 0.00038, 0.00008
16, 0.21394, 0.00308, 0.00017, 0.00007, 0.02466, 0.00225, 0.00037, 0.00008

** Results with objects_per_txn=100 (REALLY 4) **
1, 0.01582, 0.00571, 0.00066, 0.00013, 0.00532, 0.00249, 0.00131, 0.00015
2, 0.01774, 0.00612, 0.00062, 0.00013, 0.00704, 0.00244, 0.00098, 0.00009
4, 0.02779, 0.00710, 0.00055, 0.00012, 0.00741, 0.00384, 0.00143, 0.00009
8, 0.08021, 0.01067, 0.00035, 0.00007, 0.01639, 0.00323, 0.00100, 0.00009
16, 0.26911, 0.01602, 0.00038, 0.00007, 0.03164, 0.00462, 0.00101, 0.00009

** Results with objects_per_txn=10000 (REALLY 334) **
1, 0.16153, 0.40147, 0.02417, 0.00042, 0.11959, 0.10012, 0.05048, 0.00045
2, 0.18652, 0.39361, 0.02055, 0.00044, 0.12947, 0.10604, 0.08080, 0.00047
4, 0.33065, 0.84091, 0.02331, 0.00050, 0.25859, 0.21675, 0.13139, 0.00052
8, 0.67337, 1.46541, 0.02905, 0.00069, 0.49674, 0.42905, 0.44064, 0.00063
16, 1.46586, 3.67101, 0.03427, 0.00097, 0.99446, 1.06484, 1.16689, 0.00078

Here the steamin times are very similar for ZEO and MySQLAdapter,
although the ZEO+FS times are a bit lower.  Note, however, that using a
poll interval of 2 may cause excessive conflict errors, especially if
there are relatively hot objects that get updated a lot.

In our deployments, the clients are on separate machines and generally
don't compete with each other or with the server for CPU resources.
The tables below show results with clients running on a separate 8-core
2.33GHz Xeon (dual quad core) machine with 24G of memory and running
CentOS 4.7.  There were plenty of CPU resources for the clients, so they
never came close to using all of the available CPU resources.

Remote clients, poll interval 2
==============================

** Results with objects_per_txn=1 **
1, 0.03733, 0.00207, 0.00015, 0.00007, 0.01905, 0.00240, 0.00141, 0.00008
2, 0.01772, 0.00233, 0.00015, 0.00007, 0.01962, 0.00240, 0.00147, 0.00008
4, 0.06634, 0.00236, 0.00015, 0.00007, 0.03471, 0.00262, 0.00162, 0.00008
8, 0.08080, 0.00364, 0.00016, 0.00007, 0.06410, 0.00287, 0.00164, 0.00008
16, 0.09270, 0.00440, 0.00016, 0.00007, 0.13171, 0.00316, 0.00174, 0.00009

** Results with objects_per_txn=100 (REALLY 4) **
1, 0.01809, 0.00683, 0.00034, 0.00007, 0.02432, 0.00597, 0.00480, 0.00008
2, 0.02210, 0.00816, 0.00034, 0.00007, 0.02873, 0.00645, 0.00513, 0.00008
4, 0.07079, 0.00991, 0.00036, 0.00007, 0.03521, 0.00655, 0.00520, 0.00009
8, 0.08739, 0.01388, 0.00035, 0.00007, 0.06754, 0.00706, 0.00557, 0.00009
16, 0.09264, 0.01376, 0.00035, 0.00007, 0.13904, 0.00777, 0.00593, 0.00010

** Results with objects_per_txn=10000 (REALLY 334) **
1, 0.17738, 0.57640, 0.01969, 0.00038, 0.61835, 0.47054, 0.39015, 0.00041
2, 0.20881, 0.67896, 0.01973, 0.00038, 0.65081, 0.45832, 0.39691, 0.00043
4, 0.28996, 0.92163, 0.01993, 0.00038, 0.70280, 0.47962, 0.41136, 0.00044
8, 0.41571, 1.25167, 0.02008, 0.00040, 0.81672, 0.50079, 0.50144, 0.00045
16, 0.60316, 1.54352, 0.02033, 0.00039, 1.23906, 0.60130, 0.68200, 0.00049


Some things to note:

- For smaller transaction sizes, ZEO+FS and MySQLAdapter write times
  are pretty close; however, at higher levels of concurrency or for
  large transaction sizes, ZEO+FS outperforms MySQLAdapter on writes.

- For smaller transaction sizes, ZEO+FS and MySQLAdapter cold read
  times are pretty close. Even for larger transaction sizes, the cold
  read times are pretty close, except at the highest concurrency
  level.  I think what's happening for high concurrency and large
  transaction sizes is that ZEO has reached maximum throughput and the
  MySQLAdapter still has some breathing room.

- The hot times are more than an order of magnitude better for
  ZEO+FS.
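
The size of the hot-read gap can be checked directly from the remote
tables.  The sketch below uses the concurrency 1 rows only; the
dictionary layout and function names are hypothetical.  For these rows
the MySQLAdapter hot time comes out at roughly 9, 14 and 20 times the
ZEO+FS hot time:

```python
# (ZEO+FS hot, MySQLAdapter hot) at concurrency 1, remote clients,
# poll interval 2, keyed by the configured objects_per_txn.
REMOTE_HOT_TIMES = {
    1: (0.00015, 0.00141),
    100: (0.00034, 0.00480),
    10000: (0.01969, 0.39015),
}


def hot_ratio(zeo_hot, mysql_hot):
    """How many times slower the MySQLAdapter hot read is than ZEO+FS."""
    return mysql_hot / zeo_hot


def hot_ratios(times=REMOTE_HOT_TIMES):
    """Compute the hot-read ratio for each configured transaction size."""
    return {n: hot_ratio(*pair) for n, pair in times.items()}
```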

These benchmarks make ZEO+FS look pretty good relative to
MySQLAdapter.  The overall performance, assuming even moderately
effective ZEO or object caches, is significantly better for ZEO.
Keep in mind, however, that these benchmarks don't take
disk access on the server into account for reads, because there isn't
any.  In practice, I'd expect server disk access times to dominate
cold read times.  For example, in a separate benchmark with far more
realistic access patterns against a large database, object load times
are an order of magnitude greater than what you'd see if the data
being read was all in RAM.

Jim

--
Jim Fulton
_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@...
https://mail.zope.org/mailman/listinfo/zodb-dev

Re: ZEO and relstorage performance

by Ross J. Reedstrom

Very interesting. I wonder how the postgresql version fares?

Ross


On Tue, Oct 13, 2009 at 05:08:07PM -0400, Jim Fulton wrote:


Re: ZEO and relstorage performance

by Laurence Rowe

Shane's earlier benchmarks show MySQL to be the fastest RelStorage backend:
http://shane.willowrise.com/archives/relstorage-10-and-measurements/

Laurence

2009/10/13 Ross J. Reedstrom <reedstrm@...>:

> Very interesting. I wonder how the postgresql version fares?
>
> Ross
>
>
> On Tue, Oct 13, 2009 at 05:08:07PM -0400, Jim Fulton wrote:
>> I've been working on a project to speed up ZEO.  The speedup mainly
>> involves getting ZEO to use more threads by giving each client it's
>> own thread, and changing FileStorage to allow multiple simultaneous
>> readers.  This is especially valuable for us (ZC) for large databases
>> (~1TB) running on multi-splindle storage systems on which multiple
>> reads of the same file can take place in parallel.  I'll have more to
>> say about this work in later posts.
>>
>> In the course of working on this, I decided to play with Shane's
>> relstorage benchmark, speedtest.  After playing with it a bit, I have
>> a few observations.
>>
>> - Up to a point, it does a good job of isolating just the networking
>>   aspects of the mysql and ZEO protocols:
>>
>>   - It uses a small enough data set to fit in ram, so the read portion
>>     of the tests does no disk IO.
>>
>>   - It doesn't leverage ZODB or ZEO caches at all. (Although ZEO read
>>     times are penalized by the time taken to write to the ZEO cache
>>     locally.)
>>
>> - The tests run clients and servers on the same machine using Unix
>>   Domain Sockets for communication (at least for ZEO and MySQL).
>>   Generally, at least in deployments we do, the clients and servers
>>   run on different machines.
>>
>> - When running at high concurrency levels, the clients and server can
>>   compete for CPU recourses, distorting results.  This wouldn't happen
>>   of the clients ran on separate machines.
>>
>> - Minor nit: the tests notion of object's per transaction is off. The
>>   actual number reported is on the order of 1/30 of the numbers the
>>   numbers reported by the tests.
>>
>> I decided to explore this a bit.  I modified shanes speedtest script
>> on a branch:
>>
>> - Added command line options to control a number of factors, like
>>   object sizes and concurrency levels.
>>
>> - Added options to specify mysql connection parameters.  Among other
>>   things, this lets me run the test in a "remote" configuration, in
>>   which the client and server are on different machines.
>>
>> - Added an option to specify a ZEO TCP address and to manage a ZEO
>>   server externally.
>>
>> - Replaced the single read measurement with "cold", "hot" and "steamin"
>>   measurements. The "cold" number is what Shane's test originally called
>>   "read".  It reads data from the server without benefit of the ZODB
>>   or ZEO caches.
>>
>>   The "hot" number provides timings for a second round of reads
>>   after minimizing the object cache.
>>
>>   The "steamin" number is the timing of a 3rd round of reads without
>>   clearing the ZODB cache. I upped the size of the ZODB cache to make
>>   sure the objects woould fit.
>>
>> Here are some results.  I'm going to provide them in tabular form, as
>> I actually find this easier than charts for this data and also because
>> it's less work. :) The results below are basically as output by his
>> script with my modifications.
>>
>> First, here are results from running clients and server on the same
>> machine using unix domain sockets.  The results are grouped onto 3
>> tables based on objects per transaction.  Note that for the second and
>> third tables I've added the actual object counts. The machine these
>> were run on was a 2.2Ghz Intel Core 2 Duo (two core) desktop with a
>> SATA disk and 4GB of ram and running Ubuntu 9.04.  They used
>> relstorage trunk as of October 5, when I made by branch and using ZODB
>> 3.9.1.  The results also reflect the default relstorage poll interval
>> of 0.  More on that later.  The results also reflect mysql
>> configured to improve write performance as described here:
>> http://shane.willowrise.com/archives/how-to-fix-the-mysql-write-speed/.
>>
>> The first column is the concurrency level, which is the number of
>> simultaneous clients.  The remaining columns are in 2 groups of 4, for
>> ZEO and for MySQLAdapter (reslstorage+mysql).  Each group has a write
>> time, a cold read time, a hot read time (second set of reads after
>> clearing the ZODB objects cache) and a steamin time based on a 3rd set
>> of reads without clearing the object cache.
>>
>>
>> Columns:
>> "Concurrency",
>>  ZEO + FileStorage - write,
>>  ZEO + FileStorage - cold,
>>  ZEO + FileStorage - hot,
>>  ZEO + FileStorage - steamin,
>>  MySQLAdapter - write,
>>  MySQLAdapter - cold,
>>  MySQLAdapter - hot,
>>  MySQLAdapter - steamin
>>
>>
>> Local clients, poll interval 0
>> ==============================
>>
>> ** Results with objects_per_txn=1 **
>>    ZEO+FS --------------------------   MySQL-----------------------------
>>    write    cold     hot      steamin  write    cold     hot      steamin
>> 1, 0.00992, 0.00108, 0.00015, 0.00007, 0.00405, 0.00129, 0.00076, 0.00043
>> 2, 0.01359, 0.00177, 0.00024, 0.00011, 0.00635, 0.00083, 0.00043, 0.00024
>> 4, 0.02322, 0.00226, 0.00025, 0.00011, 0.00836, 0.00128, 0.00047, 0.00025
>> 8, 0.07687, 0.00183, 0.00020, 0.00009, 0.01236, 0.00121, 0.00055, 0.00036
>> 16, 0.25414, 0.00259, 0.00018, 0.00007, 0.02846, 0.00130, 0.00056, 0.00032
>>
>> ** Results with objects_per_txn=100 (REALLY 4) **
>>    ZEO+FS --------------------------   MySQL-----------------------------
>>    write    cold     hot      steamin  write    cold     hot      steamin
>> 1, 0.01352, 0.00574, 0.00062, 0.00017, 0.00841, 0.00273, 0.00159, 0.00043
>> 2, 0.02414, 0.00539, 0.00035, 0.00008, 0.00678, 0.00292, 0.00202, 0.00045
>> 4, 0.03136, 0.00789, 0.00035, 0.00007, 0.01343, 0.00198, 0.00108, 0.00025
>> 8, 0.09697, 0.00694, 0.00036, 0.00008, 0.01910, 0.00253, 0.00111, 0.00025
>> 16, 0.24361, 0.01369, 0.00037, 0.00008, 0.03413, 0.00363, 0.00158, 0.00036
>>
>> ** Results with objects_per_txn=10000 (REALLY 334) **
>>    ZEO+FS --------------------------   MySQL-----------------------------
>>    write    cold     hot      steamin  write    cold     hot      steamin
>> 1, 0.13877, 0.40306, 0.02324, 0.00042, 0.11370, 0.09461, 0.05026, 0.00063
>> 2, 0.18004, 0.39529, 0.02051, 0.00045, 0.12573, 0.10313, 0.07746, 0.00072
>> 4, 0.36065, 0.38792, 0.02192, 0.00050, 0.25860, 0.21972, 0.14529, 0.00150
>> 8, 0.68353, 1.57573, 0.02679, 0.00110, 0.51280, 0.44516, 0.45004, 0.00126
>> 16, 1.46470, 3.40687, 0.03225, 0.00057, 1.00606, 1.03924, 1.29605, 0.00102
>>
>> As you can see, write and cold read times are quite a bit higher for
>> ZEO, although write times get closer together as transaction size and
>> concurrency increases.
>>
>> Also note that the hot times are much lower for ZEO than with MySQLAdapter.
>> Our ZEO cache hit rates are typically around 90%.  With a cache hot
>> rate of only 75% I'd expect ZEO+FS to generally outperform MySQLAdapter.
>>
>> The steamin times are also quite a bit lower for ZEO+FS that for
>> mysql.  This is a it surprising since data are simply being read from
>> the ZODB object cache, but the overhead of polling for changes slows
>> down these accesses.  Ideally, ZEO OBject cache hit rates are high, so
>> the steamin times are highly relevent to actual application
>> performance.
>>
>> I shared this data with Shane who suggested running with a poll
>> interval of 2.  Here are the results with a poll interval of 2.
>>
>> Local clients, poll interval 2
>> ==============================
>>
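>> ** Results with objects_per_txn=1 **
>>    ZEO+FS --------------------------   MySQL-----------------------------
>>    write    cold     hot      steamin  write    cold     hot      steamin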
>> 1, 0.00920, 0.00163, 0.00024, 0.00011, 0.00419, 0.00102, 0.00050, 0.00015
>> 2, 0.01381, 0.00143, 0.00021, 0.00010, 0.00425, 0.00110, 0.00057, 0.00015
>> 4, 0.03010, 0.00153, 0.00015, 0.00007, 0.00505, 0.00123, 0.00051, 0.00013
>> 8, 0.06913, 0.00145, 0.00017, 0.00008, 0.01171, 0.00127, 0.00038, 0.00008
>> 16, 0.21394, 0.00308, 0.00017, 0.00007, 0.02466, 0.00225, 0.00037, 0.00008
>>
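>> ** Results with objects_per_txn=100 (REALLY 4) **
>>    ZEO+FS --------------------------   MySQL-----------------------------
>>    write    cold     hot      steamin  write    cold     hot      steamin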
>> 1, 0.01582, 0.00571, 0.00066, 0.00013, 0.00532, 0.00249, 0.00131, 0.00015
>> 2, 0.01774, 0.00612, 0.00062, 0.00013, 0.00704, 0.00244, 0.00098, 0.00009
>> 4, 0.02779, 0.00710, 0.00055, 0.00012, 0.00741, 0.00384, 0.00143, 0.00009
>> 8, 0.08021, 0.01067, 0.00035, 0.00007, 0.01639, 0.00323, 0.00100, 0.00009
>> 16, 0.26911, 0.01602, 0.00038, 0.00007, 0.03164, 0.00462, 0.00101, 0.00009
>>
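>> ** Results with objects_per_txn=10000 (REALLY 334) **
>>    ZEO+FS --------------------------   MySQL-----------------------------
>>    write    cold     hot      steamin  write    cold     hot      steamin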
>> 1, 0.16153, 0.40147, 0.02417, 0.00042, 0.11959, 0.10012, 0.05048, 0.00045
>> 2, 0.18652, 0.39361, 0.02055, 0.00044, 0.12947, 0.10604, 0.08080, 0.00047
>> 4, 0.33065, 0.84091, 0.02331, 0.00050, 0.25859, 0.21675, 0.13139, 0.00052
>> 8, 0.67337, 1.46541, 0.02905, 0.00069, 0.49674, 0.42905, 0.44064, 0.00063
>> 16, 1.46586, 3.67101, 0.03427, 0.00097, 0.99446, 1.06484, 1.16689, 0.00078
>>
>> Here the steamin times are very similar for ZEO and MySQLAdapter,
>> although the ZEO+FS times are a bit lower.  Note however, that using a
>> poll interval of 2 may cause excessive conflict errors, especially if
>> there are relatively hot objects that get updated a lot.
>>
>> In our deployments, the clients are on separate machines and generally
>> don't compete with each other or with the server for CPU resources.
>> The tables below show results with clients running on a separate 8-core
>> 2.33GHz Xeon (dual quad core) machine with 24G of memory and running
>> CentOS 4.7.  There were plenty of CPU resources for the clients, so they
>> never came close to using all of the available CPU resources.
>>
>> Remote clients, poll interval 2
>> ===============================
>>
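>> ** Results with objects_per_txn=1 **
>>    ZEO+FS --------------------------   MySQL-----------------------------
>>    write    cold     hot      steamin  write    cold     hot      steamin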
>> 1, 0.03733, 0.00207, 0.00015, 0.00007, 0.01905, 0.00240, 0.00141, 0.00008
>> 2, 0.01772, 0.00233, 0.00015, 0.00007, 0.01962, 0.00240, 0.00147, 0.00008
>> 4, 0.06634, 0.00236, 0.00015, 0.00007, 0.03471, 0.00262, 0.00162, 0.00008
>> 8, 0.08080, 0.00364, 0.00016, 0.00007, 0.06410, 0.00287, 0.00164, 0.00008
>> 16, 0.09270, 0.00440, 0.00016, 0.00007, 0.13171, 0.00316, 0.00174, 0.00009
>>
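>> ** Results with objects_per_txn=100 (REALLY 4) **
>>    ZEO+FS --------------------------   MySQL-----------------------------
>>    write    cold     hot      steamin  write    cold     hot      steamin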
>> 1, 0.01809, 0.00683, 0.00034, 0.00007, 0.02432, 0.00597, 0.00480, 0.00008
>> 2, 0.02210, 0.00816, 0.00034, 0.00007, 0.02873, 0.00645, 0.00513, 0.00008
>> 4, 0.07079, 0.00991, 0.00036, 0.00007, 0.03521, 0.00655, 0.00520, 0.00009
>> 8, 0.08739, 0.01388, 0.00035, 0.00007, 0.06754, 0.00706, 0.00557, 0.00009
>> 16, 0.09264, 0.01376, 0.00035, 0.00007, 0.13904, 0.00777, 0.00593, 0.00010
>>
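>> ** Results with objects_per_txn=10000 (REALLY 334) **
>>    ZEO+FS --------------------------   MySQL-----------------------------
>>    write    cold     hot      steamin  write    cold     hot      steamin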
>> 1, 0.17738, 0.57640, 0.01969, 0.00038, 0.61835, 0.47054, 0.39015, 0.00041
>> 2, 0.20881, 0.67896, 0.01973, 0.00038, 0.65081, 0.45832, 0.39691, 0.00043
>> 4, 0.28996, 0.92163, 0.01993, 0.00038, 0.70280, 0.47962, 0.41136, 0.00044
>> 8, 0.41571, 1.25167, 0.02008, 0.00040, 0.81672, 0.50079, 0.50144, 0.00045
>> 16, 0.60316, 1.54352, 0.02033, 0.00039, 1.23906, 0.60130, 0.68200, 0.00049
>>
>>
>> Some things to note:
>>
>> - For smaller transaction sizes, ZEO+FS and MySQLAdapter write times
>>   are pretty close, however at higher levels of concurrency or for
>>   large transaction sizes, ZEO+FS outperforms MySQLAdapter on writes.
>>
>> - For smaller transaction sizes, ZEO+FS and MySQLAdapter cold read
>>   times are pretty close. Even for larger transaction sizes, the cold
>>   read times are pretty close, except at the highest concurrency
>>   level.  I think what's happening for high concurrency and large
>>   transaction sizes is that ZEO has reached maximum throughput and the
>>   MySQLAdapter still has some breathing room.
>>
>> - The hot times are more than an order of magnitude better for
>>   ZEO+FS.
>>
>> These benchmarks make ZEO+FS look pretty good relative to
>> MySQLAdapter.  The overall performance assuming even moderately
>> effective ZEO or object caches is significantly better for ZEO.
>> Keep in mind, however, that these benchmarks don't take
>> disk access on the server into account for reads, because there isn't
>> any.  In practice, I'd expect server disk access times to dominate
>> cold read times.  For example, in a separate benchmark with far more
>> realistic access patterns against a large database, object load times
>> are an order of magnitude greater than what you'd see if the data
>> being read was all in RAM.
>>
>> Jim
>>
>> --
>> Jim Fulton
>> _______________________________________________
>> For more information about ZODB, see the ZODB Wiki:
>> http://www.zope.org/Wikis/ZODB/
>>
>> ZODB-Dev mailing list  -  ZODB-Dev@...
>> https://mail.zope.org/mailman/listinfo/zodb-dev
>>

Re: ZEO and relstorage performance

by Shane Hathaway

Jim Fulton wrote:
> These benchmarks make ZEO+FS look pretty good relative to
> MySQLAdapter.  The overall performance assuming even moderately
> effective ZEO or object caches is significantly better for ZEO.

This is an excellent analysis.  It pointed out some rough edges in
RelStorage performance.  On the RelStorage trunk, I have filled in a lot
of the performance gaps.  Specifically:

- When writing to the database, RelStorage issued a SQL statement for
every object stored, causing a network round trip for every object
written.  I switched it to use multi-row insert statements.

- Allocating a new OID also caused a network round trip.  RelStorage now
allocates blocks of OIDs.

- I improved the way RelStorage uses memcache.  I reduced the number of
memcache trips required.  The RelStorage equivalent of the ZEO cache is
memcache, but memcache wasn't enabled in Jim's tests.  That's ok;
memcache isn't currently as easy to set up as it should be.
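
The first two changes in the list above can be illustrated with a minimal sketch.  This is not RelStorage's code: it uses Python's built-in sqlite3 module in place of MySQL, and the table names (oid_counter, object_state), the OidBlockAllocator class and the store_batch function are all hypothetical, chosen only to show how one statement per batch can replace one statement per object.

```python
import sqlite3


class OidBlockAllocator:
    """Hands out OIDs locally, reserving a whole block per database trip.

    Hypothetical helper for illustration; not RelStorage's implementation.
    """

    def __init__(self, conn, block_size=16):
        self.conn = conn
        self.block_size = block_size
        self.next_oid = 0
        self.limit = 0        # exclusive upper bound of the reserved block
        self.round_trips = 0  # how often the database had to be asked

    def new_oid(self):
        if self.next_oid >= self.limit:
            self._reserve_block()
        oid = self.next_oid
        self.next_oid += 1
        return oid

    def _reserve_block(self):
        # Reserve block_size OIDs at once instead of one per call.
        start = self.conn.execute(
            "SELECT next_oid FROM oid_counter").fetchone()[0]
        self.conn.execute(
            "UPDATE oid_counter SET next_oid = ?",
            (start + self.block_size,))
        self.conn.commit()
        self.next_oid, self.limit = start, start + self.block_size
        self.round_trips += 1


def store_batch(conn, rows):
    """Write all (oid, state) rows with a single multi-row INSERT."""
    if not rows:
        return
    placeholders = ", ".join(["(?, ?)"] * len(rows))
    flat = [value for row in rows for value in row]
    conn.execute(
        "INSERT INTO object_state (oid, state) VALUES " + placeholders,
        flat)
    conn.commit()


# Usage: 20 objects need 2 OID reservations and 1 INSERT statement.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE oid_counter (next_oid INTEGER)")
conn.execute("INSERT INTO oid_counter VALUES (1)")
conn.execute(
    "CREATE TABLE object_state (oid INTEGER PRIMARY KEY, state BLOB)")

allocator = OidBlockAllocator(conn, block_size=16)
rows = [(allocator.new_oid(), b"pickle-%d" % i) for i in range(20)]
store_batch(conn, rows)
stored = conn.execute("SELECT COUNT(*) FROM object_state").fetchone()[0]
```

Against a networked database each avoided statement is an avoided round trip, which is where the saving comes from; with an in-memory SQLite database the sketch only demonstrates the statement counts.
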

I have also turned the speedtest script into a tool for comparing the
performance of different ZODB storages with different settings.  Then I
made a buildout.cfg that installs a copy of mysql, postgresql, memcache,
and all the necessary adapters so people can easily run tests on their
own.  I have not released these yet, but I will soon.

Shane

Re: ZEO and relstorage performance

by Shane Hathaway

Laurence Rowe wrote:
> Shane's earlier benchmarks show MySQL to be the fastest RelStorage backend:
> http://shane.willowrise.com/archives/relstorage-10-and-measurements/

Yep, despite my efforts to put PostgreSQL on top. :-)  It seems that
PostgreSQL has more predictable performance and behavior, while MySQL
wins slightly in raw performance once every surprisingly slow query has
been optimized.

Thanks to Jim for sending me preliminary results a week ago so I could
have time to think about it and respond properly. :-)

Still, I think it's fair to point out an issue I discovered in Jim's
tests.  When Jim expanded my read test to include hot and steamin'
times, he did not do anything to synchronize the execution of the two
new test phases.  I can understand that, since the existing structure of
speedtest.py makes it hard to add synchronization.  However, I
discovered that the lack of synchronization inflated the "hot" scores by
a large factor (I've seen up to 4X) under high concurrency, since the
CPU ends up executing the supposedly concurrent tests at different
times.  Therefore, the ZEO cache isn't quite as good as Jim's numbers
suggest.
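
One way to keep the phases aligned is a barrier that every worker must reach before any of them starts the next phase.  The sketch below is an assumption about how that could look, not the structure of speedtest.py: the worker body is a stand-in computation, threads are used only to keep the example short, and the names run_worker and run_test are made up for illustration.

```python
import threading
import time

PHASES = ("cold", "hot", "steamin")


def run_worker(barrier, timings, lock):
    for phase in PHASES:
        # Block until every worker has finished the previous phase, so
        # that all workers execute the same phase at the same time.
        barrier.wait()
        start = time.perf_counter()
        sum(range(10000))  # stand-in for the real read workload
        elapsed = time.perf_counter() - start
        with lock:
            timings.setdefault(phase, []).append(elapsed)


def run_test(concurrency):
    barrier = threading.Barrier(concurrency)
    lock = threading.Lock()
    timings = {}
    threads = [
        threading.Thread(target=run_worker, args=(barrier, timings, lock))
        for _ in range(concurrency)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return timings


timings = run_test(4)
```

If the workers are separate processes rather than threads, multiprocessing.Barrier offers the same wait() call.
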

Even after optimizing the RelStorage/memcached integration, however, my
own tests show the maximum performance of the ZEO cache is 2 or 3 times
as fast as the RelStorage/memcached code.  For grins, I closed that gap
using a ZODB configuration with a fake memcached that stores object
states in a Python dictionary.
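
A dictionary-backed stand-in of that kind could be as small as the sketch below.  The class name DictCacheClient is hypothetical, and the method names are an assumption loosely modelled on common Python memcached clients, not the interface RelStorage actually expects.

```python
class DictCacheClient:
    """In-process stand-in with a small memcached-like interface.

    Hypothetical class for illustration: nothing is evicted or expired,
    and the cache is private to the process that created it.
    """

    def __init__(self, servers=None):
        # 'servers' is accepted and ignored so that the class can be
        # swapped in where a real client would take a server list.
        self._data = {}

    def set(self, key, value):
        self._data[key] = value
        return True

    def get(self, key):
        return self._data.get(key)

    def get_multi(self, keys):
        return {key: self._data[key] for key in keys if key in self._data}

    def delete(self, key):
        return self._data.pop(key, None) is not None


# Usage: object states keyed by a string built from the OID.
cache = DictCacheClient()
cache.set("state:1", b"pickled state 1")
cache.set("state:2", b"pickled state 2")
found = cache.get_multi(["state:1", "state:2", "state:3"])
```

Because every lookup is a local dictionary access, this removes the network hop entirely, which is also why it gives up the cross-process sharing that memcached provides.
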

This leads to an interesting question.  Memcached or ZEO cache--which is
better?  While memcached has a higher minimum performance penalty, it
also has a lower maximum penalty, since memcached hits never have to
wait for disk.  Also, memcached can be shared among processes, there is
a large development community around memcached, and memcached creates
opportunities for developers to be creative with caching strategies.

So I'm inclined to stick with memcached even though the ZEO cache
numbers look better.

Shane

Re: ZEO and relstorage performance

by Ross J. Reedstrom

On Tue, Oct 13, 2009 at 06:30:31PM -0600, Shane Hathaway wrote:
> Laurence Rowe wrote:
> > Shane's earlier benchmarks show MySQL to be the fastest RelStorage backend:
> > http://shane.willowrise.com/archives/relstorage-10-and-measurements/
>
> Yep, despite my efforts to put PostgreSQL on top. :-)  It seems that
> PostgreSQL has more predictable performance and behavior, while MySQL
> wins slightly in raw performance once every surprisingly slow query has
> been optimized.

The usual wisdom on that is MySQL is faster at raw table reading,
PostgreSQL at concurrency, esp. w/ any writing thrown in. Often, the
performance curves 'cross' at some point. Could be > 16 in this case.

For my druthers, I just _trust_ PostgreSQL a lot more: the one MySQL DB
I have that I use on an on-going basis is my MythTV media-pc. Once a
month or so I have to repair a table. I've used PostgreSQL in
professional high load production for years and never had a corruption
issue: even when running out of disk! (the main culprit in my experience
w/ MySQL, since that's the default state for a PVR: full!)

<snip caching discussion>

>
> This leads to an interesting question.  Memcached or ZEO cache--which is
> better?  While memcached has a higher minimum performance penalty, it
> also has a lower maximum penalty, since memcached hits never have to
> wait for disk.  Also, memcached can be shared among processes, there is
> a large development community around memcached, and memcached creates
> opportunities for developers to be creative with caching strategies.

shared caches: this is the main reason I've been looking at RelStorage:
we're running many Zope FEs against one ZEO right now, and due to the
nature of the load-balancer, we're seeing little gain from the caches.
I'm looking to fix that issue, to some extent, but sharing across all
the FEs on one box would be a big win, I'm sure.

> So I'm inclined to stick with memcached even though the ZEO cache
> numbers look better.

Ross
--
Ross Reedstrom, Ph.D.                                 reedstrm@...
Systems Engineer & Admin, Research Scientist        phone: 713-348-6166
The Connexions Project      http://cnx.org            fax: 713-348-3665
Rice University MS-375, Houston, TX 77005
GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E  F888 D3AE 810E 88F0 BEDE

Re: ZEO and relstorage performance

by Shane Hathaway

Ross J. Reedstrom wrote:
> For my druthers, I just _trust_ PostgreSQL a lot more: the one MySQL DB
> I have that I use on an on-going basis is my MythTV media-pc. Once a
> month or so I have to repair a table. I've used PostgreSQL in
> professional high load production for years and never had a corruption
> issue: even when running out of disk! (the main culprit in my experience
> w/ MySQL, since that's the default state for a PVR: full!)

I've had similar experiences.

> shared caches: this is the main reason I've been looking at RelStorage:
> we're running many Zope FEs against one ZEO right now, and due to the
> nature of the load-balancer, we're seeing little gain from the caches.
> I'm looking to fix that issue, to some extent, but sharing across all
> the FEs on one box would be a big win, I'm sure.

I hope so.  I must admit that I don't have great confidence in the
probable cache hit rate of the current RelStorage/memcached strategy.  I
do have a lot of hope that it can be improved, possibly by adding to
memcached.

Shane

Re: ZEO and relstorage performance

by Benji York

On Wed, Oct 14, 2009 at 1:08 PM, Ross J. Reedstrom <reedstrm@...> wrote:
> shared caches: this is the main reason I've been looking at RelStorage:
> we're running many Zope FEs against one ZEO right now, and due to the
> nature of the load-balancer, we're seeing little gain from the caches.
> I'm looking to fix that issue, to some extent, but sharing across all
> the FEs on one box would be a big win, I'm sure.

For similar reasons I've been considering various affinity approaches
lately.  Most people are familiar with session affinity, but I'm
thinking of something more like "data" affinity.

Instead of having a big cache that is shared in order to increase the
chance of a request's data being in the cache, you would instead have
many smaller caches (just like ZEO works now) and send the requests to
the process(es) that are most likely to have the appropriate data in
their cache.
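
One simple way to get this kind of data affinity is to route on a stable hash of whatever identifies the data a request will touch.  The sketch below is only an illustration of that idea under stated assumptions: the function name pick_backend, the backend names, and the choice of key are all hypothetical.

```python
import hashlib


def pick_backend(data_key, backends):
    """Map a data key to one backend, the same one every time."""
    # hashlib gives a hash that is stable across processes and restarts,
    # unlike the built-in hash() for strings.
    digest = hashlib.sha256(data_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(backends)
    return backends[index]


# Usage: requests for the same key always reach the same process, so
# that process's ZEO and ZODB caches stay warm for that key.
backends = ["fe1:8080", "fe2:8080", "fe3:8080"]
first = pick_backend("customer-42", backends)
second = pick_backend("customer-42", backends)
```

Plain modulo hashing reshuffles most keys when a backend is added or removed, so a real deployment would probably want something gentler.
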
--
Benji York
Senior Software Engineer
Zope Corporation

Re: ZEO and relstorage performance

by Ross J. Reedstrom

On Wed, Oct 14, 2009 at 02:20:50PM -0400, Benji York wrote:

> On Wed, Oct 14, 2009 at 1:08 PM, Ross J. Reedstrom <reedstrm@...> wrote:
> > shared caches: this is the main reason I've been looking at RelStorage:
> > we're running many Zope FEs against one ZEO right now, and due to the
> > nature of the load-balancer, we're seeing little gain from the caches.
> > I'm looking to fix that issue, to some extent, but sharing across all
> > the FEs on one box would be a big win, I'm sure.
>
> For similar reasons I've been considering various affinity approaches
> lately.  Most people are familiar with session affinity, but I'm
> thinking of something more like "data" affinity.
>
> Instead of having a big cache that is shared in order to increase the
> chance of a request's data being in the cache, you would instead have
> many smaller caches (just like ZEO works now) and send the requests to
> the process(es) that are most likely to have the appropriate data in
> their cache.

We're actually set up w/ squid in front of the zope FEs, using ICP to
talk to them all. The default behavior is just to respond w/ a "CACHE
MISS" and use network access timings to select. This is non-optimal,
since it does little true load-balancing, until the FE is completely
hammered (very non-linear response-time curve). I'd love to see an
example where someone replaced the default response w/ something more
meaningful. The shoal that replying "HIT" for ZEO cached data breaks on
is that the ICP request contains a URL, not ZODB object refs. And
converting one to the other is what the whole dang machine _does_.
And you have very little time to answer that ICP query, lest you destroy
the gains you hope to get from having a hot-cache. So some ad-hoc
approximation, like keeping the last couple hundred URLs served, and
responding 'HIT' for those, might get some part of the benefit.
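
The "last couple hundred URLs" approximation could be sketched as a small bounded LRU set.  This is not Zope's icp-server API: the RecentUrls class and its method names are hypothetical, and wiring icp_reply into the actual ICP responder is left open.

```python
from collections import OrderedDict


class RecentUrls:
    """Remembers the most recently served URLs, up to a fixed capacity."""

    def __init__(self, capacity=200):
        self.capacity = capacity
        self._urls = OrderedDict()

    def record(self, url):
        # Call this after serving a request.
        self._urls[url] = True
        self._urls.move_to_end(url)
        if len(self._urls) > self.capacity:
            self._urls.popitem(last=False)  # drop the least recent URL

    def icp_reply(self, url):
        # A constant-time lookup, cheap enough for an ICP query.
        return "HIT" if url in self._urls else "MISS"


# Usage with a tiny capacity to show eviction.
recent = RecentUrls(capacity=2)
for served in ("/a", "/b", "/c"):
    recent.record(served)
replies = [recent.icp_reply(url) for url in ("/a", "/b", "/c")]
```
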

This is probably the wrong list for it, but does anyone know of a
published example of replacing Zope's default icp-server response? Last
time I looked, I couldn't find one.

Ross
--
Ross Reedstrom, Ph.D.                                 reedstrm@...
Systems Engineer & Admin, Research Scientist        phone: 713-348-6166
The Connexions Project      http://cnx.org            fax: 713-348-3665
Rice University MS-375, Houston, TX 77005
GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E  F888 D3AE 810E 88F0 BEDE

Re: ZEO and relstorage performance

by Benji York

On Wed, Oct 14, 2009 at 3:38 PM, Ross J. Reedstrom <reedstrm@...> wrote:
> On Wed, Oct 14, 2009 at 02:20:50PM -0400, Benji York wrote:
> We're actually set up w/ squid in front of the zope FEs, using ICP to
> talk to them all. The default behavior is just to respond w/ a "CACHE
> MISS" and use network access timings to select. This is non-optimal,
> since it does little true load-balancing, until the FE is completely
> hammered (very non-linear response-time curve). I'd love to see an
> example where someone replaced the default response w/ something more
> meaningful.

You'd probably be interested in the last four paragraphs of
http://benjiyork.com/blog/2008/03/icp-for-faster-web-apps.html.

> The shoal that replying "HIT" for ZEO cached data breaks on
> is that the ICP request contains a URL, not ZODB object refs. And
> converting one to the other is what the whole dang machine _does_.

Right, but in many situations you can guess effectively.  See below.

> And you have very little time to answer that ICP query, lest you destroy
> the gains you hope to get from having a hot-cache. So some ad-hoc
> approximation, like keeping the last couple hundred URLs served, and
> responding 'HIT' for those, might get some part of the benefit.

I'm thinking about something more specific.

For example, let's say that your data is a dictionary application and
URLs are of the structure: my-site.com/WORD, where WORD is the word you
want to show the definition of.

If you generate an ICP HIT if the process has served a word beginning
with the same letter recently, then particular letters will begin to be
associated with particular servers, increasing the ZEO (and ZODB) cache
hit rates.

If you take this a step further and design your app so that you load all
the words that start with a letter as one giant object, you'll get an
even greater effect.  (Note that in real life that object would be too
giant, but you get the idea.)
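
The first-letter heuristic for the dictionary example could look roughly like the sketch below.  It is an illustration only: the LetterAffinity class, its max_age parameter and the five-minute default are assumptions, it expects full URLs with a scheme, and it is not taken from zc.icp.

```python
import time
from urllib.parse import urlsplit


class LetterAffinity:
    """Answers HIT when a word with the same first letter was served recently."""

    def __init__(self, max_age=300.0, clock=time.monotonic):
        self.max_age = max_age   # seconds a letter stays "recent"
        self.clock = clock       # injectable so the logic can be tested
        self._last_served = {}   # first letter -> time it was last served

    @staticmethod
    def _letter(url):
        word = urlsplit(url).path.strip("/")
        return word[:1].lower()

    def record(self, url):
        # Call this after serving a request.
        letter = self._letter(url)
        if letter:
            self._last_served[letter] = self.clock()

    def icp_reply(self, url):
        served = self._last_served.get(self._letter(url))
        if served is not None and self.clock() - served <= self.max_age:
            return "HIT"
        return "MISS"


# Usage: after serving "apple", other words starting with "a" get a HIT.
affinity = LetterAffinity()
affinity.record("http://my-site.com/apple")
reply_same_letter = affinity.icp_reply("http://my-site.com/avocado")
reply_other_letter = affinity.icp_reply("http://my-site.com/banana")
```
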

> This is probably the wrong list for it, but does anyone know of a
> published example of replacing Zope's default icp-server response? Last
> time I looked, I couldn't find one.

Other than my blog post and zc.icp the only other ICP and Zope info I
know of is http://www.zope.org/Members/htrd/icp/intro (very old).

Note that I haven't used zc.icp in a long time because we decided to
move away from ICP and (eventually) move to a load balancer that
implements our desired policy instead.
--
Benji York
Senior Software Engineer
Zope Corporation