|
View:
New views
15 Messages
—
Rating Filter:
Alert me
|
|
|
performance on large systemsDoes anyone have any experience on large Kronolith installs? I'm
starting to run through imports of data, getting ready for a soft roll-out at the end of this month and I'm finding that there's some really slow SQL being executed. Loading the main kronolith page takes in the neighborhood of 4-6 seconds on a fairly beefy pgsql box and that will likely be a show-stopper. By large, I mean somewhere in the neighborhood of 500,000 calendars. With that amount of data, I'm thinking the regular expressions in queries such as this one are problematic: SELECT DISTINCT s.* FROM kronolith_shares s LEFT JOIN kronolith_shares_users AS u ON u.share_id = s.share_id LEFT JOIN kronolith_shares_groups AS g ON g.share_id = s.share_id WHERE s.share_owner = 'username' OR (CASE WHEN CAST(s.perm_creator AS VARCHAR) ~ '^-?[0-9]+$' THEN (CAST(s.perm_creator AS INTEGER) & 2) <> 0 ELSE FALSE END) OR (CASE WHEN CAST(s.perm_default AS VARCHAR) ~ '^-?[0-9]+$' THEN (CAST(s.perm_default AS INTEGER) & 2) <> 0 ELSE FALSE END) OR ( u.user_uid ='username' AND (CASE WHEN CAST(u.perm AS VARCHAR) ~ '^-?[0-9]+$' THEN (CAST(u.perm AS INTEGER) & 2) <> 0 ELSE FALSE END)) OR (g.group_uid IN (139204) AND (CASE WHEN CAST(g.perm AS VARCHAR) ~ '^-?[0-9]+$' THEN (CAST(g.perm AS INTEGER) & 2) <> 0 ELSE FALSE END)) ORDER BY s.attribute_name ASC; ...Note that with that 4-6 second wall time I've as yet only loaded about 120,000 calendars, so I do expect this problem to get worse. Why the regexp's? Can they be eliminated? Anything else I should look for performance-wise? EXPLAIN for the above query: Unique (cost=51593.93..54666.63 rows=122908 width=580) (actual time=1451.643..1451.660 rows=3 loops=1) -> Sort (cost=51593.93..51901.20 rows=122908 width=580) (actual time=1451.639..1451.643 rows=3 loops=1) Sort Key: s.attribute_name, s.share_id, s.share_name, s.share_owner, s.share_flags, s.perm_creator, s.perm_default, s.perm_guest, s.attribute_desc -> Merge Left Join (cost=0.00..8853.26 rows=122908 width=580) (actual time=0.074..1451.603 rows=3 loops=1) Merge Cond: (s.share_id = g.share_id) Filter: (((s.share_owner)::text = 'jmadden'::text) OR CASE WHEN (((s.perm_creator)::character varying)::text ~ '^-?[0-9]+$'::text) THEN (((s.perm_creator)::integer & 2) <> 0) ELSE false END OR CASE WHEN (((s.perm_default)::character varying)::text ~ '^-?[0-9]+$'::text) THEN (((s.perm_default)::integer & 2) <> 0) ELSE false END OR (((u.user_uid)::text = 'jmadden'::text) AND CASE WHEN (((u.perm)::character varying)::text ~ '^-?[0-9]+$'::text) THEN (((u.perm)::integer & 2) <> 0) ELSE false END) OR (((g.group_uid)::text = '139204'::text) AND CASE WHEN (((g.perm)::character varying)::text ~ '^-?[0-9]+$'::text) THEN (((g.perm)::integer & 2) <> 0) ELSE false END)) -> Merge Left Join (cost=0.00..6371.65 rows=122908 width=593) (actual time=0.053..737.928 rows=126130 loops=1) Merge Cond: (s.share_id = u.share_id) -> Index Scan using kronolith_shares_pkey on kronolith_shares s (cost=0.00..5020.51 rows=122908 width=580) (actual time=0.028..233.162 rows=123020 loops=1) -> Index Scan using kronolith_shares_users_share_id_idx on kronolith_shares_users u (cost=0.00..801.21 rows=19413 width=17) (actual time=0.014..37.860 rows=19409 loops=1) -> Index Scan using kronolith_shares_groups_share_id_idx on kronolith_shares_groups g (cost=0.00..925.62 rows=17839 width=16) (actual time=0.016..48.678 rows=20911 loops=1) Total runtime: 1451.774 ms Thanks, John -- John Madden Sr UNIX Systems Engineer Ivy Tech Community College of Indiana jmadden@... -- Kronolith mailing list - Join the hunt: http://horde.org/bounties/#kronolith Frequently Asked Questions: http://horde.org/faq/ To unsubscribe, mail: kronolith-unsubscribe@... |
|
|
Re: performance on large systemsQuoting John Madden <jmadden@...>:
> Does anyone have any experience on large Kronolith installs? I'm > starting to run through imports of data, getting ready for a soft > roll-out at the end of this month and I'm finding that there's some > really slow SQL being executed. Loading the main kronolith page > takes in the neighborhood of 4-6 seconds on a fairly beefy pgsql box > and that will likely be a show-stopper. > > By large, I mean somewhere in the neighborhood of 500,000 calendars. > With that amount of data, I'm thinking the regular expressions in > queries such as this one are problematic: For that number of calendars, with the current share drivers, you probably want to turn off calendar sharing entirely. People will still be able to share events via invitations, and we're working on a faster sharing driver. -chuck -- Kronolith mailing list - Join the hunt: http://horde.org/bounties/#kronolith Frequently Asked Questions: http://horde.org/faq/ To unsubscribe, mail: kronolith-unsubscribe@... |
|
|
Re: performance on large systems>> By large, I mean somewhere in the neighborhood of 500,000 calendars.
>> With that amount of data, I'm thinking the regular expressions in >> queries such as this one are problematic: > > For that number of calendars, with the current share drivers, you > probably want to turn off calendar sharing entirely. People will still > be able to share events via invitations, and we're working on a faster > sharing driver. Calendar sharing is a pretty critical component of what we're doing (and why we're implementing Horde to begin with). Is there anything I can do to help? John -- John Madden Sr UNIX Systems Engineer Ivy Tech Community College of Indiana jmadden@... -- Kronolith mailing list - Join the hunt: http://horde.org/bounties/#kronolith Frequently Asked Questions: http://horde.org/faq/ To unsubscribe, mail: kronolith-unsubscribe@... |
|
|
Re: performance on large systems>> For that number of calendars, with the current share drivers, you
>> probably want to turn off calendar sharing entirely. People will still >> be able to share events via invitations, and we're working on a faster >> sharing driver. > > Calendar sharing is a pretty critical component of what we're doing (and > why we're implementing Horde to begin with). Is there anything I can do > to help? Hmm. I disabled sharing and found that the performance issue is the same and the SQL statements being executed don't appear to be any different. Is sharing a red herring? The SQL I sent before seems clunky (at best) to me -- why the CASTs and regexps? Can those be eliminated? John -- John Madden Sr UNIX Systems Engineer Ivy Tech Community College of Indiana jmadden@... -- Kronolith mailing list - Join the hunt: http://horde.org/bounties/#kronolith Frequently Asked Questions: http://horde.org/faq/ To unsubscribe, mail: kronolith-unsubscribe@... |
|
|
Re: performance on large systemsQuoting John Madden <jmadden@...>:
>>> By large, I mean somewhere in the neighborhood of 500,000 >>> calendars. With that amount of data, I'm thinking the regular >>> expressions in queries such as this one are problematic: >> >> For that number of calendars, with the current share drivers, you >> probably want to turn off calendar sharing entirely. People will >> still be able to share events via invitations, and we're working >> on a faster sharing driver. > > Calendar sharing is a pretty critical component of what we're doing > (and why we're implementing Horde to begin with). Is there anything > I can do to help? This is the open bug: http://bugs.horde.org/ticket/7363 If you have optimized SQL queries for PostgreSQL, it would be great to get this sorted once and for all. -chuck -- Kronolith mailing list - Join the hunt: http://horde.org/bounties/#kronolith Frequently Asked Questions: http://horde.org/faq/ To unsubscribe, mail: kronolith-unsubscribe@... |
|
|
Re: performance on large systemsQuoting John Madden <jmadden@...>:
>>> For that number of calendars, with the current share drivers, you >>> probably want to turn off calendar sharing entirely. People will >>> still be able to share events via invitations, and we're working >>> on a faster sharing driver. >> >> Calendar sharing is a pretty critical component of what we're doing >> (and why we're implementing Horde to begin with). Is there >> anything I can do to help? > > Hmm. I disabled sharing and found that the performance issue is the > same and the SQL statements being executed don't appear to be any > different. Is sharing a red herring? The SQL I sent before seems > clunky (at best) to me -- why the CASTs and regexps? Can those be > eliminated? Possibly - I don't know enough about postgres to know if it can be done better. What config var did you disable? I meant disabling the shares driver entirely (setting it to null). -chuck -- Kronolith mailing list - Join the hunt: http://horde.org/bounties/#kronolith Frequently Asked Questions: http://horde.org/faq/ To unsubscribe, mail: kronolith-unsubscribe@... |
|
|
Re: performance on large systems> What config var did you disable? I meant disabling the shares driver
> entirely (setting it to null). I hadn't disabled the driver, just set the no_sharing variable. It can't be disabled from admin/setup but doing it to the config file manually did disable sharing and it is a HUGE improvement. Looking into the SQL. John -- John Madden Sr UNIX Systems Engineer Ivy Tech Community College of Indiana jmadden@... -- Kronolith mailing list - Join the hunt: http://horde.org/bounties/#kronolith Frequently Asked Questions: http://horde.org/faq/ To unsubscribe, mail: kronolith-unsubscribe@... |
|
|
Re: performance on large systems> This is the open bug:
> http://bugs.horde.org/ticket/7363 > > If you have optimized SQL queries for PostgreSQL, it would be great to > get this sorted once and for all. I looked at the code for this and it's really quite a mess -- a standard driver for sharing across all Horde components is a great idea until you need to optimize it for particular use cases. Worse, the bigger queries are running really big JOINs that take the bulk of the execution time. I'm thinking the driver will have to be written to somehow avoid those or at least reduce the number of rows involved before the JOIN happens. John -- John Madden Sr UNIX Systems Engineer Ivy Tech Community College of Indiana jmadden@... -- Kronolith mailing list - Join the hunt: http://horde.org/bounties/#kronolith Frequently Asked Questions: http://horde.org/faq/ To unsubscribe, mail: kronolith-unsubscribe@... |
|
|
Re: performance on large systems> This is the open bug:
> http://bugs.horde.org/ticket/7363 > > If you have optimized SQL queries for PostgreSQL, it would be great to > get this sorted once and for all. I'm thinking about shoe-horning this into the kronolith code: SELECT DISTINCT s.* FROM kronolith_shares s WHERE s.share_id IN (SELECT share_id FROM kronolith_shares_groups WHERE group_uid IN (SELECT group_uid FROM horde_groups_members WHERE user_uid='jmadden') AND perm & 2 <> 0) OR s.share_id IN (SELECT s.share_id FROM kronolith_shares s WHERE s.share_id IN (SELECT share_id FROM kronolith_shares_users WHERE user_uid='jmadden' AND perm & 2 <> 0)) OR s.share_id IN (SELECT s.share_id FROM kronolith_shares s WHERE s.share_owner='jmadden'); In my testing so far that's producing the same output as the stock query but it's doing so in ~65ms instead of ~1200ms. Obviously I'd have to fit that to match the string substitutions, but other than that, would the above work? John -- John Madden Sr UNIX Systems Engineer Ivy Tech Community College of Indiana jmadden@... -- Kronolith mailing list - Join the hunt: http://horde.org/bounties/#kronolith Frequently Asked Questions: http://horde.org/faq/ To unsubscribe, mail: kronolith-unsubscribe@... |
|
|
Re: performance on large systems> This is the open bug:
> http://bugs.horde.org/ticket/7363 > > If you have optimized SQL queries for PostgreSQL, it would be great to > get this sorted once and for all. Patch attached. Let me know if you feel it worthy for inclusion on the bug report and I'll clean it up to remove stuff rather than commenting it out, etc. Initial testing shows massive improvements (kronolith page loading the week view in ~1-2 seconds vs 5-7 seconds) with sharing turned on. It's a hefty change to the logic so it certainly needs someone with a better Horde background to look it over. It skips over the intelligence of Horde's DB agnostics but should work on anything that supports joins and sub-selects. John -- John Madden Sr UNIX Systems Engineer Ivy Tech Community College of Indiana jmadden@... --- sql.php.orig 2009-05-01 06:12:36.000000000 -0400 +++ sql.php 2009-11-17 17:37:41.000000000 -0500 @@ -3,7 +3,7 @@ * Horde_Share_sql:: provides the sql backend for the horde share * driver. * - * $Horde: framework/Share/Share/sql.php,v 1.1.2.56 2009/04/20 21:04:35 jan Exp $ + * $Horde: framework/Share/Share/sql.php,v 1.1.2.57 2009/08/12 13:52:00 jan Exp $ * * Copyright 2008-2009 The Horde Project (http://www.horde.org/) * @@ -669,33 +669,37 @@ $where .= 's.share_owner = ' . $this->_write_db->quote($userid); // (name == perm_creator and val & $perm) - $where .= ' OR (' . Horde_SQL::buildClause($this->_db, 's.perm_creator', '&', $perm) . ')'; +// $where .= ' OR (' . Horde_SQL::buildClause($this->_db, 's.perm_creator', '&', $perm) . ')'; +// IVT hack: group viewable +$where .= ' OR s.share_id IN(SELECT share_id FROM ' . $this->_table . '_groups WHERE group_uid IN (SELECT group_uid FROM horde_groups_members WHERE user_uid=\'' . $userid . '\') AND perm & ' . $perm . ' <> 0)'; // (name == perm_creator and val & $perm) - $where .= ' OR (' . Horde_SQL::buildClause($this->_db, 's.perm_default', '&', $perm) . ')'; +// $where .= ' OR (' . Horde_SQL::buildClause($this->_db, 's.perm_default', '&', $perm) . ')'; +// IVT hack: user assigned viewable +$where .= ' OR s.share_id IN(SELECT s.share_id FROM '. $this->_table . ' s WHERE s.share_id IN (SELECT share_id FROM kronolith_shares_users WHERE user_uid=\'' . $userid . '\' AND perm & ' . $perm . ' <> 0))'; // (name == perm_users and key == $userid and val & $perm) - $query .= ' LEFT JOIN ' . $this->_table . '_users AS u ON u.share_id = s.share_id'; - $where .= ' OR ( u.user_uid = ' . $this->_write_db->quote($userid) - . ' AND (' . Horde_SQL::buildClause($this->_db, 'u.perm', '&', $perm) . '))'; +// $query .= ' LEFT JOIN ' . $this->_table . '_users AS u ON u.share_id = s.share_id'; +// $where .= ' OR ( u.user_uid = ' . $this->_write_db->quote($userid) +// . ' AND (' . Horde_SQL::buildClause($this->_db, 'u.perm', '&', $perm) . '))'; // If the user has any group memberships, check for those also. - require_once 'Horde/Group.php'; - $group = &Group::singleton(); - $groups = $group->getGroupMemberships($userid, true); - if (!is_a($groups, 'PEAR_Error') && $groups) { - // (name == perm_groups and key in ($groups) and val & $perm) - $ids = array_keys($groups); - $group_ids = array(); - foreach ($ids as $id) { - $group_ids[] = $this->_db->quote($id); - } - $query .= ' LEFT JOIN ' . $this->_table . '_groups AS g ON g.share_id = s.share_id'; - $where .= ' OR (g.group_uid IN (' . implode(',', $group_ids) . ')' - . ' AND (' . Horde_SQL::buildClause($this->_db, 'g.perm', '&', $perm) . '))'; - } elseif (is_a($groups, 'PEAR_Error')) { - Horde::logMessage($groups, __FILE__, __LINE__, PEAR_LOG_ERR); - } +// require_once 'Horde/Group.php'; +// $group = &Group::singleton(); +// $groups = $group->getGroupMemberships($userid, true); +// if (!is_a($groups, 'PEAR_Error') && $groups) { +// // (name == perm_groups and key in ($groups) and val & $perm) +// $ids = array_keys($groups); +// $group_ids = array(); +// foreach ($ids as $id) { +// $group_ids[] = $this->_db->quote($id); +// } +// $query .= ' LEFT JOIN ' . $this->_table . '_groups AS g ON g.share_id = s.share_id'; +// $where .= ' OR (g.group_uid IN (' . implode(',', $group_ids) . ')' +// . ' AND (' . Horde_SQL::buildClause($this->_db, 'g.perm', '&', $perm) . '))'; +// } elseif (is_a($groups, 'PEAR_Error')) { +// Horde::logMessage($groups, __FILE__, __LINE__, PEAR_LOG_ERR); +// } } else { $where = '(' . Horde_SQL::buildClause($this->_db, 's.perm_guest', '&', $perm) . ')'; } @@ -751,7 +755,7 @@ require_once 'MDB2.php'; $params = $this->_params; unset($params['charset']); - $this->_write_db = MDB2::factory($params); + $this->_write_db = &MDB2::factory($params); if (is_a($this->_write_db, 'PEAR_Error')) { Horde::fatal($this->_write_db, __FILE__, __LINE__); } -- Kronolith mailing list - Join the hunt: http://horde.org/bounties/#kronolith Frequently Asked Questions: http://horde.org/faq/ To unsubscribe, mail: kronolith-unsubscribe@... |
|
|
Re: performance on large systemsZitat von John Madden <jmadden@...>:
>> This is the open bug: >> http://bugs.horde.org/ticket/7363 >> >> If you have optimized SQL queries for PostgreSQL, it would be great >> to get this sorted once and for all. > > Patch attached. Let me know if you feel it worthy for inclusion on > the bug report and I'll clean it up to remove stuff rather than > commenting it out, etc. Initial testing shows massive improvements > (kronolith page loading the week view in ~1-2 seconds vs 5-7 > seconds) with sharing turned on. It's a hefty change to the logic > so it certainly needs someone with a better Horde background to look > it over. It skips over the intelligence of Horde's DB agnostics but > should work on anything that supports joins and sub-selects. A few remarks: - Please make this patch against the latest version from CVS, there seem to be a revision number mismatch in the file header. - Make this configurable, at least in the driver code, so that people that still have databases that don't support sub-queries (shouldn't be many) can turn the old behavior on. - You can't JOIN with horde_groups, because the groups could come from any group driver, so you have to keep the old way of getting group information. Jan. -- Do you need professional PHP or Horde consulting? http://horde.org/consulting/ -- Kronolith mailing list - Join the hunt: http://horde.org/bounties/#kronolith Frequently Asked Questions: http://horde.org/faq/ To unsubscribe, mail: kronolith-unsubscribe@... |
|
|
Re: performance on large systems> - Make this configurable, at least in the driver code, so that people
> that still have databases that don't support sub-queries (shouldn't be > many) can turn the old behavior on. If I have the time to circle back on this, I'll give it a go. > - You can't JOIN with horde_groups, because the groups could come from > any group driver, so you have to keep the old way of getting group > information. Same as above, although maybe a bit easier. If the groups are in LDAP though, for example, would any of this be relevant? You can't JOIN against that anyway. In our case, we use SQL for everything, so this is at least a starting point for whoever takes the next step. I don't mean for the patch to replace the sharing driver itself, but the concept here is scalability, which I think it certainly demonstrates for anyone more familiar with Horde internals. John -- John Madden Sr UNIX Systems Engineer Ivy Tech Community College of Indiana jmadden@... -- Kronolith mailing list - Join the hunt: http://horde.org/bounties/#kronolith Frequently Asked Questions: http://horde.org/faq/ To unsubscribe, mail: kronolith-unsubscribe@... |
|
|
Re: performance on large systemsZitat von John Madden <jmadden@...>:
>> - Make this configurable, at least in the driver code, so that >> people that still have databases that don't support sub-queries >> (shouldn't be many) can turn the old behavior on. > > If I have the time to circle back on this, I'll give it a go. > >> - You can't JOIN with horde_groups, because the groups could come >> from any group driver, so you have to keep the old way of getting >> group information. > > Same as above, although maybe a bit easier. If the groups are in > LDAP though, for example, would any of this be relevant? You can't > JOIN against that anyway. Exactly, that's what I'm saying. > In our case, we use SQL for everything, so this is at least a > starting point for whoever takes the next step. I don't mean for > the patch to replace the sharing driver itself, but the concept here > is scalability, which I think it certainly demonstrates for anyone > more familiar with Horde internals. > > John > > > > > -- > John Madden > Sr UNIX Systems Engineer > Ivy Tech Community College of Indiana > jmadden@... > Jan. -- Do you need professional PHP or Horde consulting? http://horde.org/consulting/ -- Kronolith mailing list - Join the hunt: http://horde.org/bounties/#kronolith Frequently Asked Questions: http://horde.org/faq/ To unsubscribe, mail: kronolith-unsubscribe@... |
|
|
Re: performance on large systemsQuoting John Madden <jmadden@...>:
>> This is the open bug: >> http://bugs.horde.org/ticket/7363 >> >> If you have optimized SQL queries for PostgreSQL, it would be great >> to get this sorted once and for all. > > I'm thinking about shoe-horning this into the kronolith code: > > SELECT DISTINCT s.* FROM kronolith_shares s WHERE s.share_id IN > > (SELECT share_id FROM kronolith_shares_groups WHERE group_uid IN > (SELECT group_uid FROM horde_groups_members WHERE > user_uid='jmadden') AND perm & 2 <> 0) > > OR s.share_id IN > > (SELECT s.share_id FROM kronolith_shares s WHERE s.share_id IN > (SELECT share_id FROM kronolith_shares_users WHERE > user_uid='jmadden' AND perm & 2 <> 0)) > > OR s.share_id IN > > (SELECT s.share_id FROM kronolith_shares s WHERE s.share_owner='jmadden'); > > In my testing so far that's producing the same output as the stock > query but it's doing so in ~65ms instead of ~1200ms. Obviously I'd > have to fit that to match the string substitutions, but other than > that, would the above work? You said you're using Postgres, right? If that's fast for postgres that's great; I haven't tried it yet (I'll try your patch later), but the subqueries seem like they'd perform poorly with MySQL... -chuck -- Kronolith mailing list - Join the hunt: http://horde.org/bounties/#kronolith Frequently Asked Questions: http://horde.org/faq/ To unsubscribe, mail: kronolith-unsubscribe@... |
|
|
Re: performance on large systems> > In my testing so far that's producing the same output as the stock
> > query but it's doing so in ~65ms instead of ~1200ms. Obviously I'd > > have to fit that to match the string substitutions, but other than > > that, would the above work? > You said you're using Postgres, right? If that's fast for postgres > that's great; I haven't tried it yet (I'll try your patch later), but > the subqueries seem like they'd perform poorly with MySQL... Yes, Postgresql only in our shop, mysql avoided whenever possible. :) I really dislike subqueries but they're faster than join's in this case, at least. When dealing with such a huge number of rows, you really don't want to be doing that in a join. The sub-select in this case reduces the amount of comparison (IN ()) to a very minimal set. I think this would out-perform joins on just about any DB once the data set is sufficiently large, but I'm a hack when it comes to this stuff. John -- John Madden Sr UNIX Systems Engineer Ivy Tech Community College of Indiana jmadden@... -- Kronolith mailing list - Join the hunt: http://horde.org/bounties/#kronolith Frequently Asked Questions: http://horde.org/faq/ To unsubscribe, mail: kronolith-unsubscribe@... |
| Free embeddable forum powered by Nabble | Forum Help |