|
View:
New views
6 Messages
—
Rating Filter:
Alert me
|
|
|
Badly performing WHERE caluseHi,
We have a WHERE clause: WHERE 2>1 AND 2>1 AND DateTimestamp >= ? AND DateTimestamp <= ? AND ORDER BY DateTimestamp DESC which reasonably takes 2ms to execute on a table that has 357254 rows, and does so by selecting the most appropriate index: Index Scan ResultSet for TBLEVENTS using index TBLEVENTS_DATETIMESTAMP_DESC at serializable isolation level using share row locking chosen by the optimizer However, when we change the WHERE clause to: WHERE 1=1 AND 1=1 AND DateTimestamp >= ? AND DateTimestamp <= ? AND ORDER BY DateTimestamp DESC the execution time increases to 6000ms, and appears to do so because the query planner hasn't selected the most appropriate index: Index Scan ResultSet for TBLEVENTS using index TBLEVENTS_DATETIMESTAMP_ASC at serializable isolation level using share row locking chosen by the optimizer The indexes are specified as follows: CREATE INDEX tblEventsDateTimestampASC ON tblEvents (DateTimestamp ASC); CREATE INDEX tblEventsDateTimestampDESC ON tblEvents (DateTimestamp DESC); Do people agree that this appears to be a bug with the query planner, as the selection of the most appropriate index should not be influenced by static conditions such as 1=1 or 2>1? Regards, Gavin |
|
|
Re: Badly performing WHERE caluse> Do people agree that this appears to be a bug with the query planner, as the
> selection of the most appropriate index should not be influenced by static > conditions such as 1=1 or 2>1? I don't know of any reason why the 1=1 versus 2>1 should have affected the query plan chosen. If you can provide a reproducible test case that demonstrates it, I'd be very interested in pursuing this further. thanks, bryan |
|
|
Re: Badly performing WHERE caluseOne could argue that in the first case (2>1), the optimizer is free to choose whatever index it wants because the query can provably return no rows. However, in the second case (1=1), were the DateTimestamp condition must be evaluated, it would appear that the optimizer indeed selected the wrong index.
However, I don't think you can point to 2ms vs. 6000ms as proof of a bad query plan. In the first case, the optimizer can prove (to itself) that the query can return no rows (because of the 2>1 condition along with all AND predicates) -- and therefore it probably doesn't do any index scan at all. Therefore, 2ms.
In the second case, because the 1=1 condition is satisfied, Derby must *actually* do the index scan as per plan. Therefore, 6000ms. Whether using the DESC index actually provides significant improvement over ASC remains to be seen, as the comparison currently isn't apples-to-apples.
I would be interested in seeing the result of the DESC vs. the ASC index in the (1=1) scenario. You can override the optimizer with a hint, like so: SELECT * FROM table --DERBY-PROPERTIES index=TBLEVENTS_DATETIMESTAMP_DESC
WHERE 1=1 AND 1=1 AND DateTimestamp >= ? AND DateTimestamp <= ? AND ORDER BY DateTimestamp DESC Note, the --DERBY-PROPERTIES must come at the end of a literal line (i.e. there must be a linefeed after that cause) otherwise everything after that will be taken as a comment.
Let us know the result. Either way it may be considered a bug in the optimizer, but depending on the number of rows returned, may not make a significant difference in performance. -Brett
On Wed, Sep 23, 2009 at 11:18 PM, Gavin Matthews <gjm@...> wrote:
|
|
|
Re: Badly performing WHERE caluseBrett Wooldridge wrote:
> One could argue that in the first case (2>1), the optimizer is free to > choose whatever index it wants because the query can provably return > no rows. However, in the second case (1=1), were the DateTimestamp > condition must be evaluated, it would appear that the optimizer indeed > selected the wrong index. > > Ummm, last time I checked, 2 was greater than 1. :) |
|
|
Re: Badly performing WHERE caluseThat'll teach me to post too early in the morning (I'm in Tokyo) and before coffee. :-)
On Thu, Sep 24, 2009 at 10:19 AM, Matt Doran <matt.doran@...> wrote:
|
|
|
Re: Badly performing WHERE caluseBryan Pendleton <bpendleton@...> writes:
>> Do people agree that this appears to be a bug with the query planner, as the >> selection of the most appropriate index should not be influenced by static >> conditions such as 1=1 or 2>1? > > I don't know of any reason why the 1=1 versus 2>1 should have affected the > query plan chosen. Sorry for the late response. I just came across some code in the engine that made me remember this thread. It looks like Derby doesn't treat 1=1 and 2>1 as boolean constants during optimization, so it takes the selectivity from the = operator and the > operator, respectively. The = operator has selectivity 0.1, and > has selectivity 0.33. This means that the optimizer thinks 1=1 will match 10% of the rows and 2>1 will match 33% of the rows. This is apparently enough to make the optimizer choose another plan in this case. Of course, both of the predicates will match 100% of the rows, so the the information that's given to the optimizer is inaccurate in any case. Logged here: https://issues.apache.org/jira/browse/DERBY-4416 -- Knut Anders |
| Free embeddable forum powered by Nabble | Forum Help |