|
View:
New views
3 Messages
—
Rating Filter:
Alert me
|
|
|
Default ORDER BYHi Guys,
This is regarding a recent change to the default order that a model uses. I asked Paul to make that change and he did in the unfounded branch. (r4120) Thanks, Paul. I wanted to move some of the discussion around it to the mailing list so others will understand what its about. The issue was that Solar_Sql_Model was was always setting a default order in the $this->_order property. If you did not set an order, the model would automatically set the order to the primary key. Unfortunately, this make it so there is no way to actually say "no order." Where this comes into play is that when mysql uses indexes to optimize a query, it normally only chooses one index. So, if you have both an ORDER BY and WHERE clause, mysql (at least v5), is forced to choose between optimizing for the ORDER BY or optimizing for the WHERE clause. If it doesn't know that the query will return more than one record, it will optimize for the ORDER BY. This results in cases where it might do an index scan on millions of records to find a handful of records because of the order by. If you leave the order by off, it can use a different index and optimize for WHERE clause. In some cases for us this was resulting in an order of magnitude difference in query speed. So, its generally bad to include an ORDER BY clause if you don't actually need the results to be in order. The changes in r4120 were intended to address this by not specifying a default order for a table, in effect making the default order "no order." I've tested this and it works great. This solution is also way way better than the solution I implemented in the grid branch. Unfortunately, the related object still has a _setOrder which injects an order into any query that includes an eager fetch. Consider an invoice model with the following relationship $this->_hasMany('invoice_items'); If you issue the following query $items = $invoices->fetchAll('eager'=>'items'); This will issue two queries, one to fetch the invoices and another to fetch the items. However the items query will still have a default ORDER BY clause generated. That is because of this definition in Solar_Sql_Model_Related: protected function _setOrder($opts) { if (empty($opts['order'])) { // default to the foreign primary key $this->order = array("{$this->foreign_alias}.{$this- >_foreign_model->primary_col}"); } else { $this->order = (array) $opts['order']; } } Instead, may I suggest the following for Solar_Sql_Model_Related protected function _setOrder($opts) { if (empty($opts['order'])) { // default to the foreign primary key $this->order = array(); } else { $this->order = (array) $opts['order']; } } Which initializes order to a value which is valid but does not add extra ordering to the query fetching the related items. Also consider what happens if the invoice model has a toOne relationship and we set an explicit order $this->_hasOne('billing_address', 'order' => 'street'); If you issue this fetch $items = $invoices->fetchAll('eager'=>'billing_address'); It will generate a single query, using a left join to fetch the addresses, but the ORDER BY will include the order from the relationship, thus ordering the native result set. This doesn't happen with hasMany relationships. What does it really mean to specify an 'order' option on a toOne relationship? There is one record, thus, order can't be important. I'm thinking that for toOne relationships, using the 'order' option is probably at best "undefined." Because of the impact on the parent query, maybe 'order' should be ignored for toOne() or maybe result in an exception if used? I kinda lean toward an exception. You can always specify the 'order' in the fetch() method parameters. What are your thoughts? Best Regards, Jeff _______________________________________________ Solar-talk mailing list Solar-talk@... http://mailman-mail5.webfaction.com/listinfo/solar-talk |
|
|
Re: Default ORDER BYOn Oct 2, 2009, at 13:58 , Jeff Moore wrote:
> Unfortunately, the related object still has a _setOrder which injects > an order into any query that includes an eager fetch. ... > Instead, may I suggest the following for Solar_Sql_Model_Related > > protected function _setOrder($opts) > { > if (empty($opts['order'])) { > // default to the foreign primary key > $this->order = array(); > } else { > $this->order = (array) $opts['order']; > } > } Sounds fine to me -- if you could enter it as a ticket in Trac so I don't forget, that would be helpful. > What does it really mean to specify an 'order' option on a toOne > relationship? There is one record, thus, order can't be important. Sure it can; e.g., if you want the relationship to always return the single most recent 'foo' by date, even when there are many matching related records, the order becomes important. E.g.: // all related foos, in whatever order $this->_hasMany('foos'); // just the most-recent foo $this->_hasOne('latest_foo', array( 'class' => 'Foos', 'order' => 'date DESC', ); The follow-on aspects of doing that are a separate discussion; the point is that yes, order *can* be important in to-one relationships. -- Paul M. Jones http://paul-m-jones.com/ _______________________________________________ Solar-talk mailing list Solar-talk@... http://mailman-mail5.webfaction.com/listinfo/solar-talk |
|
|
Re: Default ORDER BYOn Oct 2, 2009, at 2:29 PM, Paul M Jones wrote: > Sounds fine to me -- if you could enter it as a ticket in Trac so I > don't forget, that would be helpful. Ticketed. (#212) >> What does it really mean to specify an 'order' option on a toOne >> relationship? There is one record, thus, order can't be important. > > Sure it can; e.g., if you want the relationship to always return the > single most recent 'foo' by date, even when there are many matching > related records, the order becomes important. E.g.: > > // all related foos, in whatever order > $this->_hasMany('foos'); > > // just the most-recent foo > $this->_hasOne('latest_foo', array( > 'class' => 'Foos', > 'order' => 'date DESC', > ); > > The follow-on aspects of doing that are a separate discussion; the > point is that yes, order *can* be important in to-one relationships. Paul, In this case latest_foo is defined as a _hasOne but will actually return many results. The query that will be issued if you do this: $invoices->fetchAll(array('eager'=>'latest_foo')); Will be something like SELECT * FROM invoices LEFT JOIN foos ON foos.invoice_id = invoices.id Which will end up returning duplicate invoice records in the top level results if you have multiple foos. Is this what you meant by follow-on aspects? Philosophically, I think that the relationship definition should be stable. What I mean by that is that including it in an eager fetch should not change the native results in a query. And traversing the relationship without eager fetching it should return the the same results as traversing after eager fetching it. I think this stability requirement is necessary to be able to write consistent business logic into the models. I think its OK for the top level fetch() to specify conditions on the eager clause which would violate this principle. However, if that is the case, then objects should not be loaded for that relationship. This would force a second query if the relationship is traversed. I think the join_flag eager option controls this. (?) For example, going back to invoices and items: $result = $invoices->fetchAll(eager('items'=>array('where'=>'status = "active"'))); This would fetch all the invoices with at least one active item. However, the items property of each invoice should contain the invoices described in the relationship (active and all others), not the invoices described in the fetch. I think this stability is critical for maintaining consistency of business logic in the models. I know we've talked about these before, but maybe we should document some of these guidelines so we can preserve them going forward and so people know what to expect. So back to the 'order' option. I think that 'order' makes sense in the context of a fetch option. It also makes sense on a hasMany() relationship. But, I don't see how it can have meaning on a toOne() relationship, at least given the current capabilities of the relationships. I don't think we currently have the capability to define a 'latest_foo' type relationship. There are 11 base relationship options. This doesn't include a bunch of through options. There are 11 eager options with some overlap and 16 fetch options with some overlap. Those are alot of interacting options. Philosophically, I think it would be helpful to throw a helpful exception or two for combinations that don't make much sense. We can always take them out when we support more cases. Best Regards, Jeff _______________________________________________ Solar-talk mailing list Solar-talk@... http://mailman-mail5.webfaction.com/listinfo/solar-talk |
| Free embeddable forum powered by Nabble | Forum Help |