|
|
|
|
non-relational DB
Hi everyone,
this rather long mail contains a status report, instructions for contributors, and implementation notes for Django core developers. If you only want to know the status you can stop after the first section. If you want to contribute, I hope this provides a good starting point into our port.

---------------------------------------
Status report

We've got pretty far with our App Engine port. For example, the sessions db and cached_db backends both work unmodified on App Engine. You can also order results and use basic filter()s as supported by the low-level App Engine API (gt, gte, lt, lte, exact, pk__in). You can also use QuerySet.order_by(), .delete(), .count(), Model.save(), .delete().

This is our second porting attempt (it's not in the old repository). Our first attempt had too many conflicts with the multi-db branch (esp. the one on github). This time we just hacked everything together. We didn't concentrate on cleaning up the current backend API. We've also disabled SQL support.

The next step is to move all the hacks into a nice backend API (at the same time making sure that it won't conflict with multi-db) and re-enable SQL support. That's where we need help. Also, if you want to work on SimpleDB support this is the right time to join. The App Engine backend itself can be handled by Thomas Wanschik and me - contributions in this area are not absolutely necessary, so please concentrate on the cleanup if you want to help.

Now to the details (for those who want to contribute).

---------------------------------------
Introducing QueryGlue

The old Django code was distributed across three layers:
* django.db.models.query.QuerySet
* django.db.models.sql.query.Query (from now on just sql.Query)
* backend

When a new QuerySet is instantiated (e.g. by calling Model.objects.all()) it asks the backend for its Query class and then creates an instance of that class. By default, this class is sql.Query. Only the Oracle backend has its own Query which subclasses sql.Query.
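The "ask the backend for its Query class" step can be illustrated with a small stand-alone sketch. It does not import Django. The names BaseQuery and query_class mirror the connection.ops.query_class(BaseQuery) call mentioned later in this thread; the uses_custom_query_class flag, OracleLikeOperations, and resolve_query_class are stand-ins invented for illustration and should not be read as the actual Django code.

```python
class BaseQuery(object):
    """Stand-in for the generic SQL query class."""

    def as_sql(self):
        return "SELECT ..."


class DatabaseOperations(object):
    """Stand-in for a backend's ops object; most backends keep the default."""

    uses_custom_query_class = False

    def query_class(self, default_query_class):
        return default_query_class


class OracleLikeOperations(DatabaseOperations):
    """Hypothetical backend that wraps the default class with its own tweaks."""

    uses_custom_query_class = True

    def query_class(self, default_query_class):
        class OracleLikeQuery(default_query_class):
            def as_sql(self):
                # Reuse the generic SQL and append backend-specific details.
                return default_query_class.as_sql(self) + " /* backend tweaks */"

        return OracleLikeQuery


def resolve_query_class(ops):
    """Pick the Query class the way the module-level hook is described."""
    if ops.uses_custom_query_class:
        return ops.query_class(BaseQuery)
    return BaseQuery
```

With the default operations object the generic class comes back unchanged; with the Oracle-like one the result is a subclass, which is why the subquery classes end up depending on whichever class the backend returns.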
Normally, sql.Query builds the query on-the-fly. Whenever you call QuerySet.filter(<filters>) the filters get put into a Q(<filters>) and passed to sql.Query.add_q( Q(...) ). This function iterates over all filter rules in the Q object and calls sql.Query.add_filter() for each individual filter. This in turn directly modifies sql.Query.where which is a tree structure that represents the WHERE clause. It already contains information about the JOIN type for each filter (INNER, OUTER), the fields that get referenced by the filter, the column and table aliases, and so on. It already does a lot of what we need for non-relational backends, but it's too SQL-specific.

The current behavior is also a problem for multi-db because it makes too many assumptions about the storage format of the filter rules. The user could call QuerySet.using(other_connection) anytime, so QuerySet shouldn't really work with the low-level sql.Query class before it actually executes the query.

We've solved this problem by introducing a backend-independent query representation between QuerySet and the low-level Query (sql.Query, appengine.Query, etc.). This representation is called QueryGlue. You can find it in django.db.models.queryglue. It provides almost exactly the same "public" API as sql.Query (so it can easily be integrated with QuerySet). Each filter() call gets translated into a tree structure that is inspired by sql.Query.where, but it doesn't contain any information about the kind of JOIN. Instead, it stores important high-level information like whether we're filtering on a primary key, which columns and tables are involved in a JOIN, etc.

---------------------------------------
The low-level Query class

Once the query needs to be executed (e.g., by calling .count() or by iterating over the query) the QueryGlue instance creates a new low-level Query instance which gets the QueryGlue as its only parameter.
Currently, the low-level Query class is hard-coded to GAEQuery/BaseQuery in django.db.models.nonrelational.query.

Then, QueryGlue calls the respective execution function (results_iter(), count(), etc.) on the instantiated low-level Query. Our GAEQuery can now iterate over all filters in QueryGlue.filters and convert them to an App Engine Query object.

---------------------------------------
subqueries

Instead of working with subquery classes we've added delete_bulk(), insert(), etc. directly to QueryGlue and the low-level Query class. If sql.Query really needs the current design those functions can still be routed to the respective subquery instance, but on App Engine it's easier to handle those operations in a separate function.

---------------------------------------
The cleanup

We made a few not-so-clean changes to Django itself. I've attached a diff, so contributors can easily find all the changes we made to Django (they're also commented with TODO and GAE):

............................
* disabled multi-table inheritance;
this could be emulated as described on the Django wiki
http://code.djangoproject.com/wiki/NonSqlBackends

See
django/db/models/base.py: line 147

............................
* disabled deletion of related objects in Model.delete() and QuerySet.delete()

See
django/db/models/query.py: lines 1036, 1065

............................
* replaced sql.subqueries.*Query usage with simple functions on a single Query class (insert_or_update() instead of InsertQuery and UpdateQuery)

See
django/db/models/query.py: lines 1058, 1088

............................
* commented out distinction between insert and update in Model.save_base() because there's no such concept in App Engine (and SimpleDB, AFAIK)

See
django/db/models/base.py: lines 470, 475

............................
The long-term goal is of course to clean this up and move most of these changes into the backend API.

---------------------------------------
Common non-relational features

The plan is to add support for simple joins and select_related to all non-relational backends by either subclassing the backend's Query class on-the-fly with a JoinQuery or by supporting something like query pre-processors which can be added above the low-level Query class. We haven't thought about the details yet, but I hope you get the idea.

---------------------------------------
SQL layer details:

The ugly detail is that sql.subqueries contains specialized query classes like InsertQuery, DeleteQuery, etc. which subclass the backend's Query class. This means that currently, the module loading process jumps around:
* sql/__init__.py imports sql.query and then sql.subqueries
* sql.query creates the base Query class
* after that, sql.query allows the backend to override the Query class
* sql.subqueries creates subclasses which derive from Query

In multi-db in SVN this is uglier because the subquery classes no longer have one single sql.Query base class from which to derive. There can be multiple backends, each with their own sql.Query class, so the subqueries have to be maintained by the backend (with some multi-inheritance magic and manual caching of the custom subclasses).

In multi-db on github this is much cleaner: The backends can no longer override sql.Query. Instead, there's an SQLCompiler class which can be overridden by the backend to take care of backend-specific details. sql.Query stores a slightly more abstract representation of the query. This multi-db branch moves a lot of code around. That's why we should try to keep as much code as possible where it is (at least until the branch gets merged into trunk).
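The split between a backend-independent QueryGlue and a low-level Query that is only created at execution time, as described in the QueryGlue sections of this mail, can be sketched roughly as follows. This is a simplified illustration, not the code from the repository: the flat filter tuples, add_filter(), add_ordering(), and the InMemoryQuery backend are assumptions made for the example (the real QueryGlue stores a tree and targets App Engine). Only the names QueryGlue, filters, results_iter(), and count() come from the mail.

```python
import operator


class QueryGlue(object):
    """Backend-independent record of what the QuerySet asked for (sketch)."""

    def __init__(self, model, query_class):
        self.model = model
        self.query_class = query_class  # low-level Query class, chosen late
        self.filters = []               # (column, lookup, value, is_pk) tuples
        self.ordering = []

    def add_filter(self, column, lookup, value, is_pk=False):
        self.filters.append((column, lookup, value, is_pk))

    def add_ordering(self, *columns):
        self.ordering.extend(columns)

    def _low_level(self):
        # The low-level Query gets the QueryGlue as its only parameter.
        return self.query_class(self)

    def results_iter(self):
        return self._low_level().results_iter()

    def count(self):
        return self._low_level().count()


class InMemoryQuery(object):
    """Hypothetical low-level Query that runs against a list of dicts."""

    OPERATORS = {
        'exact': operator.eq,
        'gt': operator.gt,
        'gte': operator.ge,
        'lt': operator.lt,
        'lte': operator.le,
    }
    rows = []  # fake datastore shared by all instances

    def __init__(self, glue):
        self.glue = glue

    def _matches(self, row):
        return all(self.OPERATORS[lookup](row[column], value)
                   for column, lookup, value, _is_pk in self.glue.filters)

    def results_iter(self):
        matched = [row for row in self.rows if self._matches(row)]
        # Stable sorts applied from the last key to the first give a
        # multi-column ordering.
        for column in reversed(self.glue.ordering):
            matched.sort(key=lambda row: row[column])
        return iter(matched)

    def count(self):
        return sum(1 for _ in self.results_iter())


def demo():
    """Build a query without touching the backend, then execute it."""
    InMemoryQuery.rows = [
        {'id': 1, 'age': 30},
        {'id': 2, 'age': 17},
        {'id': 3, 'age': 45},
    ]
    glue = QueryGlue(model=None, query_class=InMemoryQuery)
    glue.add_filter('age', 'gte', 18)
    glue.add_ordering('age')
    return [row['id'] for row in glue.results_iter()], glue.count()
```

Because the glue object only records filters and ordering, the low-level class could in principle be swapped (for example after a using() call) right up to the moment the query is executed.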
---------------------------------------
The source

The test project and our unit tests are here:
http://bitbucket.org/wkornewald/django-testapp/

The modified Django source and the backend are here:
http://bitbucket.org/wkornewald/django-nonrel-hacked/

We've patched the trunk branch. Unfortunately, the branches are unnamed (I converted the git mirror because the hg mirror's branches on bitbucket are broken). You should be able to find the right branch with "hg heads" and "hg up -C" to it. Normally our branch should be at tip, anyway, so you don't need to do anything.

When merging you need to find the trunk branch with "hg heads" and "hg merge <revnum>" with the trunk head. If this becomes a huge problem we'll switch to the django-trunk mirror, but I wanted to keep the option to switch to Alex' multidb branch if that's better, so I chose this sub-optimal Django mirroring solution.

---------------------------------------
Task management

Our tasks are managed in a Google Spreadsheet:
https://spreadsheets.google.com/ccc?key=0AnLqunL-SCJJdE1fM0NzY1JQTXJuZGdEa0huODVfRHc&hl=en

The task list isn't complete yet. We're working on that.

Bye,
Waldemar Kornewald

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Django developers" group.
To post to this group, send email to django-developers@...
To unsubscribe from this group, send email to django-developers+unsubscribe@...
For more options, visit this group at http://groups.google.com/group/django-developers?hl=en
-~----------~----~----~----~------~----~------~--~---

diff -r 6e733173d200 django/db/models/base.py
--- a/django/db/models/base.py	Sat Oct 17 17:32:25 2009 +0000
+++ b/django/db/models/base.py	Thu Oct 22 11:28:17 2009 +0200
@@ -143,6 +143,11 @@
                                         (field.name, name, base.__name__))
             if not base._meta.abstract:
                 # Concrete classes...
+
+                # TODO: GAE: use polymodel instead
+                if True:
+                    raise TypeError("Multi-table inheritance isn't yet supported on App Engine")
+
                 while base._meta.proxy:
                     # Skip over a proxy class to the "real" base it proxies.
                     base = base._meta.proxy_for_model
@@ -462,20 +467,23 @@
             # First, try an UPDATE. If that doesn't update anything, do an INSERT.
             pk_val = self._get_pk_val(meta)
             pk_set = pk_val is not None
-            record_exists = True
+            # TODO: GAE: Clean up. Setting record_exists to False fakes that we don't
+            # distinguish between insert and update.
+#            record_exists = True
+            record_exists = False
             manager = cls._base_manager
-            if pk_set:
-                # Determine whether a record with the primary key already exists.
-                if (force_update or (not force_insert and
-                        manager.filter(pk=pk_val).extra(select={'a': 1}).values('a').order_by())):
-                    # It does already exist, so do an UPDATE.
-                    if force_update or non_pks:
-                        values = [(f, None, (raw and getattr(self, f.attname) or f.pre_save(self, False))) for f in non_pks]
-                        rows = manager.filter(pk=pk_val)._update(values)
-                        if force_update and not rows:
-                            raise DatabaseError("Forced update did not affect any rows.")
-                else:
-                    record_exists = False
+#            if pk_set:
+#                # Determine whether a record with the primary key already exists.
+#                if (force_update or (not force_insert and
+#                        manager.filter(pk=pk_val).extra(select={'a': 1}).values('a').order_by())):
+#                    # It does already exist, so do an UPDATE.
+#                    if force_update or non_pks:
+#                        values = [(f, None, (raw and getattr(self, f.attname) or f.pre_save(self, False))) for f in non_pks]
+#                        rows = manager.filter(pk=pk_val)._update(values)
+#                        if force_update and not rows:
+#                            raise DatabaseError("Forced update did not affect any rows.")
+#                else:
+#                    record_exists = False
             if not pk_set or not record_exists:
                 if not pk_set:
                     if force_update:
@@ -519,6 +527,8 @@
         pk_val = self._get_pk_val()
         if seen_objs.add(self.__class__, pk_val, self, parent, nullable):
             return
+        # TODO: GAE support deleting related objects in background task
+        return
 
         for related in self._meta.get_all_related_objects():
             rel_opts_name = related.get_accessor_name()
diff -r 6e733173d200 django/db/models/query.py
--- a/django/db/models/query.py	Sat Oct 17 17:32:25 2009 +0000
+++ b/django/db/models/query.py	Thu Oct 22 11:28:17 2009 +0200
@@ -14,6 +14,7 @@
 from django.db.models.fields import DateField
 from django.db.models.query_utils import Q, select_related_descend, CollectedObjects, CyclicDependency, deferred_class_factory
 from django.db.models import signals, sql
+from django.db.models.queryglue import QueryGlue
 
 
 # Used to control how many objects are worked with at once in some cases (e.g.
@@ -33,7 +34,7 @@
     """
     def __init__(self, model=None, query=None):
         self.model = model
-        self.query = query or sql.Query(self.model, connection)
+        self.query = query or QueryGlue(self.model, connection)
         self._result_cache = None
         self._iter = None
         self._sticky_filter = False
@@ -1032,20 +1033,21 @@
         for pk_val, instance in items:
             signals.pre_delete.send(sender=cls, instance=instance)
 
-        pk_list = [pk for pk,instance in items]
-        del_query = sql.DeleteQuery(cls, connection)
-        del_query.delete_batch_related(pk_list)
-
-        update_query = sql.UpdateQuery(cls, connection)
-        for field, model in cls._meta.get_fields_with_model():
-            if (field.rel and field.null and field.rel.to in seen_objs and
-                    filter(lambda f: f.column == field.rel.get_related_field().column,
-                    field.rel.to._meta.fields)):
-                if model:
-                    sql.UpdateQuery(model, connection).clear_related(field,
-                            pk_list)
-                else:
-                    update_query.clear_related(field, pk_list)
+        # TODO: GAE: do this in a background task
+#        pk_list = [pk for pk,instance in items]
+#        del_query = sql.DeleteQuery(cls, connection)
+#        del_query.delete_batch_related(pk_list)
+#
+#        update_query = sql.UpdateQuery(cls, connection)
+#        for field, model in cls._meta.get_fields_with_model():
+#            if (field.rel and field.null and field.rel.to in seen_objs and
+#                    filter(lambda f: f.column == field.rel.get_related_field().column,
+#                    field.rel.to._meta.fields)):
+#                if model:
+#                    sql.UpdateQuery(model, connection).clear_related(field,
+#                            pk_list)
+#                else:
+#                    update_query.clear_related(field, pk_list)
 
     # Now delete the actual data.
     for cls in ordered_classes:
@@ -1053,16 +1055,17 @@
         items.reverse()
 
         pk_list = [pk for pk,instance in items]
-        del_query = sql.DeleteQuery(cls, connection)
+        del_query = QueryGlue(cls, connection)
         del_query.delete_batch(pk_list)
 
         # Last cleanup; set NULLs where there once was a reference to the
         # object, NULL the primary key of the found objects, and perform
         # post-notification.
         for pk_val, instance in items:
-            for field in cls._meta.fields:
-                if field.rel and field.null and field.rel.to in seen_objs:
-                    setattr(instance, field.attname, None)
+            # TODO: GAE: do this in a background task
+#            for field in cls._meta.fields:
+#                if field.rel and field.null and field.rel.to in seen_objs:
+#                    setattr(instance, field.attname, None)
 
             signals.post_delete.send(sender=cls, instance=instance)
             setattr(instance, cls._meta.pk.attname, None)
@@ -1082,6 +1085,5 @@
     the InsertQuery class and is how Model.save() is implemented. It is not
     part of the public API.
     """
-    query = sql.InsertQuery(model, connection)
-    query.insert_values(values, raw_values)
-    return query.execute_sql(return_id)
+    query = QueryGlue(model, connection)
+    return query.insert(values, raw_values, return_id)
|
|
Re: non-relational DB
Hi again,
now a little question:

Some fields do type conversions. For example, TimeField converts datetime objects into time objects. App Engine doesn't support time, but only datetime, so should we do such conversions at the backend level or should we expect the field to handle it (esp. if it already has such conversion code)?

What's the status of the email backends ticket? There hasn't been any reply to Andi Albrecht's latest patch and comment.
http://code.djangoproject.com/ticket/10355
This is essential for supporting all kinds of cloud platforms.

Bye,
Waldemar Kornewald
|
|
Re: non-relational DB
On Thu, Oct 22, 2009 at 7:46 PM, Waldemar Kornewald <wkornewald@...> wrote:
>
> Hi again,
> now a little question:
>
> Some fields do type conversions. For example, TimeField converts
> datetime objects into time objects.
> App Engine doesn't support time, but only datetime, so should we do
> such conversions at the backend level or should we expect the field to
> handle it (esp. if it already has such conversion code)?

I'm unsure what problem you're having here. The backend needs to return a type that the TimeField can turn into a Python Time object. TimeField is fairly liberal in what it will accept - DateTime objects, Time objects, and strings that express a time will all be handled.

As long as your backend returns one of these acceptable types, you're done.

> What's the status of the email backends ticket? There hasn't been any
> reply to Andi Albrecht's latest patch and comment.
> http://code.djangoproject.com/ticket/10355
> This is essential for supporting all kinds of cloud platforms.

We're in the process of doing feature voting for v1.2. Personally, I'm happy with the state of the patch, but there have been a couple of -1 votes for the patch, which means that some people still need to be convinced that it's the right thing to do. Once voting is finished, we may need to revisit this issue on django-dev.

Yours,
Russ Magee %-)
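The division of labour described above (the backend hands back any acceptable type, the field coerces it) might look roughly like this. It is a stand-alone sketch using only the standard library, not TimeField's actual implementation: to_time(), time_to_datastore(), the accepted string formats, and the 1970-01-01 placeholder date are all assumptions chosen for the example.

```python
import datetime


def to_time(value):
    """Coerce a value returned by a backend into a datetime.time (sketch)."""
    if value is None:
        return None
    if isinstance(value, datetime.datetime):
        # A datastore without a time type can hand back a full datetime.
        return value.time()
    if isinstance(value, datetime.time):
        return value
    if isinstance(value, str):
        # Assumed formats for this sketch only.
        for fmt in ("%H:%M:%S", "%H:%M"):
            try:
                return datetime.datetime.strptime(value, fmt).time()
            except ValueError:
                continue
    raise ValueError("Cannot interpret %r as a time" % (value,))


def time_to_datastore(value, placeholder_date=datetime.date(1970, 1, 1)):
    """Store a time as a datetime for a datastore that only knows datetime.

    The placeholder date is an arbitrary choice for this example.
    """
    return datetime.datetime.combine(placeholder_date, value)
```

Under these assumptions a non-relational backend would only need the second function when saving; reading works as long as whatever it returns is one of the types the field already accepts.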
|
|
Re: non-relational DB
On Thu, Oct 22, 2009 at 2:07 PM, Russell Keith-Magee <freakboy3742@...> wrote:
>
> On Thu, Oct 22, 2009 at 7:46 PM, Waldemar Kornewald
> <wkornewald@...> wrote:
>>
>> Hi again,
>> now a little question:
>>
>> Some fields do type conversions. For example, TimeField converts
>> datetime objects into time objects.
>> App Engine doesn't support time, but only datetime, so should we do
>> such conversions at the backend level or should we expect the field to
>> handle it (esp. if it already has such conversion code)?
>
> I'm unsure what problem you're having here. The backend needs to
> return a type that the TimeField can turn into a Python Time object.
> TimeField is fairly liberal in what it will accept - DateTime objects,
> Time objects, and strings that express a time will all be handled.
>
> As long as your backend returns one of these acceptable types, you're done.

Great. I just wasn't sure whether this was an internal implementation detail that we'd better not rely on in our backends.

Bye,
Waldemar Kornewald
|
|
Re: non-relational DB
On Thu, Oct 22, 2009 at 2:07 PM, Russell Keith-Magee <freakboy3742@...> wrote:
>
> On Thu, Oct 22, 2009 at 7:46 PM, Waldemar Kornewald
> <wkornewald@...> wrote:
>>
>> Hi again,
>> now a little question:
>>
>> Some fields do type conversions. For example, TimeField converts
>> datetime objects into time objects.
>> App Engine doesn't support time, but only datetime, so should we do
>> such conversions at the backend level or should we expect the field to
>> handle it (esp. if it already has such conversion code)?
>
> I'm unsure what problem you're having here. The backend needs to
> return a type that the TimeField can turn into a Python Time object.
> TimeField is fairly liberal in what it will accept - DateTime objects,
> Time objects, and strings that express a time will all be handled.
>
> As long as your backend returns one of these acceptable types, you're done.
>
>> What's the status of the email backends ticket? There hasn't been any
>> reply to Andi Albrecht's latest patch and comment.
>> http://code.djangoproject.com/ticket/10355
>> This is essential for supporting all kinds of cloud platforms.
>
> We're in the process of doing feature voting for v1.2. Personally, I'm
> happy with the state of the patch, but there have been a couple of -1
> votes for the patch, which means that some people still need to be
> convinced that it's the right thing to do. Once voting is finished, we
> may need to revisit this issue on django-dev.

To give some short feedback: I'm still here and I've read most of the comments given in the voting sheet. I'd be happy to address the concerns once voting is finished - in a different thread, of course :)

>
> Yours,
> Russ Magee %-)
|
|
Re: non-relational DBhttp://code.google.com/p/live-android/ On 22 Okt., 13:35, Waldemar Kornewald <wkornew...@...> wrote: > Hi everyone, > this rather long mail contains a status report and instructions for > contributors and implementation notes for Django core developers. If > you only want to know the status you can stop after the first section. > If you want to contribute I hope this provides a good starting point > into our port. > > --------------------------------------- > Status report > > We've got pretty far with our App Engine port. For example, the > sessions db and cached_db backends both work unmodified on App Engine. > You can also order results and use basic filter()s as supported by the > low-level App Engine API (gt, gte, lt, lte, exact, pk__in). You can > also use QuerySet.order_by(), .delete(), .count(), Model.save(), > .delete(). > > This is our second porting attempt (it's not in the old repository). > Our first attempt had too many conflicts with the multi-db branch > (esp. the one on github). This time we just hacked everything > together. We didn't concentrate on cleaning up the current backend > API. We've also disabled SQL support. > > The next step is to move all the hacks into a nice backend API (at the > same time making sure that it won't conflict with multi-db) and > re-enable SQL support. That's where we need help. Also, if you want to > work on SimpleDB support this is the right time to join. The App > Engine backend itself can be handled by Thomas Wanschik and me - > contributions in this area are not absolutely necessary, so please > concentrate on the cleanup if you want to help. > > Now to the details (for those who want to contribute). > > --------------------------------------- > Introducing QueryGlue > > The old Django code was distributed across three layers: > * django.db.models.queryset.QuerySet > * django.db.models.sql.query.Query (from now on just sql.Query) > * backend > > When a new QuerySet is instantiated (e.g. 
by calling > Model.objects.all()) it asks the backend for its Query class and then > creates an instance of that class. By default, this class is > sql.Query. Only the Oracle backend has its own Query which subclasses > sql.Query. I think Waldermar wanted to say: 'Only the Oracle backend has its own Query which subclasses sql.query.BaseQuery.' instead of 'sql.Query'. We added a module called 'nonrelational' to django.db.models. There we define query.py which itself defines a BaseQuery class (the equivalent to sql.query.BaseQuery). Just like Oracle subclasses sql.query.BaseQuery we could use the same mechanism and subclass nonrelational.BaseQuery in order to get the GAEQuery. For now this subclassing mechanism is called in the module sql.query (not sql.Query from above) by connection.ops.query_class(BaseQuery) and uses sql.query.BaseQuery as the BaseQuery. We need some way of telling which class should be the BaseQuery. This could be done via the settings, something similar to settings.DATABASE_TYPE = 'nonrelational' and for sql it would be settings.DATABASE_TYPE= 'sql'. According to this the right BaseQuery could be chosen. Additionally these mechanisme shouldn't be called in sql.query. But that's only a proposal of making the port a little bit more clean. I don't know if there are any conflicts with multi-db this way. What do you guys think of it? > > Normally, sql.Query builds the query on-the-fly. Whenever you call > QuerySet.filter(<filters>) the filters get put into a > Q(<filters>) and passed to > sql.Query.add_q( Q(...) ). > This function iterates over all filter rules in the Q object and calls > sql.Query.add_filter() for each individual filter. > This in turn directly modifies sql.Query.where which is a tree > structure that represents the WHERE clause. It already contains > information about the JOIN type for each filter (INNER, OUTER), the > fields that get referenced by the filter, the column and table > aliases, and so on. 
It already does a lot of what we need for > non-relational backends, but it's too SQL-specific. > > The current behavior is also a problem for multi-db because it makes > too many assumptions about the storage format of the filter rules. The > user could call QuerySet.using(other_connection) anytime, so QuerySet > shouldn't really work with the low-level sql.Query class before it > actually executes the query. > > We've solved this problem by introducing a backend-independent query > representation between QuerySet and the low-level Query (sql.Query, > appengine.Query, etc.). This representation is called QueryGlue. You > can find it in django.db.models.queryglue. It > provides almost exactly the same "public" API as sql.Query (so it can > easily be integrated with QuerySet). Each filter() call gets > translated into a tree structure that is inspired by sql.Query.where, > but it doesn't contain any information about the kind of JOIN. > Instead, it stores high-level important information like whether we're > filtering on a primary key, which columns and tables are involved in a > JOIN, etc. > > --------------------------------------- > The low-level Query class > > Once the query needs to be executed (e.g., by calling .count() or by > iterating over the query) the QueryGlue instance creates a new > low-level Query instance which gets the QueryGlue as its only > parameter. Currently, the low-level Query class is hard-coded to > GAEQuery/BaseQuery in django.db.models.nonrelational.query. > > Then, QueryGlue calls the Query's respective execution function > (results_iter(), count(), etc.). The > constructor only gets the QueryGlue instance. Then, we call the > respective execution function (results_iter(), count(), etc.) on the > instantiated low-level Query. Our GAEQuery can now iterate over all > filters in QueryGlue.filters and convert them to an App Engine Query > object. 
> QuerySet.query, QuerySet.query should be an instance of the actual Query class (see above for a proposal of a loading mechanism for the Query class) and the actual QueryGlue instance should be passed to QuerySet.query somehow. QuerySet's methods will update the QueryGlue instance only. This mechanism could be used for the existing sql backends too (QueryGlue has not to be used for this but at least some high-level information tree. Much code of QueryGlue could be reused for this tree but has to be extended too). Of course that will result in changes to the existing backends but would result in a flexible way to write backends by letting backends traverse the tree and form the actual query for the specified database as soon as the query is executed). > --------------------------------------- > subqueries > > Instead of working with subquery classes we've added delete_bulk(), > insert(), etc. directly to QueryGlue and the low-level Query class. If > sql.Query really needs the current design those functions can still be > routed to the respective subquery instance, but on App Engine it's > easier to handle those operations in a separate function. > > --------------------------------------- > The cleanup > > We made a few not-so-clean changes to Django itself. I've attached a > diff, so contributors can easily find all the changes we did to Django > (they're also commented with TODO and GAE): > > ............................ > * disabled multi-table inheritance; > this could be emulated as described on the Django wikihttp://code.djangoproject.com/wiki/NonSqlBackends > > See > django/db/models/base.py: line 147 > > ............................ > * disabled deletion of related objects in Model.delete() and QuerySet.delete() > > See > django/db/models/query.py: lines 1036, 1065 > > ............................ 
> * replaced sql.subqueries.*Query usage with simple functions on a
> single Query class (insert_or_update() instead of InsertQuery and
> UpdateQuery)
>
> See
> django/db/models/query.py: lines 1058, 1088
>
> ............................
> * commented out distinction between insert and update in
> Model.save_base() because there's no such concept in App Engine (and
> SimpleDB, AFAIK)
>
> See
> django/db/models/base.py: lines 470, 475
>
> ............................
> The long-term goal is of course to clean this up and move most of
> these changes into the backend API.
>

Looking at the diff, you can see that we made only small changes to Django, and in the majority of cases we simply commented some Django code out (like the deletion of related objects). So moving these parts into the backend shouldn't be hard, and it would enable users to write non-relational database backends for Django in a clean way without manipulating much existing Django code (the existing parts would only have to be moved into the SQL database backend).

> ---------------------------------------
> Common non-relational features
>
> The plan is to add support for simple joins and select_related to all
> non-relational backends by
> either subclassing the backend's Query class on-the-fly with a
> JoinQuery or by supporting something like query pre-processors which
> can be added above the low-level Query class. We haven't thought about
> the details, yet, but I hope you get the idea.
>
> ---------------------------------------
> SQL layer details:
>
> The ugly detail is that sql.subqueries contains specialized query
> classes like InsertQuery, DeleteQuery, etc. which subclass the
> backend's Query class. This means that currently, the module loading
> process jumps around:
> * sql/__init__.py imports sql.query and then sql.subqueries
> * sql.query creates the base Query class
> * after that, sql.query allows the backend to override the Query class
> * sql.subqueries creates subclasses which derive from Query
>
> In multi-db in SVN this is uglier because the subquery classes don't
> have just one single sql.Query base class from which to derive,
> anymore. There can be multiple backends, each with their own sql.Query
> class, so the subqueries have to be maintained by the backend (with
> some multi-inheritance magic and manual caching of the custom
> subclasses).
>
> In multi-db on github this is much cleaner: The backends can't
> override sql.Query, anymore. Instead, there's an SQLCompiler class
> which can be overridden by the backend to take care of
> backend-specific details. sql.Query stores a slightly more abstract
> representation of the query. This multi-db branch moves a lot of code
> around. That's why we should try to keep as much code as possible
> where it is (at least until the branch gets merged into trunk).
>
> ---------------------------------------
> The source
>
> The test project and our unit tests are here: http://bitbucket.org/wkornewald/django-testapp/
>
> The modified Django source and the backend is here: http://bitbucket.org/wkornewald/django-nonrel-hacked/
>
> We've patched the trunk branch. Unfortunately, the branches are
> unnamed (I converted the git mirror because the hg mirror's branches
> on bitbucket are broken). You should be able to find the right branch
> with "hg heads"
> and "hg up -C" to it. Normally our branch should be at tip, anyway, so
> you don't need to do anything.
>
> When merging you need to find the trunk branch with "hg heads" and "hg
> merge <revnum>" with the trunk head. If this becomes a huge problem
> we'll switch to the django-trunk mirror, but I wanted to keep the
> option to switch to Alex' multidb branch if that's better, so I chose
> this sub-optimal Django mirroring solution.
>
> ---------------------------------------
> Task management
>
> Our tasks are managed in a Google Spreadsheet: https://spreadsheets.google.com/ccc?key=0AnLqunL-SCJJdE1fM0NzY1JQTXJu...
>
> The task list isn't complete, yet. We're working on that.
>
> Bye,
> Waldemar Kornewald
>
> django.diff (9K)

I hope this will give contributors an idea of where to start. I will add the solution ideas to the spreadsheet as soon as I find some time. It would be nice to hear some thoughts from django-developers too :)

Bye,
Thomas Wanschik

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Django developers" group.
To post to this group, send email to django-developers@...
To unsubscribe from this group, send email to django-developers+unsubscribe@...
For more options, visit this group at http://groups.google.com/group/django-developers?hl=en
-~----------~----~----~----~------~----~------~--~---
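The SQLCompiler idea quoted above (sql.Query keeps a more abstract representation, and a compiler that the backend can override handles backend-specific details) can be illustrated with a minimal sketch. The Query, SQLCompiler and BacktickCompiler classes and the as_sql() signature below are hypothetical stand-ins written for this illustration. They are assumptions, not code from the multi-db branch.

```python
# Illustrative sketch only: these classes are hypothetical stand-ins,
# not the code from the multi-db branch.


class Query:
    """Backend-neutral description of a simple SELECT."""

    def __init__(self, table, columns, limit=None):
        self.table = table
        self.columns = columns
        self.limit = limit


class SQLCompiler:
    """Renders a Query as SQL; a backend overrides only what differs."""

    quote_char = '"'

    def quote(self, name):
        return f"{self.quote_char}{name}{self.quote_char}"

    def as_sql(self, query):
        columns = ", ".join(self.quote(c) for c in query.columns)
        sql = f"SELECT {columns} FROM {self.quote(query.table)}"
        if query.limit is not None:
            sql += f" LIMIT {query.limit}"
        return sql


class BacktickCompiler(SQLCompiler):
    """A backend whose SQL dialect quotes identifiers with backticks."""

    quote_char = "`"


# The same Query object is rendered differently by each compiler.
query = Query("entry", ["id", "title"], limit=10)
default_sql = SQLCompiler().as_sql(query)
backtick_sql = BacktickCompiler().as_sql(query)
```

The point of the split is that the Query object never changes; only the compiler chosen by the backend decides how it is spelled out.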
|
|
Re: non-relational DB
On Fri, Oct 23, 2009 at 5:09 AM, Thomas Wanschik <twanschik@...> wrote:
>
>> When a new QuerySet is instantiated (e.g. by calling
>> Model.objects.all()) it asks the backend for its Query class and then
>> creates an instance of that class. By default, this class is
>> sql.Query. Only the Oracle backend has its own Query which subclasses
>> sql.Query.
>
> I think Waldemar wanted to say: 'Only the Oracle backend has its own
> Query which subclasses sql.query.BaseQuery.' instead of 'sql.Query'.

I should point out that this is one of the specific problems Alex and I are trying to address in the multi-db refactor. When we've finished, returning the right query class should be as simple as implementing an API on the backend.

Yours,
Russ Magee %-)
|
|
Re: non-relational DB
I just want to remind contributors to fill in the "Assigned to" and "Status" cells in the task spreadsheet while working on a specific task, in order to prevent problems. Here is the link:

https://spreadsheets.google.com/ccc?key=0AnLqunL-SCJJdE1fM0NzY1JQTXJuZGdEa0huODVfRHc&hl=en

Bye,
Thomas Wanschik
|
|
Re: non-relational DB
On 22 Oct., 23:52, Russell Keith-Magee <freakboy3...@...> wrote:
> On Fri, Oct 23, 2009 at 5:09 AM, Thomas Wanschik
>
> <twansc...@...> wrote:
>
> >> When a new QuerySet is instantiated (e.g. by calling
> >> Model.objects.all()) it asks the backend for its Query class and then
> >> creates an instance of that class. By default, this class is
> >> sql.Query. Only the Oracle backend has its own Query which subclasses
> >> sql.Query.
>
> > I think Waldemar wanted to say: 'Only the Oracle backend has its own
> > Query which subclasses sql.query.BaseQuery.' instead of 'sql.Query'.
>
> I should point out that this is one of the specific problems Alex and
> I are trying to address in the multi-db refactor. When we've finished,
> returning the right query class should be as simple as implementing an
> API on the backend.
>

Thanks for your answer, Russell. But I have one question left. Should we make the effort and clean up the App Engine backend the way the Oracle backend is done (using a query_class), or should we wait for the multi-db refactor and then clean up our code according to multi-db? Will it be easier to merge the backend into Django then?

Bye,
Thomas Wanschik

> Yours,
> Russ Magee %-)
|
|
Re: non-relational DB
On Mon, Oct 26, 2009 at 12:42 AM, Thomas Wanschik <twanschik@...> wrote:
>
> On 22 Oct., 23:52, Russell Keith-Magee <freakboy3...@...> wrote:
>> On Fri, Oct 23, 2009 at 5:09 AM, Thomas Wanschik
>>
>> <twansc...@...> wrote:
>>
>> >> When a new QuerySet is instantiated (e.g. by calling
>> >> Model.objects.all()) it asks the backend for its Query class and then
>> >> creates an instance of that class. By default, this class is
>> >> sql.Query. Only the Oracle backend has its own Query which subclasses
>> >> sql.Query.
>>
>> > I think Waldemar wanted to say: 'Only the Oracle backend has its own
>> > Query which subclasses sql.query.BaseQuery.' instead of 'sql.Query'.
>>
>> I should point out that this is one of the specific problems Alex and
>> I are trying to address in the multi-db refactor. When we've finished,
>> returning the right query class should be as simple as implementing an
>> API on the backend.
>>
>
> Thanks for your answer Russell. But I have one question left. Should
> we make the effort and clean the App Engine backend up in the way the
> Oracle backend is done (using a query_class) or should we wait for the
> multi-db refactor and then clean up our code according to multi-db?
> Will it be easier to merge the backend into Django then?

The current query_class will need to change slightly to support multi-db, so anything you implement against that interface will require some rework later on. That said, the fundamental approach (i.e., the backend tells you what class to use for queries) will still be there - it will just be used in a slightly different way.

If you want to write (and test) code now, my suggestion would be to try making your code as clean as possible against the current interface, with the expectation that there will be some rework once multi-db lands. The corollary to this is that if you find yourself needing to make weird and widespread engineering decisions in order to support the query_class approach, you should stop and wait for multi-db to land.

Yours,
Russ Magee %-)
|
|
Re: non-relational DB
On Mon, Oct 26, 2009 at 1:46 AM, Russell Keith-Magee <freakboy3742@...> wrote:
> The current query_class will need to change slightly to support
> multi-db, so anything you implement against that interface will
> require some rework later on. That said, the fundamental approach
> (i.e., the backend tells you what class to use for queries) will still
> be there - it will just be used in a slightly different way.

In the SVN multi-db branch there is a modified query_class() API. OTOH, on github it got replaced with SQLCompiler. Are the query_class() changes already committed somewhere?

Why do you still need query_class() if you already have SQLCompiler? If this is just about making non-SQL backends work then you'll need some kind of backend-independent query representation, so QuerySet.using() can be supported. That's exactly what we've already done with QueryGlue, so maybe you should better reuse what we've started and finish that together with us, so we all don't waste time on refactoring everything twice?

Bye,
Waldemar Kornewald
|
|
Re: non-relational DB
On Mon, Oct 26, 2009 at 4:52 PM, Waldemar Kornewald <wkornewald@...> wrote:
>
> On Mon, Oct 26, 2009 at 1:46 AM, Russell Keith-Magee
> <freakboy3742@...> wrote:
>> The current query_class will need to change slightly to support
>> multi-db, so anything you implement against that interface will
>> require some rework later on. That said, the fundamental approach
>> (i.e., the backend tells you what class to use for queries) will still
>> be there - it will just be used in a slightly different way.
>
> In the SVN multi-db branch there is a modified query_class() API.
> OTOH, on github it got replaced with SQLCompiler. Are the
> query_class() changes already committed somewhere?

No, they haven't been developed yet. Alex and I did the initial design work at the DjangoCon sprints, but we haven't actually implemented anything yet.

> Why do you still need query_class() if you already have SQLCompiler?
> If this is just about making non-SQL backends work then you'll need
> some kind of backend-independent query representation, so
> QuerySet.using() can be supported. That's exactly what we've already
> done with QueryGlue, so maybe you should better reuse what we've
> started and finish that together with us, so we all don't waste time
> on refactoring everything twice?

There are two different agents at work here.

We need to split sql.Query from QueryCompiler to support the fact that the same SQL-like query needs to be rendered in different ways by different backends. This can be as simple as the character used for quoting, or as complex as wrapper clauses needed to handle LIMIT and OFFSET.

There is a separate issue of determining if sql.Query is the right internal structure to use for representing a query.

To date, sql.Query is the right structure for all Django's supported backends. It might even be the right structure for a non-SQL backend that provides a SQL-like query layer (AppEngine possibly falls into this category, as might a SimpleDB backend). However, a CouchDB, Cassandra or MongoDB backend probably won't get much traction using an internal query structure that talks about Joins and Where clauses.

So - the intention is to repurpose query_class() slightly. Once refactored, query_class() will be required to return a class that implements the Query interface. sql.Query is the only example at present, but other backends can provide other internal representations. The call to query_class() will be made in QuerySet - not as part of the sql.Query construction. In this way, query_class() becomes the "get me the actual implementation" method on the backend.

We're *not* trying to build a completely generic internal query representation. I'm not convinced that such an animal is even possible in the general case - again, JOIN means something to relational databases, but doesn't mean much to non-SQL databases. If AppEngine is able to leverage some of the sql.Query internals, that's great - but I don't expect that this will be the default situation.

Yours,
Russ Magee %-)
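The repurposed query_class() described in the message above (the QuerySet asks the backend which Query implementation to use) might look roughly like the following sketch. All class names and the add_filter() signature are hypothetical and chosen for illustration. The refactor had not been implemented at the time of this thread, so none of this is Django's actual API.

```python
# Hypothetical sketch of "the backend tells QuerySet which Query class to use".
# None of these classes are Django's real implementation.


class SQLQuery:
    """Stand-in for sql.Query: records filters as WHERE-style conditions."""

    def __init__(self, model_name):
        self.model_name = model_name
        self.where = []

    def add_filter(self, field, lookup_type, value):
        self.where.append((field, lookup_type, value))


class KeyValueQuery:
    """Stand-in for a non-SQL representation with no joins or WHERE tree."""

    def __init__(self, model_name):
        self.model_name = model_name
        self.filters = {}

    def add_filter(self, field, lookup_type, value):
        self.filters[(field, lookup_type)] = value


class SQLBackend:
    def query_class(self):
        return SQLQuery


class KeyValueBackend:
    def query_class(self):
        return KeyValueQuery


class QuerySet:
    """Asks the backend for its Query class instead of hard-coding one."""

    def __init__(self, model_name, backend):
        self.query = backend.query_class()(model_name)

    def filter(self, field, lookup_type, value):
        # Simplification: mutates in place rather than returning a clone.
        self.query.add_filter(field, lookup_type, value)
        return self


sql_qs = QuerySet("Entry", SQLBackend()).filter("rating", "gte", 3)
kv_qs = QuerySet("Entry", KeyValueBackend()).filter("rating", "gte", 3)
```

Both query classes share only the small add_filter() interface; how each one stores or interprets the filter is left entirely to the backend, which matches the "common interface, different interpretations" position taken in the thread.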
|
|
Re: non-relational DB
On Mon, Oct 26, 2009 at 1:12 PM, Russell Keith-Magee <freakboy3742@...> wrote:
> To date, sql.Query is the right structure for all Django's supported
> backends. It might even be the right structure for a non-SQL backend
> that provides a SQL-like query layer (AppEngine possibly falls into
> this category, as might a SimpleDB backend). However, a CouchDB,
> Cassandra or MongoDB backend probably won't get much traction using an
> internal query structure that talks about Joins and Where clauses.

App Engine's datastore API is more similar to MongoDB than SQL. Even on SimpleDB I don't think that the Where tree is a good idea because it's way too SQL-specific.

> So - the intention is to repurpose query_class() slightly. Once
> refactored, query_class() will be required to return a class that
> implements the Query interface. sql.Query is the only example at
> present, but other backends can provide other internal
> representations. The call to query_class() will be made in QuerySet -
> not as part of the sql.Query construction. In this way, query_class()
> becomes the "get me the actual implementation" method on the backend.

Why do you want to implement this in multi-db if it's only useful for non-SQL support? Shouldn't you better keep multi-db as-is and add the query_class() feature to our branch? That would save us lots of conflicts because we won't have to implement our code twice (once for the old query_class and once for your version) and we'll probably have to change your query_class, anyway.

> We're *not* trying to build a completely generic internal query
> representation. I'm not convinced that such an animal is even possible
> in the general case - again, JOIN means something to relational
> databases, but doesn't mean much to non-SQL databases. If AppEngine is
> able to leverage some of the sql.Query internals, that's great - but I
> don't expect that this will be the default situation.

Does this mean you'll remove QuerySet.using()? Otherwise you'd have to transform an sql.Query to an appengine.Query.

If the generic query representation is not much more detailed than Q objects then I don't see a big problem, anyway (our QueryGlue can be easily transformed into sql.Query or any other query type exactly for that reason). The reason we need QueryGlue is that the queries will have to be manipulated and interpreted in order to emulate certain features (e.g., joins), and it's much easier to do this on the final query tree than on its intermediate states.

Bye,
Waldemar Kornewald
|
|
Re: non-relational DB
On Mon, Oct 26, 2009 at 8:46 PM, Waldemar Kornewald <wkornewald@...> wrote:
>
> On Mon, Oct 26, 2009 at 1:12 PM, Russell Keith-Magee
> <freakboy3742@...> wrote:
>> To date, sql.Query is the right structure for all Django's supported
>> backends. It might even be the right structure for a non-SQL backend
>> that provides a SQL-like query layer (AppEngine possibly falls into
>> this category, as might a SimpleDB backend). However, a CouchDB,
>> Cassandra or MongoDB backend probably won't get much traction using an
>> internal query structure that talks about Joins and Where clauses.
>
> App Engine's datastore API is more similar to MongoDB than SQL. Even
> on SimpleDB I don't think that the Where tree is a good idea because
> it's way too SQL-specific.

Exactly my point. There is no such thing as a "generic" internal query. The closest we can hope for is a common interface for objects that can have Qs, filters, et al. added to them. sql.Query interprets those Qs and filters as joins. Other backends will require other interpretations.

>> So - the intention is to repurpose query_class() slightly. Once
>> refactored, query_class() will be required to return a class that
>> implements the Query interface. sql.Query is the only example at
>> present, but other backends can provide other internal
>> representations. The call to query_class() will be made in QuerySet -
>> not as part of the sql.Query construction. In this way, query_class()
>> becomes the "get me the actual implementation" method on the backend.
>
> Why do you want to implement this in multi-db if it's only useful for
> non-SQL support? Shouldn't you better keep multi-db as-is and add the
> query_class() feature to our branch? That would save us lots of
> conflicts because we won't have to implement our code twice (once for the
> old query_class and once for your version) and we'll probably have to
> change your query_class, anyway.

Because the way query_class() is currently used causes other problems. Providing an entry point for multi-db is a bonus.

>> We're *not* trying to build a completely generic internal query
>> representation. I'm not convinced that such an animal is even possible
>> in the general case - again, JOIN means something to relational
>> databases, but doesn't mean much to non-SQL databases. If AppEngine is
>> able to leverage some of the sql.Query internals, that's great - but I
>> don't expect that this will be the default situation.
>
> Does this mean you'll remove QuerySet.using()? Otherwise you'd have to
> transform an sql.Query to an appengine.Query.

QuerySet.using() will continue to exist. However, I expect there will be some restrictions on when you can call it. Retasking across backend types will be one of those restrictions.

> If the generic query representation is not much more detailed than Q
> objects then I don't see a big problem, anyway (our QueryGlue can be
> easily transformed into sql.Query or any other query type exactly for
> that reason). The reason we need QueryGlue is that the queries will
> have to be manipulated and interpreted in order to emulate certain
> features (e.g., joins), and it's much easier to do this on the final
> query tree than on its intermediate states.

I need to take a closer look at QueryGlue to be able to offer any deeper critique of this. I'll put this on my todo list.

Yours,
Russ Magee %-)
|
|
Re: non-relational DB
On Mon, Oct 26, 2009 at 3:05 PM, Russell Keith-Magee <freakboy3742@...> wrote:
>
> On Mon, Oct 26, 2009 at 8:46 PM, Waldemar Kornewald
> <wkornewald@...> wrote:
>>
>> On Mon, Oct 26, 2009 at 1:12 PM, Russell Keith-Magee
>> <freakboy3742@...> wrote:
>>> To date, sql.Query is the right structure for all Django's supported
>>> backends. It might even be the right structure for a non-SQL backend
>>> that provides a SQL-like query layer (AppEngine possibly falls into
>>> this category, as might a SimpleDB backend). However, a CouchDB,
>>> Cassandra or MongoDB backend probably won't get much traction using an
>>> internal query structure that talks about Joins and Where clauses.
>>
>> App Engine's datastore API is more similar to MongoDB than SQL. Even
>> on SimpleDB I don't think that the Where tree is a good idea because
>> it's way too SQL-specific.
>
> Exactly my point. There is no such thing as a "generic" internal
> query. The closest we can hope for is a common interface for objects
> that can have Qs, filters, et al. added to them. sql.Query interprets
> those Qs and filters as joins. Other backends will require other
> interpretations.
> [...]
> I need to take a closer look at QueryGlue to be able to offer any
> deeper critique of this. I'll put this on my todo list.

Yes, that'll help in our discussions and I hope it'll make clearer why query_class() should rather be implemented in our branch instead of multi-db (which already works the way it is - without query_class()). Here's the link:

http://bitbucket.org/wkornewald/django-nonrel-hacked/src/tip/django/db/models/queryglue.py

What QueryGlue does is something like this (though, it's simplified):

    queryset.filter(bla__attr=3)
    => gets translated to =>
    queryglue.filters_tree.add((['bla', 'attr'], 'exact', 3))

As you can see, there isn't anything backend-specific in the filters_tree. It's actually not even that much different from what sql.Query.add_filter() already does - just without adding information about joins and other SQL-specific stuff.

Now, an SQL backend can just iterate over filters_tree and call sql.Query.add_filter() for each child in the tree - this would be the easiest way to make sql.Query work again in our code. OTOH, the non-relational backends could inspect the tree and possibly execute multiple queries - one for each table involved in the query - and then join the result set in memory (depending on the query and your data this can be inefficient - or efficient).

Bye,
Waldemar Kornewald
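The translation described in the message above can be sketched in a few lines. This is a simplified illustration, not the code in queryglue.py: FilterTree, translate_filter() and matches() are hypothetical names, all conditions are simply AND-ed, and the "backend" here just evaluates the tree against nested dictionaries in memory.

```python
import operator

# Hypothetical sketch of a backend-neutral filter tree. It only mirrors the
# (path, lookup_type, value) tuples from the example in the message.
OPERATORS = {
    "exact": operator.eq,
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def translate_filter(lookup, value):
    """Split 'bla__attr' or 'age__gte' into (path, lookup_type, value)."""
    parts = lookup.split("__")
    if len(parts) > 1 and parts[-1] in OPERATORS:
        lookup_type = parts.pop()
    else:
        lookup_type = "exact"
    return (parts, lookup_type, value)


class FilterTree:
    """Flat list of conditions; every child must match (AND semantics)."""

    def __init__(self):
        self.children = []

    def add(self, condition):
        self.children.append(condition)


def matches(row, tree):
    """Evaluate the tree against a nested dict by following each field path."""
    for path, lookup_type, value in tree.children:
        current = row
        for name in path:
            current = current[name]
        if not OPERATORS[lookup_type](current, value):
            return False
    return True


tree = FilterTree()
tree.add(translate_filter("bla__attr", 3))
tree.add(translate_filter("age__gte", 18))

rows = [
    {"bla": {"attr": 3}, "age": 20},
    {"bla": {"attr": 4}, "age": 30},
    {"bla": {"attr": 3}, "age": 10},
]
selected = [row for row in rows if matches(row, tree)]
```

An SQL backend could instead loop over tree.children and forward each tuple to its own query object, which is the replay approach the message suggests for sql.Query.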
|
|
Re: non-relational DB
Hi,
Russell and Alex, did you already look at QueryGlue? We really need to discuss which branch the new query_class() should be in.

Bye,
Waldemar Kornewald
|
|
Re: non-relational DB
On Thu, Oct 29, 2009 at 2:44 PM, Waldemar Kornewald <wkornewald@...> wrote:
>
> Hi,
> Russell and Alex, did you already look at QueryGlue? We really need to
> discuss which branch the new query_class() should be in.
>
> Bye,
> Waldemar Kornewald
>

I haven't had a chance to look at it, and I probably won't until at least a few of the items on my plate are dealt with. That being said, I am extremely leery about investing time in something with names like "QueryGlue", as to me they imply a lack of organization in the code. That may or may not be true, but giving things names that are at least somewhat explanatory to outside users really helps.

Alex

--
"I disapprove of what you say, but I will defend to the death your right to say it." -- Voltaire
"The people's good is the highest law." -- Cicero
"Code can always be simpler than you think, but never as simple as you want" -- Me
|
|
Re: non-relational DB
On Thu, Oct 29, 2009 at 9:51 PM, Alex Gaynor <alex.gaynor@...> wrote:
> I haven't had a chance to look at it, and I probably won't until at
> least a few of the items on my plate are dealt with. That being said
> I am extremely leery about investing time in something with names like
> "QueryGlue" as to me they imply a lack of organization in the code,
> and that may or may not be true, but giving things names that are at
> least somewhat explanatory to outside users really helps.

I've renamed it to QueryData. With that huge roadblock out of our way, I hope you're much more likely to help. ;)

Bye,
Waldemar Kornewald
|
|
Re: non-relational DB
Hey,
a little status update: We've switched our work to Alex' github multi-db branch because we depend on that to make a clean non-relational backend API. Otherwise we'd have to rewrite too much code once multi-db gets merged into trunk. The new branch is at:

http://bitbucket.org/wkornewald/django-nonrel-multidb/

Our django-testapp project finally contains unit tests for all supported DB features (Model.save(), QuerySet.get(), .count(), .filter(), .exclude(), etc.). Now we can implement a query_class() backend API and begin to move the hacked-in App Engine code out of Django.

Bye,
Waldemar Kornewald