|
View:
New views
11 Messages
—
Rating Filter:
Alert me
|
|
|
LDB: Patch for fixing counter variablesMost of our LDB objects are stored in structures containing a dynamic array and a counter. The counter variable is typically an "unsigned integer". Until here it's all okay. But with the major part of the counter variables (i, j...) it's not: since they're "signed". Well, this doesn't give problems till we don't break the ~ 2,1 * 10^9 barrier - but the API is fine to go beyond: with the possible result of infinite loops (since we are never able to reach the maximum).
Now you can argue: "this does happen very rarely" - where I'm also with you. But that doesn't mean that those counters aren't buggy. Since the LDB is a public library for LDAP-like directories I would be very pleased to see this fixed. To achieve this I prepared a patch which changes "signed" types to "unsigned" ones which fix the problem. It passes "make test" and doesn't seem to give problems. The only thing I would like to ask you is to review it, so I could push it soon. Matthias |
|
|
Re: LDB: Patch for fixing counter variablesOn Fri, Nov 06, 2009 at 09:19:46PM +0000, Matthias Dieter Wallnöfer wrote:
> Most of our LDB objects are stored in structures > containing a dynamic array and a counter. The counter A quick side-remark for this: I've recently figured out that for most arrays the explicit counter is superfluous, as we now have talloc_array_length(). Volker |
|
|
Re: LDB: Patch for fixing counter variablesOn Fri, 2009-11-06 at 21:19 +0000, Matthias Dieter Wallnöfer wrote:
> Most of our LDB objects are stored in structures containing a dynamic > array and a counter. The counter variable is typically an "unsigned > integer". Until here it's all okay. But with the major part of the > counter variables (i, j...) it's not: since they're "signed". Well, > this doesn't give problems till we don't break the ~ 2,1 * 10^9 > barrier - but the API is fine to go beyond: with the possible result > of infinite loops (since we are never able to reach the maximum). > Now you can argue: "this does happen very rarely" - where I'm also > with you. But that doesn't mean that those counters aren't buggy. > Since the LDB is a public library for LDAP-like directories I would > be very pleased to see this fixed. > To achieve this I prepared a patch which changes "signed" types to > "unsigned" ones which fix the problem. It passes "make test" and > doesn't seem to give problems. The only thing I would like to ask you > is to review it, so I could push it soon. As you may have come to expect, my first request is that you split the patch up. There are parts of this that look safe, and many parts that need to looked at more carefully. The changes the the Python look most dubious, as you are changing the type of something that is then interpreted as an integer in the python. (so please remove all the Python changes). You also have typo fixes and changes to use LDB_SUCCESS mixed in the same patch. Once you do that, I'll take another look, and we can see if some of these can be easily merged. Perhaps also produce the patches with more context? It's hard to tell from the changed declaration what the effect would be (otherwise I'll have to do the review in gitk). Thanks, Andrew Bartlett -- Andrew Bartlett http://samba.org/~abartlet/ Authentication Developer, Samba Team http://samba.org Samba Developer, Cisco Inc. |
|
|
|
|
|
Re: LDB: Patch for fixing counter variablesHi matthias,
On Sat, 2009-11-07 at 09:24 +0000, Matthias Dieter Wallnöfer wrote: > I don't have much time to chat now, so I submit you a new version of the patch: divided in C and python bindings. The Python binding changes are pointless. PyList_SetItem() takes an integer (not an unsigned integer), so changing this doesn't fix anything. If we really suspect we're going to run into ldb results or messages with more than billions of elements, we should just raise an exception if they don't fit in a Python list. Cheers, Jelmer > --- Andrew Bartlett <abartlet@...> schrieb am Sa, 7.11.2009: > > Von: Andrew Bartlett <abartlet@...> > Betreff: Re: LDB: Patch for fixing counter variables > An: mdw@... > CC: idra@..., "Samba Technical Mailinglist" <samba-technical@...> > Datum: Samstag, 7. November 2009, 1:37 > > On Fri, 2009-11-06 at 21:19 +0000, Matthias Dieter Wallnöfer wrote: > > Most of our LDB objects are stored in structures containing a dynamic > > array and a counter. The counter variable is typically an "unsigned > > integer". Until here it's all okay. But with the major part of the > > counter variables (i, j...) it's not: since they're "signed". Well, > > this doesn't give problems till we don't break the ~ 2,1 * 10^9 > > barrier - but the API is fine to go beyond: with the possible result > > of infinite loops (since we are never able to reach the maximum). > > Now you can argue: "this does happen very rarely" - where I'm also > > with you. But that doesn't mean that those counters aren't buggy. > > Since the LDB is a public library for LDAP-like directories I would > > be very pleased to see this fixed. > > To achieve this I prepared a patch which changes "signed" types to > > "unsigned" ones which fix the problem. It passes "make test" and > > doesn't seem to give problems. The only thing I would like to ask you > > is to review it, so I could push it soon. > > Matthias, > > As you may have come to expect, my first request is that you split the > patch up. There are parts of this that look safe, and many parts that > need to looked at more carefully. > > The changes the the Python look most dubious, as you are changing the > type of something that is then interpreted as an integer in the python. > (so please remove all the Python changes). > > You also have typo fixes and changes to use LDB_SUCCESS mixed in the > same patch. > > Once you do that, I'll take another look, and we can see if some of > these can be easily merged. Perhaps also produce the patches with more > context? It's hard to tell from the changed declaration what the effect > would be (otherwise I'll have to do the review in gitk). > > Thanks, > > Andrew Bartlett > > -- > Andrew Bartlett http://samba.org/~abartlet/ > Authentication Developer, Samba Team http://samba.org > Samba Developer, Cisco Inc. > > > > > -- Jelmer Vernooij <jelmer@...> - http://samba.org/~jelmer/ Jabber: jelmer@... |
|
|
Re: LDB: Patch for fixing counter variablesThanks Jelmer, for your clear explaination. I will drop the python part. To be honest I wasn't very about it.
Matthias --- Jelmer Vernooij <jelmer@...> schrieb am Sa, 7.11.2009: Von: Jelmer Vernooij <jelmer@...> Betreff: Re: LDB: Patch for fixing counter variables An: mdw@... CC: "Andrew Bartlett" <abartlet@...>, "Samba Technical Mailinglist" <samba-technical@...>, idra@... Datum: Samstag, 7. November 2009, 11:02 Hi matthias, On Sat, 2009-11-07 at 09:24 +0000, Matthias Dieter Wallnöfer wrote: > I don't have much time to chat now, so I submit you a new version of the patch: divided in C and python bindings. The Python binding changes are pointless. PyList_SetItem() takes an integer (not an unsigned integer), so changing this doesn't fix anything. If we really suspect we're going to run into ldb results or messages with more than billions of elements, we should just raise an exception if they don't fit in a Python list. Cheers, Jelmer > --- Andrew Bartlett <abartlet@...> schrieb am Sa, 7.11.2009: > > Von: Andrew Bartlett <abartlet@...> > Betreff: Re: LDB: Patch for fixing counter variables > An: mdw@... > CC: idra@..., "Samba Technical Mailinglist" <samba-technical@...> > Datum: Samstag, 7. November 2009, 1:37 > > On Fri, 2009-11-06 at 21:19 +0000, Matthias Dieter Wallnöfer wrote: > > Most of our LDB objects are stored in structures containing a dynamic > > array and a counter. The counter variable is typically an "unsigned > > integer". Until here it's all okay. But with the major part of the > > counter variables (i, j...) it's not: since they're "signed". Well, > > this doesn't give problems till we don't break the ~ 2,1 * 10^9 > > barrier - but the API is fine to go beyond: with the possible result > > of infinite loops (since we are never able to reach the maximum). > > Now you can argue: "this does happen very rarely" - where I'm also > > with you. But that doesn't mean that those counters aren't buggy. > > Since the LDB is a public library for LDAP-like directories I would > > be very pleased to see this fixed. > > To achieve this I prepared a patch which changes "signed" types to > > "unsigned" ones which fix the problem. It passes "make test" and > > doesn't seem to give problems. The only thing I would like to ask you > > is to review it, so I could push it soon. > > Matthias, > > As you may have come to expect, my first request is that you split the > patch up. There are parts of this that look safe, and many parts that > need to looked at more carefully. > > The changes the the Python look most dubious, as you are changing the > type of something that is then interpreted as an integer in the python. > (so please remove all the Python changes). > > You also have typo fixes and changes to use LDB_SUCCESS mixed in the > same patch. > > Once you do that, I'll take another look, and we can see if some of > these can be easily merged. Perhaps also produce the patches with more > context? It's hard to tell from the changed declaration what the effect > would be (otherwise I'll have to do the review in gitk). > > Thanks, > > Andrew Bartlett > > -- > Andrew Bartlett http://samba.org/~abartlet/ > Authentication Developer, Samba Team http://samba.org > Samba Developer, Cisco Inc. > > > > > -- Jelmer Vernooij <jelmer@...> - http://samba.org/~jelmer/ Jabber: jelmer@... |
|
|
Re: LDB: Patch for fixing counter variablesVolker Lendecke wrote:
> A quick side-remark for this: I've recently figured out that > for most arrays the explicit counter is superfluous, as we > now have talloc_array_length(). > > Volker > Ah, that would indeed be a nice method to skip those length/amount specification variables. But for LDB that means huge API changes which for sure aren't welcome. Anyway, my patch considers the wrong-typed "i,j" counters which would have to be corrected also with that approach. Matthias |
|
|
Re: LDB: Patch for fixing counter variablesAndrew,
I dropped the wrong part of the patch though your and Jelmer's suggestion. The other changes should fully compliy - I would like to remember: "make test" passes as without them - so nothing (additional) should have been broken. I hope that you aren't that picky on those changes since then we never get to an end. For the future I plan also to provide some patches on other similar s4 code. Of course if you notice an obvious mistake (like this with python) please inform me. But otherwise going through each part I don't see very efficient. Matthias Andrew Bartlett wrote: > On Fri, 2009-11-06 at 21:19 +0000, Matthias Dieter Wallnöfer wrote: > >> Most of our LDB objects are stored in structures containing a dynamic >> array and a counter. The counter variable is typically an "unsigned >> integer". Until here it's all okay. But with the major part of the >> counter variables (i, j...) it's not: since they're "signed". Well, >> this doesn't give problems till we don't break the ~ 2,1 * 10^9 >> barrier - but the API is fine to go beyond: with the possible result >> of infinite loops (since we are never able to reach the maximum). >> Now you can argue: "this does happen very rarely" - where I'm also >> with you. But that doesn't mean that those counters aren't buggy. >> Since the LDB is a public library for LDAP-like directories I would >> be very pleased to see this fixed. >> To achieve this I prepared a patch which changes "signed" types to >> "unsigned" ones which fix the problem. It passes "make test" and >> doesn't seem to give problems. The only thing I would like to ask you >> is to review it, so I could push it soon. >> > Matthias, > > As you may have come to expect, my first request is that you split the > patch up. There are parts of this that look safe, and many parts that > need to looked at more carefully. > > The changes the the Python look most dubious, as you are changing the > type of something that is then interpreted as an integer in the python. > (so please remove all the Python changes). > > You also have typo fixes and changes to use LDB_SUCCESS mixed in the > same patch. > > Once you do that, I'll take another look, and we can see if some of > these can be easily merged. Perhaps also produce the patches with more > context? It's hard to tell from the changed declaration what the effect > would be (otherwise I'll have to do the review in gitk). > > Thanks, > > Andrew Bartlett > > |
|
|
Re: LDB: Patch for fixing counter variablesOn Mon, 2009-11-09 at 16:28 +0100, Matthias Dieter Wallnöfer wrote:
> Andrew, > > I dropped the wrong part of the patch though your and Jelmer's > suggestion. The other changes should fully compliy - I would like to > remember: "make test" passes as without them - so nothing (additional) > should have been broken. > > I hope that you aren't that picky on those changes since then we never > get to an end. For the future I plan also to provide some patches on > other similar s4 code. Of course if you notice an obvious mistake (like > this with python) please inform me. But otherwise going through each > part I don't see very efficient. slip in other changes and subtle bugs in a big 'cleanup' patch. That's why we need to be careful doing this, and why splitting things up helps, because it allows me or others to say '1 is ok, still thinking about 2'. The more you split it up, the easier it is for me. Also, for me the whole question of 'is there any point' is still very much open. That is, the boundary condition needs to be *way* below 2^31 (can we really allocate that many elements?), and raising it to 2^32 just seems to change the rollover point. BTW, See: http://xkcd.com/571/ Most of the one I've looked at seem OK, but because it's so large I've not been able to look at them all. However some things look like they could be improved, such as the change to ldb_msg_remove_element(), which looks like it should be a ptrdiff_t, and check for both above and below the range. And this change: char *ldb_binary_encode(void *mem_ctx, struct ldb_val val) { - int i; char *ret; - int len = val.length; + unsigned int i, len = val.length; unsigned char *buf = val.data; Would have been better as just a search/replace - the construction 'i, len = val.length' isn't as nice. I'm sorry this is so tedious, and to be so picky. Like the const changes, it will take time to check all these (and 'make test' generally isn't sufficient on these kind of changes). Andrew Bartlett -- Andrew Bartlett http://samba.org/~abartlet/ Authentication Developer, Samba Team http://samba.org Samba Developer, Cisco Inc. |
|
|
Re: LDB: Patch for fixing counter variablesHi Matthias,
> That is, the boundary condition needs to be *way* below 2^31 (can we > really allocate that many elements?), and raising it to 2^32 just seems > to change the rollover point. Andrew is right about that, we can't ever allocate anything like that number of elements in ldb. For a start, we limit talloc allocations to 256M, plus lots of the internal algorithms in ldb would start to fail. There are a few things that are O(N^2) in ldb. If N approaches 2^31, then N^2 will be 2^62. Think how long it will take to process that :-) Changing to unsigned counters isn't wrong, but it does need to be done very carefully, and I don't see it gaining us anything. Matthias, would you mind if I suggested a more worthwhile project related to ldb and 'for' loops? A really worthwhile project would be to find the places that are O(N^2) or larger and find a way to fix them. To do that you might like to start by finding all for loop nesting, or loops that call functions that loop. To give you an example, consider the code in ltdb_modify_internal() which checks to see that we don't have a multi-valued attribute with the same attribute twice (look near line 691 of lib/ldb/ldb_tdb/ldb_tdb.c). Now think about what happens when we modify an index record, which may have thousands of @IDX entries (eg. the objectclass=user index, or a one-level index). Adding 10k users will call this function 10k times on that record. After adding N users, each call of that function will cost O(N^2), as it checks all elements being stored with all previous elements. So that means adding 10k users will cost O(N^3). So it will cost about 10^12 memcmp() calls (the calls are in ldb_val_equal_exact()). Each memcmp compares on average about 10^2 bytes, so in total we memcmp() about 10^14 bytes. A modern CPU can compare cached memory at a rate of about 4*10^9 bytes/s, so those memcmp() calls are costing us about 25000 seconds (ie. about 6 hours). The reason I'm suggesting this is you are concerned about large lists of elements in ldb. If you plug N=2^31 into the above, then you'll see that to add that number of users in the current ldb would cost about 2^95 time. The earth (and probably the universe!) would be well and truly dead by the time we'd added that number of users. The fix is to develop hashing, binary search and sort functions that will allow ldb to scale to large sizes. That fix is much more important than going from 2^31 to 2^32. In the example above, if we used a qsort() of the elements then checked for duplicates by checking neighboring elements then the total cost of that part of the code would drop from 10^14 to about 10^10. So instead of taking 6 hours for those few lines of code to run it would take about 2 seconds. There are lots of examples like this in Samba4 (and probably in Samba3 too!), and there is an especially large concentration of them in ldb. We rarely test for real scalability, and that means that the developers aren't thinking in terms of big-O calculations when they write the code. If you can help us fix that, then maybe someday we can actually think about approaching 2^31 elements. Then signed/unsigned for the loop counters will matter :-) Cheers, Tridge PS: All numbers in the above are very rough of course - but try adding a few thousand users with a profiler running and you'll see what I mean |
|
|
Re: LDB: Patch for fixing counter variablesAndrew,
> It isn't very efficient, but just like the const changes it is easy to > slip in other changes and subtle bugs in a big 'cleanup' patch. > > That's why we need to be careful doing this, and why splitting things up > helps, because it allows me or others to say '1 is ok, still thinking > about 2'. The more you split it up, the easier it is for me. > But for me more work-intensive. Isn't it possible for you to provide me a list of functions/files which should be reviewed or corrected? > Also, for me the whole question of 'is there any point' is still very > much open. > > That is, the boundary condition needs to be *way* below 2^31 (can we > really allocate that many elements?), and raising it to 2^32 just seems > to change the rollover point. > BTW, See: http://xkcd.com/571/ > Well, we can also let this changes. But then I would like to see also the size specifiers ("num_elements", "num_attributes"...) as signed int (e.g. when we would have implemented this in Java this would be the right solution). I only dislike this kind of mixing. > Most of the one I've looked at seem OK, but because it's so large I've > not been able to look at them all. However some things look like they > could be improved, such as the change to ldb_msg_remove_element(), which > looks like it should be a ptrdiff_t, and check for both above and below > the range. > > And this change: > char *ldb_binary_encode(void *mem_ctx, struct ldb_val val) > { > - int i; > char *ret; > - int len = val.length; > + unsigned int i, len = val.length; > unsigned char *buf = val.data; > > Would have been better as just a search/replace - the construction 'i, > len = val.length' isn't as nice. > > I'm sorry this is so tedious, and to be so picky. Like the const > changes, it will take time to check all these (and 'make test' generally > isn't sufficient on these kind of changes). > But still I see this not completely unimportant. I handle this as a side project like the "const"s next to my actual "main project" ("password_hash"). I try to provide a first mergeable version soon (which does support changes from the administrator side with password policies and for now lets the "samdb_set_password" call in the actual form to handle also user passwords right). I only need to rethink about the removed attribute constraints which you pointed out in another post. Matthias |
| Free embeddable forum powered by Nabble | Forum Help |