Alternative persistence system

Summary
I've spent some time using the E persistence mechanism now, and I've found it to be clever, elegant and very hard to use for my purposes. I ended up writing my own replacement for the timeMachine and makeSturdyRef objects. Below is a report on the problems I had and the solution I ended up with. Has anyone else had similar issues?

Problems with the default system

My test case is a fairly large program with many different kinds of objects, most of which need to be made persistent and exported as SturdyRefs. The program is divided into many modules, and plugins may extend it. Using the default E mechanism (makeSturdyRef and timeMachine), I found these problems:

1. Scalability: The system does not scale well. When an object's state changes, the entire object graph has to be written out, which takes time proportional to the number of objects in the system (roughly 10 ms per object on my machine), not to the number of changes. Fixing this would, presumably, require making each persistent object's representation be independent of that of all other persistent objects. Ideally, I'd like to save after creating every persistent object, before giving the sturdy ref to the user. I expect to be creating many such objects per second in some cases.

2. Redundant information: An object's portrayal must include all of its authority. If I have 1000 "job" objects, each with access to "timer", then the saved file will include 1000 references to "timer". This could be fixed by referring to the "parent" object (the one that originally created this one) instead (e.g. revive using "parent.makeJob()"), but this still results in 1000 references to the parent. Also, this solution conflicts with (1), since we want to make the portrayals independent, and it requires giving the object access to its parent, which may not be desirable from a security point of view.

3. Difficulty upgrading: If I have a saved file containing "job" objects without timers, and I now decide that job objects should have timers, there is no easy way to add them later (at least, not without messing around with the surgeon's exits).

Also, if I give an object access to a subdirectory, it persists with an absolute pathname and I can't restart in a different directory.

4. Organisation: In my systems at least, the owner of the service/vat should be able to see the state of the system and discover all objects. Each object is owned by some parent object, which maintains a list of its children. The E persistence system makes it easy to have objects which are exported and persistent but which are not owned by any object. You would have to catch and handle exceptions very carefully to ensure that this couldn't happen.

Also, if I give makeSturdyRef to an object, I have no control over the objects it creates. I want to group objects so that I know where they came from and can destroy the whole group at once.

5. Safety: Without persistence, objects accept authority but don't generally give it out (unless that's part of their function). Making an object persistent can be done in two ways (__optUncall and __optSealedDispatch). The first is easy but unsafe, the second is harder but safer (though still with some issues, as mentioned previously). A typical programmer, not too concerned with security, has a reasonable chance of writing a fairly secure E object that doesn't expose more authority than it should. However, they are very likely to take the easy and less secure option of using __optUncall for persistence.

6. Too many code paths: A persistable object must implement three code paths: create, save and revive. Most objects are not designed for persistence and only support the first case. An object which depends on an unpersistable object is also unpersistable. For example, if I call makeObject(file.deepReadOnly()) then the resulting object cannot be persisted, because read-only files cannot be. Also, the revive operation must be made public so that the persistence system can call it. This may be safe, but it is not good API design as other people may start using it by mistake.

A possible solution

While perhaps not as elegant as E's system, my replacement works for my largish use-case and solves the above problems (while preserving the essential property that objects can't take advantage of the persistence system to gain authority).

Persistent objects are arranged in a tree. When saved, each node contains the Swiss base of the object and a method call on the parent that would re-create the object. Each node in the tree is actually three E objects:

- A "builder" object.
- A "persistNode" object (provided by the persistence system and holding the Swiss base).
- A "public" object, created by the builder. Holders of the SturdyRef can call methods on this object.

The root builder object is provided by the application. All other builders are created by their parent builders. For example, a chat server managing chat rooms might look like this:

  def makeChatServer() {
      # Builder: supplied by the application, never handed to SturdyRef holders.
      return def chatServer {
          to makePublic(persistNode) {
              # Public object: what holders of the SturdyRef can call.
              return def chatServerPub {
                  to createChatRoom(name) {
                      require(validRoomName(name))
                      return persistNode.makeSturdyChild("loadChatRoom", [name])
                  }
              }
          }

          # Called by the persistence system to create or revive a room.
          to loadChatRoom(name) {
              return makeChatRoom(name)
          }
      }
  }

(Imagine that this chat system is just a small sub-module of the main application, without access to the surgeon, etc.)

On startup, the persistence system will:

- Take the root builder (perhaps a chatServer) as input.
- Create a persistNode for it.
- Revive all saved children, by calling methods on chatServer (e.g. "loadChatRoom").
- Create the public object (chatServer.makePublic(persistNode)) and register it with identityMgr.

Similarly, makeChatRoom() returns a builder for chat rooms. This builder's makePublic will be called with its own persistNode, allowing the chat room to manage its own children (e.g. bots).

If a chat room needs extra authority (e.g. a timer or a file for saving the history), we don't need to change the on-disk format, just the loadChatRoom method, e.g.

  to loadChatRoom(name) {
      return makeChatRoom(name, timer, <file:rooms>[name])
  }

We can give any authority this way, not just persistable authorities (e.g. we could pass a verb facet or a shallow-read-only directory to makeChatRoom, which we couldn't do with the default system).

This seems to address the points above:

1. Scalability: It seems feasible that each object can be persisted independently (although my implementation doesn't currently do this).

2. Redundant information: We only need to persist unique information about each object. Everything else can be calculated anew at revival time. Saved files are smaller and easier to read.

3. Difficulty upgrading: Because the on-disk format only contains the key information, not incidental authority, it's easy to add or remove authority, regenerate pathnames relative to a new base, etc.

4. Organisation: Every persistent object is organised into a hierarchy. An object without a parent cannot be represented. Destroying an object destroys all of its descendants automatically.

5. Safety: Objects don't need to export their authority ever, and they don't need to hold a reference to their parents.

6. Too many code paths: The creation path exercises all code (e.g. createChatRoom() uses the persistence system to create each room object the first time too, not just to revive them). If an object can be made sturdy, it is very likely it will save and restore correctly too.

Finally, this system ensures that objects are revived in a predictable order (a parent builder before its children, the parent public object after them).
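For readers who want to experiment with the idea outside E, here is a minimal sketch of the tree scheme in Python. It is a model, not the actual implementation: PersistNode, revive, make_sturdy_child and the snake_case method names are stand-ins chosen for illustration, the Swiss base is just a random token, and Python does not enforce the capability discipline that E does. It only shows what gets saved (the Swiss base plus the re-creating call) and the revival order (builder, then children, then the public object).

```python
import secrets


class PersistNode:
    """Hypothetical stand-in for the persistence system's per-object node."""

    def __init__(self, builder, swiss_base=None):
        self.builder = builder
        # Unguessable identifier; plays the role of the Swiss base.
        self.swiss_base = swiss_base or secrets.token_hex(16)
        self.children = []  # entries: (method_name, args, child_node)
        self.public = None

    def make_sturdy_child(self, method_name, args):
        # Creation goes through the same builder method that revival uses.
        child = PersistNode(getattr(self.builder, method_name)(*args))
        child.public = child.builder.make_public(child)
        self.children.append((method_name, list(args), child))
        return child.swiss_base  # stands in for the SturdyRef

    def save(self):
        # Only the Swiss base and the re-creating call are written out.
        return {
            "swiss": self.swiss_base,
            "children": [
                {"method": m, "args": a, "node": c.save()}
                for m, a, c in self.children
            ],
        }


def revive(builder, saved):
    """Rebuild a subtree: builder first, then children, then the public object."""
    node = PersistNode(builder, saved["swiss"])
    for entry in saved["children"]:
        child_builder = getattr(builder, entry["method"])(*entry["args"])
        child = revive(child_builder, entry["node"])
        node.children.append((entry["method"], entry["args"], child))
    node.public = builder.make_public(node)
    return node


class ChatRoom:
    def __init__(self, name, timer):
        self.name, self.timer = name, timer

    def make_public(self, node):
        return {"room": self.name}


class ChatServer:
    def __init__(self, timer):
        self.timer = timer

    def make_public(self, node):
        class ChatServerPub:
            def create_chat_room(self, name):
                return node.make_sturdy_child("load_chat_room", [name])
        return ChatServerPub()

    def load_chat_room(self, name):
        # Authority (the timer) is supplied here, not read from the saved data.
        return ChatRoom(name, self.timer)


root = PersistNode(ChatServer(timer="old timer"))
root.public = root.builder.make_public(root)
ref = root.public.create_chat_room("lobby")
saved = root.save()
restored = revive(ChatServer(timer="new timer"), saved)
```

The saved structure mentions only "load_chat_room" and ["lobby"], so restarting with a different timer (or a different base directory) needs no change to the stored data; the revived room simply receives whatever its parent builder now passes in.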
--
Dr Thomas Leonard
IT Innovation Centre
2 Venture Road
Southampton
Hampshire SO16 7NP

Tel: +44 0 23 8076 0834
Fax: +44 0 23 8076 0833
mailto:tal@...
http://www.it-innovation.soton.ac.uk

_______________________________________________
e-lang mailing list
e-lang@...
http://www.eros-os.org/mailman/listinfo/e-lang
Re: Alternative persistence system

On Oct 12, 2009, at 5:58, Thomas Leonard wrote:
> Summary
>
> I've spent some time using the E persistence mechanism now, and I've found it to be clever, elegant and very hard to use for my purposes. I ended up writing my own replacement for the timeMachine and makeSturdyRef objects. Below is a report on the problems I had and the solution I ended up with. Has anyone else had similar issues?
>
> Problems with the default system
>
> My test case is a fairly large program with many different kinds of objects, most of which need to be made persistent and exported as SturdyRefs. The program is divided into many modules, and plugins may extend it. Using the default E mechanism (makeSturdyRef and timeMachine), I found these problems:
>
> 1. Scalability: The system does not scale well. When an object's state changes, the entire object graph has to be written out, which takes time proportional to the number of objects in the system (roughly 10 ms per object on my machine), not to the number of changes. Fixing this would, presumably, require making each persistent object's representation be independent of that of all other persistent objects. Ideally, I'd like to save after creating every persistent object, before giving the sturdy ref to the user. I expect to be creating many such objects per second in some cases.

This is an efficiency problem; E-on-Java is just not very fast at executing E code, which includes the implementation of the persistence subsystem.

Due to E's requirements for consistency on revival, an entire vat *must* be persisted as a unit. (Of course, if an application such as yours has weaker requirements you can use an alternate system.)

> 2. Redundant information: an object's portrayal must include all of its authority. If I have 1000 "job" objects, each with access to "timer", then the saved file will include 1000 references to "timer". This could be fixed by referring to the "parent" object (the one that originally created this one) instead (e.g. revive using "parent.makeJob()"), but this still results in 1000 references to the parent. Also, this solution conflicts with (1), since we want to make the portrayals independent, and it requires giving the object access to its parent, which may not be desirable from a security point of view.

The repeated references are necessary to preserve capability security. However, if an object needs multiple authorities, say 'timer' and 'stdout', then one thing you can do is have it persist as a reference to a bundle of them:

  def jobAuthority { # which is an exit or gotten from some loader
    to timer() { return timer }
    to stdout() { return stdout }
  }

> 3. Difficulty upgrading: If I have a saved file containing "job" objects without timers, and I now decide that job objects should have timers, there is no easy way to add them later (at least, not without messing around with the surgeon's exits).

This is a hard problem in general, but if you use the authority bundle above then you can just change the bundle and every job automatically gets that authority when revived.

I've also imagined having a tool to basically do robust search-and-replace on serialized files, which would be able to handle the 'adding authority' problem in general.

> Also, if I give an object access to a subdirectory, it persists with an absolute pathname and I can't restart in a different directory.

This is a problem in the legacy file access subsystem, not the persistence subsystem. One possible solution (which *could* be built as a layer on top, or built in):

Create "RootRelativeFile" objects with the interface

  makeRootRelativeFile(root :any, subpath :String)

such that they behave like the file root[subpath] but persist as this representation and construct more of themselves (a membrane) when sub-file references are retrieved from them.

Then make the root-dir your app uses an object which can be switched to forward to whatever you currently want the application root directory to be -- or, perhaps, just a graph exit which you revive as whatever directory.

Yes, this is additional complexity, but it is, I think, useful for many applications besides yours. Realize that E's standard library is nowhere near "complete" in having every basic capability utility one ought to want.

> 4. Organisation: In my systems at least, the owner of the service/vat should be able to see the state of the system and discover all objects. Each object is owned by some parent object, which maintains a list of its children. The E persistence system makes it easy to have objects which are exported and persistent but which are not owned by any object. You would have to catch and handle exceptions very carefully to ensure that this couldn't happen.

This is fixable generically: write the objects so that they (have just enough authority to) check with their parents to make sure they are properly registered, and become nonfunctional if they aren't.

> Also, if I give makeSturdyRef to an object, I have no control over the objects it creates. I want to group objects so that I know where they came from and can destroy the whole group at once.

Follow capability practice by subdividing authority. Write a caretaker wrapper around makeSturdyRef which records the refs created and can be destroyed as a group.

> 5. Safety: Without persistence, objects accept authority but don't generally give it out (unless that's part of their function). Making an object persistent can be done in two ways (__optUncall and __optSealedDispatch). The first is easy but unsafe, the second is harder but safer (though still with some issues, as mentioned previously). A typical programmer, not too concerned with security, has a reasonable chance of writing a fairly secure E object that doesn't expose more authority than it should. However, they are very likely to take the easy and less secure option of using __optUncall for persistence.

In principle it could be reduced to one extra call with a suitable library:

  to __optSealedDispatch(b) {
    return doPersistence(b, fn { [makeWhatever, ...] })
  }

But I suspect that your hypothetical "almost knows what to do" programmer would fail to write secure code in other ways anyway.

> 6. Too many code paths: A persistable object must implement three code paths: create, save and revive. Most objects are not designed for persistence and only support the first case. An object which depends on an unpersistable object is also unpersistable. For example, if I call makeObject(file.deepReadOnly()) then the resulting object cannot be persisted, because read-only files cannot be. Also, the revive operation must be made public so that the persistence system can call it. This may be safe, but it is not good API design as other people may start using it by mistake.

The "revive" operation *should*, when possible, be the same as the "create" operation. Exceptions should be reviewed with suspicion.

Makers-for-revival *are* part of the public API because as soon as your app is deployed, people have saved data which uses those interfaces. You have to preserve compatibility or announce breakage/support migration just like with any other public interface. (Think of it like ABI/"binary compatibility" in C shared libraries.)

That read-only files are unpersistable is a bug. (You can work around it by adding a loader to the surgeon which recognizes read-only files.)

> A possible solution
...

I suspect that your solution is, in general, able to work more straightforwardly *for your application* because you have additional constraints:

1. Your objects are arranged in a hierarchy.
2. You have no objects with which your application is mutually suspicious.

To expand on the second point, your scheme of reviving objects with authority based on their parents would fail dangerously if the child object was not actually one of yours, but something which did not have that authority in the previous incarnation and now gets it.

I don't know your real persistence infrastructure, so I can't say whether this actually makes sense, but that is the general form of my suspicion: that you have something which is easier to use, but either less powerful or unsafe-given-untrusted-code (depending on the details).

--
Kevin Reid <http://switchb.org/kpreid/>
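As an illustration of the caretaker wrapper suggested in the reply to point 4 above, here is a rough sketch. It is written in Python rather than E, and it assumes the underlying system offers some way to cancel an individual ref: make_sturdy_ref and cancel_ref below are hypothetical callables passed in by the caller, not the real makeSturdyRef interface, and the fake registry exists only so the example runs.

```python
import itertools


def make_sturdy_ref_group(make_sturdy_ref, cancel_ref):
    """Return (grouped_make, destroy_group).

    grouped_make is the restricted facet handed to a module or plugin;
    destroy_group is kept by the owner and cancels every ref the module made.
    Both underlying callables are assumptions supplied by the caller.
    """
    refs = []
    alive = True

    def grouped_make(obj):
        if not alive:
            raise RuntimeError("this group has been destroyed")
        ref = make_sturdy_ref(obj)
        refs.append(ref)  # remember where the ref came from
        return ref

    def destroy_group():
        nonlocal alive
        alive = False
        while refs:
            cancel_ref(refs.pop())

    return grouped_make, destroy_group


# A fake ref registry, for demonstration only.
_ids = itertools.count()
registry = {}


def fake_make_sturdy_ref(obj):
    ref = f"ref-{next(_ids)}"
    registry[ref] = obj
    return ref


def fake_cancel_ref(ref):
    registry.pop(ref, None)


plugin_make, destroy_plugin = make_sturdy_ref_group(
    fake_make_sturdy_ref, fake_cancel_ref
)
unrelated = fake_make_sturdy_ref("object outside the group")
job1 = plugin_make("job 1")
job2 = plugin_make("job 2")
destroy_plugin()
```

After destroy_plugin() the two job refs are gone from the registry while the unrelated ref survives, and further calls to plugin_make raise. The module only ever sees plugin_make, so it cannot create refs outside its group.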
Re: Alternative persistence system

On Mon, 2009-10-12 at 09:25 -0400, Kevin Reid wrote:
> On Oct 12, 2009, at 5:58, Thomas Leonard wrote:
[...]
> > 1. Scalability: The system does not scale well. When an object's state changes, the entire object graph has to be written out, which takes time proportional to the number of objects in the system (roughly 10 ms per object on my machine), not to the number of changes. Fixing this would, presumably, require making each persistent object's representation be independent of that of all other persistent objects. Ideally, I'd like to save after creating every persistent object, before giving the sturdy ref to the user. I expect to be creating many such objects per second in some cases.
>
> This is an efficiency problem; E-on-Java is just not very fast at executing E code, which includes the implementation of the persistence subsystem.

It's not so much the speed (although that's not great), it's that it doesn't scale up if I have e.g. thousands of objects or more (which doesn't seem unreasonable). I'll probably want to save each object as a row in a database at some point.

> Due to E's requirements for consistency on revival, an entire vat *must* be persisted as a unit. (Of course, if an application such as yours has weaker requirements you can use an alternate system.)

I had to violate this anyway, because my application allows users to upload large files, which get stored in the file-system, not in memory. In this case, snapshots make things worse, because an infrequent snapshot is less likely to match the rest of the saved state. For example, if I revoke someone's access to a storage area and then upload confidential data and then the server crashes, it will revive with them still having access to the file. If saving were fast, the storage area object could ensure that the revocation was saved before returning.

I'll probably need to put logs and usage data in the file-system too.

For some applications you may need to snapshot the whole state, but the scheme below doesn't prevent that (and in fact the current version does snapshot everything at once). But I think there must be a large set of applications where this is not useful.

> > 2. Redundant information:
[...]
> The repeated references are necessary to preserve capability security. However, if an object needs multiple authorities, say 'timer' and 'stdout', then one thing you can do is have it persist as a reference to a bundle of them:
>
> def jobAuthority { # which is an exit or gotten from some loader
>   to timer() { return timer }
>   to stdout() { return stdout }
> }

Using a loader would conflict with (1), and allowing modules to call surgeon.addExit didn't look safe (what if one module replaces another module's exit?).

> > 6. Too many code paths:
[...]
> The "revive" operation *should*, when possible, be the same as the "create" operation. Exceptions should be reviewed with suspicion.

Anything with a default state seems to be an exception. Either you get the user to provide your default state (as a mutable object), or you need separate methods. Admittedly, the create operation should normally call the revive one internally.

> Makers-for-revival *are* part of the public API because as soon as your app is deployed, people have saved data which uses those interfaces.

True, but if I know that the only callers are previous versions of my own code then providing an upgrade path is easier.

> > A possible solution
> ...

Thanks for looking at this. I want to make sure I've got a reasonable system.

> I suspect that your solution is, in general, able to work more straightforwardly *for your application* because you have additional constraints:
>
> 1. Your objects are arranged in a hierarchy.

Yes. Although you could regard E's current system as a special case of this: a simple two-level hierarchy with all objects being children of the SturdyRefMaker (in myOptSwissRetainers), and with a load function that adds no authority.

> 2. You have no objects with which your application is mutually suspicious.

I don't think that's the case, except that an object always trusts its creator (there's not much you can do about that, after all). But objects don't trust their children, in general, or other objects.

> To expand on the second point, your scheme of reviving objects with authority based on their parents would fail dangerously if the child object was not actually one of yours, but something which did not have that authority in the previous incarnation and now gets it.

How can the child not be one of mine? The parent tells the persistence system how to revive the child, e.g.

  def makeChatServer(timer) {
      return def chatServer {
          to makePublic(persistNode) {
              return def chatServerPub {
                  to createChatRoom(name) {
                      require(validRoomName(name))
                      return persistNode.makeSturdyChild("loadChatRoom", [name])
                  }
              }
          }

          to loadChatRoom(name) {
              return makeChatRoom(name, timer, <file:rooms>[name])
          }
      }
  }

The only children of persistNode are those added by createChatRoom (since that's the only thing with access to persistNode.makeSturdyChild), and they're created by loadChatRoom. Each chatRoom will get its own persistNode; it can't add more children to the chatServer and it can't change the name of the load function (if the persisted argument "name" was a mutable object then it could change that, because we pass it to makeChatRoom).

> I don't know your real persistence infrastructure, so I can't say whether this actually makes sense, but that is the general form of my suspicion: that you have something which is easier to use, but either less powerful or unsafe-given-untrusted-code (depending on the details).

I'd certainly like to make sure that it isn't less safe.
--
Dr Thomas Leonard
Re: Alternative persistence system

Kevin Reid wrote:
> On Oct 12, 2009, at 5:58, Thomas Leonard wrote:
>> The system does not scale well. When an object's state changes, the entire object graph has to be written out, which takes time proportional to the number of objects in the system (roughly 10 ms per object on my machine), not to the number of changes.
>
> Due to E's requirements for consistency on revival, an entire vat *must* be persisted as a unit. (Of course, if an application such as yours has weaker requirements you can use an alternate system.)

That does not imply that the implementation must take time proportional to the size of the vat. KeyKOS and CapROS only write out objects that were changed since the last checkpoint.
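The general idea of writing out only what changed can be sketched with a dirty set. The following Python toy is an illustration of that idea only, not a description of how KeyKOS or CapROS are implemented; the Vat class, its fields and its methods are all made up for the example, and a real system would also need the checkpoint to be committed atomically so that the saved state is never seen half-updated.

```python
class Vat:
    """Toy object store with incremental checkpoints (hypothetical model)."""

    def __init__(self):
        self.live = {}      # object id -> current state
        self.disk = {}      # object id -> state as of the last checkpoint
        self.dirty = set()  # ids changed since the last checkpoint

    def put(self, oid, state):
        # Every mutation records the object as dirty.
        self.live[oid] = state
        self.dirty.add(oid)

    def checkpoint(self):
        """Write only the changed objects; return how many were written."""
        written = len(self.dirty)
        for oid in self.dirty:
            self.disk[oid] = self.live[oid]
        self.dirty.clear()
        return written


vat = Vat()
for i in range(1000):
    vat.put(i, "initial")
first = vat.checkpoint()   # everything is new, so everything is written

vat.put(42, "changed")
second = vat.checkpoint()  # only the one modified object is written
```

The saved state after each checkpoint still describes the whole vat as a unit, but the cost of the second checkpoint depends on the number of changes rather than on the number of objects.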
Re: Alternative persistence system

On Oct 12, 2009, at 18:02, Charles Landau wrote:
> Kevin Reid wrote:
>> On Oct 12, 2009, at 5:58, Thomas Leonard wrote:
>>> The system does not scale well. When an object's state changes, the entire object graph has to be written out, which takes time proportional to the number of objects in the system (roughly 10 ms per object on my machine), not to the number of changes.
>>
>> Due to E's requirements for consistency on revival, an entire vat *must* be persisted as a unit. (Of course, if an application such as yours has weaker requirements you can use an alternate system.)
>
> That does not imply that the implementation must take time proportional to the size of the vat. KeyKOS and CapROS only write out objects that were changed since the last checkpoint.

KeyKOS and CapROS use orthogonal persistence, which greatly simplifies the problem.

--
Kevin Reid <http://switchb.org/kpreid/>