
|
Re: Terracotta integration
This is the agreed upon approach mimicking the DiskPageStore. No?
--Ari
----------Original Message----------
From: "Johan Compagner" < jcompagner@...>
Sent: Thu, July 03, 2008 11:29 AM
To: dev@...
Subject: Re: Terracotta integration
AbstractPageStore clusterable?
I don't think that is what it is supposed to be.
On 7/3/08, richardwilko < richardjohnwilkinson@...> wrote:
>
> Hi,
>
> I have attached a first take on the TerracottaPageStore and would
> appreciate any input anyone can give me, particularly to do with
> concurrency, i.e. should I be using synchronised methods on X.
>
> I have done some limited testing with this, 2 apps, using load balancer to
> bounce each new request over the machine, and only 1 user and everything
> seems to be working ok.
>
> I know that it currently has generics, but they can be removed no problem, I
> just find it easier to code including them.
>
> One extra note, I had to manually instrument
> org.apache.wicket.protocol.http.pagestore.AbstractPageStore and
> org.apache.wicket.protocol.http.pagestore.AbstractPageStore$SerializedPage,
> so if this does make it into the wicket core it would be easier if those two
> classes could implement IClusterable.
>
> Thanks,
>
> Richard
>
> TerracottaPageStore.java: http://www.nabble.com/file/p18257188/TerracottaPageStore.java
> --
> View this message in context:
> http://www.nabble.com/Terracotta-integration-tp18168616p18257188.html
> Sent from the Wicket - Dev mailing list archive at Nabble.com.
>
>
|

|
Re: Terracotta integration
On Thu, Jul 3, 2008 at 11:37 AM, Ari Zilka < ari@...> wrote:
> This is the agreed upon approach mimicking the DiskPageStore. No?
Well, the DiskPageStore takes care of storing pages, but by itself
would never be transferred across a cluster. So making it IClusterable
wouldn't make sense.
Eelco
|

|
Re: Terracotta integration
I'm not trying to cluster the DiskPageStore; I'm trying to implement a PageStore that can be clustered.
I needed to instrument the AbstractPageStore class because it is referenced by the inner classes of the TerracottaPageStore (the ones which are clustered). However, I was thinking about this and have decided that I need to come up with a better way so that it does not need to be instrumented, probably by not using inner classes. I wasn't actually wanting to cluster the entire AbstractPageStore class.
The org.apache.wicket.protocol.http.pagestore.AbstractPageStore$SerializedPage class definitely does need to be clustered though, as this is the object which holds the page in the clustered page store.
I haven't created a JIRA issue yet as I wanted to get the code working first.
Cheers,
Richard
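The hidden reference from an inner class to its enclosing instance can be reproduced with plain Java serialization. This is only an analogy (an assumption on my part: Terracotta shares object graphs rather than serializing them, but it follows the same references). OuterStore, InnerEntry and NestedEntry are hypothetical names, not classes from the attachment:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical stand-in for a page store that is not meant to be shared. */
class OuterStore {
    private int size;

    /** Inner (non-static) class: every instance holds a hidden reference to its OuterStore. */
    class InnerEntry implements Serializable {
        byte[] data = new byte[0];

        int ownerSize() {
            return size; // reads the enclosing instance through the hidden reference
        }
    }

    /** Static nested class: no reference to any OuterStore instance. */
    static class NestedEntry implements Serializable {
        byte[] data = new byte[0];
    }

    InnerEntry newInnerEntry() {
        return new InnerEntry();
    }

    /** Returns true if the whole object graph reachable from o can be serialized. */
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false; // something in the graph (here: OuterStore) is not Serializable
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Writing an InnerEntry drags the OuterStore along and fails, while a NestedEntry goes through on its own, which is why static nested classes avoid the need to instrument the enclosing store.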
|

|
Re: Terracotta integration
It seems a shame that this workaround is required for Terracotta to handle the amount of garbage created. It would be great if TC could be configured to serialise WebPages and their Components as single entities. Is anything like this on the roadmap?
JD
The current Terracotta integration does use the HTTPSessionStore; in fact the integration module forces the HTTPSessionStore through the bytecode manipulation it uses.
What I've found is that the key with any Terracotta clustering is that you cannot cluster page objects, as this causes too much garbage; instead you have to cluster a byte array of the serialised page object (this is what my new IPageMapEntry implementation does).
Would replacing the disk-based store in the default session store with one that stores serialised pages in some sort of Terracotta distributed map in memory work? Although, would that still leave the last accessed page as a page object in memory?
Cheers,
Richard
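A minimal sketch of the "cluster a byte array instead of the page object" idea, using only java.io. SketchPage and PageBytes are hypothetical names for illustration; this is not the IPageMapEntry implementation mentioned above:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical page with several components, standing in for a Wicket page. */
class SketchPage implements Serializable {
    final List<String> components = new ArrayList<>();
}

/** Converts a page to and from a single byte[] that a cluster can treat as one object. */
final class PageBytes {
    private PageBytes() {
    }

    static byte[] serialize(Serializable page) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(page);
        } catch (IOException e) {
            throw new IllegalStateException("page could not be serialized", e);
        }
        return buffer.toByteArray();
    }

    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("page could not be deserialized", e);
        }
    }
}
```

The shared structure then holds one byte[] per page version rather than the page and every component it references.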
Matej Knopp-2 wrote:
Hi,
the thing is a bit more complicated than just that :) I'm not sure how
the Wicket Terracotta integration works currently. Last time I checked it
was recommended to use HTTPSessionStore, which itself is a big drawback
as the HTTPSessionStore has certain back-button problems that cannot
be resolved at least until 1.5.
As for DiskPageStore, Wicket itself contains support for session
replication that is, however, targeted at "regular" clusters, assuming
that after each request changed session properties are serialized and
replicated to backing nodes (and preferably immediately deserialized -
explanation below).
DiskPageStore's purpose is to store serialized pages on disk in order
to allow access to previous page versions and also to conserve session
memory usage. So DiskPageStore serializes the page during each request.
The serialization happens whether Wicket runs on a cluster or not.
However, when Wicket is running on a cluster, the already serialized
data is used during session replication, so the pages are not
serialized again. This is an important optimization: as a result,
there is no additional serialization penalty when running
Wicket on a (regular) cluster.
When the target node receives the serialized session entries during
replication and deserializes them (it is recommended to configure the
container to deserialize session properties immediately) the page
itself is not really deserialized. It is only stored in target node's
diskpagestore, so there is no page deserialization happening and the
actual page data is not held in memory on target node.
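The replication behaviour described above can be sketched with a session entry that carries the page only as pre-serialized bytes. This is an illustration under assumptions, not Wicket's actual code; ReplicatedPageEntry and its methods are hypothetical names:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical session entry that replicates a page as already serialized bytes. */
class ReplicatedPageEntry implements Serializable {
    private final byte[] data;     // serialized once, at the end of the request
    private transient Object page; // live page; never written during replication

    ReplicatedPageEntry(Serializable page) {
        this.page = page;
        this.data = toBytes(page);
    }

    /** True only on a node that has created or lazily restored the live page. */
    boolean isPageInMemory() {
        return page != null;
    }

    /** Deserializes the page on first access, e.g. after a failover. */
    Object getPage() {
        if (page == null) {
            page = fromBytes(data);
        }
        return page;
    }

    static byte[] toBytes(Object o) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return buffer.toByteArray();
    }

    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Replicating the entry copies only the byte array; the receiving node holds no live page until getPage() is called there.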
This is roughly the idea of running a Wicket application on a standard
replicated cluster. Now I'm not sure about TC, but IIRC the default
Wicket TC setup used HTTPSessionStore (which doesn't contain any
"special" support for session replication).
I think it would be very nice to have TC leverage the existing
replication support of Wicket's DiskPageStore acting more like a
simple session replication. WDYT?
Cheers,
-Matej
On Sat, Jun 28, 2008 at 10:31 AM, Ari Zilka <ari@terracottatech.com> wrote:
> Hello all,
>
> I work for Terracotta on JVM-level clustering type stuff. I recently spoke
> with a Wicket + TC user who went to production and uncovered a nasty
> surprise. I wanted to propose to this community that we consider
> incorporating the changes he needed to make in order to cluster and scale
> well.
>
> He--Richard Wilkinson--documented his changes on his blog here:
> http://www.richard-wilkinson.co.uk/2008/06/22/more-on-terracotta-and-wicket/
> I look forward to discussing the possible ways forward.
>
> Cheers,
>
> --Ari
>
|

|
Re: Terracotta integration
Afaik you can't configure Terracotta to serialise objects (it's kind of the opposite of what it tries to achieve). However, simply serialising the web pages doesn't work in all cases anyway (that's what my original solution did); for example, when you have a reference to one page inside another you can end up with the wrong version of that referenced page.
The TerracottaPageStore I am working on will take care of this, and when it is ready the only change you will need to make is adding something like this in your application class:
    public ISessionStore newSessionStore() {
        return new SecondLevelCacheSessionStore(this, new TerracottaPageStore(5, 5));
    }
It is also possible that this could be added automatically with byte-code manipulation.
|

|
Re: Terracotta integration
Hi all,
Isn't DiskPageStore a singleton? So putting it into the Session (i.e. making it IClusterable) would definitely be the wrong approach. I think there are two possibilities:
a) Configure an additional root in TC's wicket-module that is used just like the filesystem is used from DiskPageStore (let's call that "a clustered filesystem" for that purpose). So every JVM would have its singleton DiskPageStore that would act as a facade to this clustered filesystem.
b) Store each user's pages in the HttpSession (as it is done now) using the same serialization magic that DiskPageStore uses right now.
What do you think?
regards
Stefan
|

|
Re: Terracotta integration
Hi,
Just because a class is instrumented (i.e. implements IClusterable) doesn't mean that it is shared. It just means that it can be used in a shared structure if required.
The way I am implementing it is to store the pages in the HttpSession as you suggest. This has nothing to do with DiskPageStore, except that it is another implementation of the IPageStore interface.
The reason why I had to instrument AbstractPageStore and TerracottaPageStore is that I am actually sharing some classes declared as inner classes in TerracottaPageStore.java, and they have a reference to the parent class.
I was planning to move away from having inner classes, but as the serialisation and deserialisation methods in AbstractPageStore are protected, I can't really get around it.
Cheers,
Richard
|

|
Re: Terracotta integration
I don't understand the problem. Is it just the visibility of those methods? If yes, TerracottaPageStore could allow public access to any protected method if needed. Or you could use static inner classes to remove this (hidden) reference and use lazy property initialization to get your hands on the current JVM's TerracottaSessionStore using:
<code>
// transient: never shared or serialized; looked up again lazily on each JVM
private transient TerracottaSessionStore _tss;

public TerracottaSessionStore getTerracottaSessionStore() {
    if (_tss == null) {
        _tss = (TerracottaSessionStore) Application.get().getSessionStore();
    }
    return _tss;
}
</code>
However, doesn't instrumenting classes that aren't meant to be shared sound like unnecessary overhead?
best regards
Stefan
|

|
Re: Terracotta integration
It does add a slight overhead, but I don't think 2 extra classes would be noticed.
I can't use static inner classes because the methods in AbstractPageStore aren't static.
Your suggestion would work, with a slight modification:
(TerracottaPageStore) ((SecondLevelCacheSessionStore) Application.get().getSessionStore()).getStore();
So I will have a look at implementing it that way instead.
Cheers,
Richard
|

|
Re: Terracotta integration
Hi again,
I have put together a second version which does away with the need to instrument TerracottaPageStore and AbstractPageStore, but not AbstractPageStore$SerializedPage (no getting away from that).
I have also improved the synchronisation (I think; it's not my strong point) and added a few more comments.
In the end I did make the classes static inner classes; I moved all the calls to the methods in AbstractPageStore to other places.
Please take a look and tell me what you think.
Richard
TerracottaPageStore.java
|

|
Re: Terracotta integration
Why not use the Terracotta Integration Module's capabilities to AOP-
inject the "implements interface" specification onto the class you want?
Remember, a TIM is not just an OSGi bundle of XML and new code, but
also tools to replace methods / classes when clustered as well as a
full-powered AOP engine to augment classes / methods as necessary.
--Ari
On Jul 4, 2008, at 1:19 AM, richardwilko wrote:
>
> I'm not trying to cluster the DiskPageStore; I'm trying to implement a
> PageStore that can be clustered.
>
> I needed to instrument the AbstractPageStore class because it is
> referenced by the inner classes of the TerracottaPageStore (the ones
> which are clustered). However, I was thinking about this and have
> decided that I need to come up with a better way so that it does not
> need to be instrumented, probably by not using inner classes. I wasn't
> actually wanting to cluster the entire AbstractPageStore class.
>
> The org.apache.wicket.protocol.http.pagestore.AbstractPageStore$SerializedPage
> class definitely does need to be clustered though, as this is the
> object which holds the page in the clustered page store.
>
> I haven't created a JIRA issue yet as I wanted to get the code
> working first.
>
> Cheers,
>
> Richard
>
> Eelco Hillenius wrote:
>>
>> On Thu, Jul 3, 2008 at 11:37 AM, Ari Zilka < ari@...> wrote:
>>> This is the agreed upon approach mimicking the DiskPageStore. No?
>>
>> Well, the DiskPageStore takes care of storing pages, but by itself
>> would never be transferred across a cluster. So making it
>> IClusterable wouldn't make sense.
>>
>> Eelco
>
|

|
Re: Terracotta integration
richardwilko wrote:
Afaik you can't configure Terracotta to serialise objects (it's kind of the opposite of what it tries to achieve),
Not strictly java.io serialization, but at some level TC must convert objects to bits to send them over the wire and persist them. It must also manage them for garbage collection.
richardwilko wrote:
however simply serialising the web pages doesn't work in all cases anyway (that's what my original solution did), for example when you have a reference to one page inside another you can end up with the wrong version of that referenced page.
btw, even plain java.io serialization of the pages can be made to replace object references with placeholders, e.g. via writeReplace() and readResolve()... but I guess you already do something similar.
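For reference, the standard hooks for this in Java serialization are writeReplace() and readResolve(). A minimal sketch of the placeholder idea follows; all class names and the static registry are hypothetical, and in a real store the page's own bytes would have to be written through a separate path, since writeReplace() here always substitutes the placeholder:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical page; CURRENT stands in for whatever holds the current page versions. */
class RefPage implements Serializable {
    static final Map<Integer, RefPage> CURRENT = new ConcurrentHashMap<>();

    final int id;
    String state;

    RefPage(int id, String state) {
        this.id = id;
        this.state = state;
        CURRENT.put(id, this);
    }

    /** Called by Java serialization: write a small placeholder instead of this page. */
    private Object writeReplace() {
        return new RefPagePlaceholder(id);
    }
}

class RefPagePlaceholder implements Serializable {
    private final int pageId;

    RefPagePlaceholder(int pageId) {
        this.pageId = pageId;
    }

    /** Called on deserialization: swap the placeholder for the current page with that id. */
    private Object readResolve() {
        return RefPage.CURRENT.get(pageId);
    }
}

/** A component on some other page that points at a RefPage. */
class PageLink implements Serializable {
    final RefPage target;

    PageLink(RefPage target) {
        this.target = target;
    }
}

final class RefWire {
    static byte[] toBytes(Object o) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return buffer.toByteArray();
    }

    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A PageLink serialized while its target was in one state resolves, on deserialization, to whatever page is current for that id, rather than to a stale embedded copy.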
richardwilko wrote:
The TerracottaPageStore I am working on will take care of this, and when it is ready the only change you will need to make is adding something like this in your application class:
    public ISessionStore newSessionStore() {
        return new SecondLevelCacheSessionStore(this, new TerracottaPageStore(5, 5));
    }
It is also possible that this could be added automatically with byte-code manipulation.
Brilliant, can't wait to use it myself
John Patterson wrote:
Would be great if TC could be configured to serialise WebPages and their Components as single entities.
Perhaps I should have said _manage_ WebPages and their Components as single entities. Am I right that this is the key issue requiring your workaround? TC does not know that Wicket components are only referenced in a single page and so share the same lifecycle - a fact which could be used to increase its efficiency.
|

|
Re: Terracotta integration
John,
TC does not marshal objects in any way under user's control. It hooks
into memory, watches for changes, and pushes what your business logic
changes, on change. There is no way to invoke a marshaling task or
fire an event that would force us to marshal the objects you want.
(Well, there is but it is in the ManagerUtil classes and shouldn't be
used directly).
Also, TC honors object identity cluster-wide so it does in fact know
which widgets are associated with what pages, etc. It handles
singletons properly, again because it is not marshaling or otherwise
walking graphs.
I wish I understood the code in Wicket we are all talking about
better. I want to help more than just interjecting what TC can and
cannot do.
I think you guys are all close tho. How can I help further, everyone?
Cheers,
--Ari
On Jul 4, 2008, at 8:17 AM, John Patterson wrote:
>
> richardwilko wrote:
>>
>> Afaik you can't configure Terracotta to serialise objects (it's kind
>> of the opposite of what it tries to achieve),
>>
>
> not strictly java.io.Serializing but at some level TC must convert
> Objects to bits to send them over the wire and persist them. Also
> manage them for garbage collection.
>
>
> richardwilko wrote:
>>
>> however simply serialising the web pages doesn't work in all cases
>> anyway (that's what my original solution did), for example when you
>> have a reference to one page inside another you can end up with the
>> wrong version of that referenced page.
>>
>
> btw, even simple java.io.Serializing the pages can be made to replace
> object references with placeholders, e.g. via writeReplace() and
> readResolve()... but I guess you already do something similar.
>
>
> richardwilko wrote:
>>
>> The TerracottaPageStore I am working on will take care of this, and
>> when it is ready the only change you will need to make is adding
>> something like this in your application class:
>>
>> public ISessionStore newSessionStore() {
>>     return new SecondLevelCacheSessionStore(this, new TerracottaPageStore(5, 5));
>> }
>>
>> it is also possible that this could be added automatically with
>> byte-code manipulation.
>>
>
> Brilliant, can't wait to use it myself
>
>
> John Patterson wrote:
>>
>> Would be great if TC could be configured to serialise WebPages and
>> their Components as single entities.
>>
>
> Perhaps I should have said _manage_ WebPages and their Components as
> single entities. Am I right that this is the key issue requiring your
> workaround? TC does not know that Wicket components are only
> referenced in a single page and so share the same lifecycle - a fact
> which could be used to increase its efficiency.
>
|

|
Re: Terracotta integration
Ari Zilka wrote:
John,
TC does not marshal objects in any way under user's control. It hooks
into memory, watches for changes, and pushes what your business logic
changes, on change. There is no way to invoke a marshaling task or
fire an event that would force us to marshal the objects you want.
(Well, there is but it is in the ManagerUtil classes and shouldn't be
used directly).
I assume the bottleneck that Richard ran into is in TC's garbage collection? So by artificially reducing the number of objects TC has to manage (by serialising them into one object per page) he got around this. I wonder if perhaps TC should do this transparently on the server (not the client) if given the hint that certain classes always share the lifecycle of their parent, much like how Hibernate components are managed as a part of their containing entity.
JD
|

|
Re: Terracotta integration
Hi Richard,
I had a thorough look at your code. I have the following remarks:
- Yes, SerializedPage must be clustered and should therefore implement IClusterable (it is already Serializable, so it should be okay to change).
- I found two problems with your implementation:
1) unbind() is called during invalidation of a session. getPageStore() will therefore result in an NPE as there is no WebRequest.
2) According to the JavaDoc of DiskPageStore#removePage(SessionEntry, String, int) ("page id to remove or -1 if the whole pagemap should be removed"), calling removePage(String, String, int) with an id of -1 should delete all pages of a pageMap (however, that's not documented in the JavaDoc of IPageStore!).
- I feel that all pages could be in a single HashMap (rather than using 3 levels of nested HashMaps and HashSets). I therefore implemented my own PageStore based on your ideas to confirm my feelings (using a single HashMap per session, with fewer Hash(Map|Set) iterations; access is synchronized using a ReentrantReadWriteLock, which I think has quite good performance with TC). Please have a look. We can probably merge our ideas for best results!
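A rough sketch of the single-map idea with a ReentrantReadWriteLock, including the "-1 removes the whole pagemap" behaviour from remark 2. It is not the attached MyTerracottaPageStore.java; the fields of PageKey and the FlatPageMap class are assumptions made for illustration:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical composite key; pageMapName may be null for the default page map. */
final class PageKey {
    final String pageMapName;
    final int pageId;
    final int version;

    PageKey(String pageMapName, int pageId, int version) {
        this.pageMapName = pageMapName;
        this.pageId = pageId;
        this.version = version;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof PageKey)) {
            return false;
        }
        PageKey k = (PageKey) o;
        return pageId == k.pageId && version == k.version
                && Objects.equals(pageMapName, k.pageMapName);
    }

    @Override
    public int hashCode() {
        return Objects.hash(pageMapName, pageId, version);
    }
}

/** One flat map per session, guarded by a read/write lock. */
class FlatPageMap {
    private final Map<PageKey, byte[]> pages = new HashMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    void put(PageKey key, byte[] data) {
        lock.writeLock().lock();
        try {
            pages.put(key, data);
        } finally {
            lock.writeLock().unlock();
        }
    }

    byte[] get(PageKey key) {
        lock.readLock().lock();
        try {
            return pages.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Removes all versions of one page id, or the whole page map when pageId is -1. */
    void remove(String pageMapName, int pageId) {
        lock.writeLock().lock();
        try {
            pages.keySet().removeIf(k -> Objects.equals(k.pageMapName, pageMapName)
                    && (pageId == -1 || k.pageId == pageId));
        } finally {
            lock.writeLock().unlock();
        }
    }

    int size() {
        lock.readLock().lock();
        try {
            return pages.size();
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Exact-key reads share the read lock; only put and remove take the write lock. Removal still scans every key, which is the cost of keeping everything in one flat map.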
Regards
Stefan
MyTerracottaPageStore.java
|

|
Re: Terracotta integration
Hi Stefan,
Looking through your code I see a couple of issues:
1) There is no limit on the number of pages stored in the pagemap, pages could get added forever. I feel there needs to be a way to limit the number of pages stored, with oldest ones discarded first. This is how DiskPageStore works.
2) Following on from point 1, a HashMap does not keep insertion order, so it is not possible to remove the oldest ones easily. A simple change to LinkedHashMap would solve this and make point 1 easy to implement. However, storing all the pagemaps together does mean that the most recent pages from one pagemap could get removed due to high use of another pagemap. In this case, when the user goes back to the other pagemap he/she will encounter an exception.
3) Your getPage code is not general enough; from the javadocs for getPage in IPageStore:
* If ajaxVersionNumber is -1 and versionNumber is specified, the page store must return the page with highest ajax version.
* If both versionNumber and ajaxVersionNumber are -1, the pagestore must return last touched (saved) page version with given id.
Your method of constructing a key object wouldn't work in these situations, as it would only find exact matches, and so getPage would require iterating through the entire HashMap and looking at every entry.
This issue is the reason why I went for the nested structure I used. I do agree that a single storage map would ideally be better, especially as this makes it easier to manage the number of pages stored, but I'm not sure if it is the most efficient method of storage for the complex getPage requirements. By efficient I mean execution time rather than memory usage.
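To make the two -1 rules concrete, here is a small sketch of a per-page-id structure that answers them without scanning every entry. It is an illustration only, not the nested structure from TerracottaPageStore.java; PageVersions and its methods are hypothetical, and it assumes ajaxVersionNumber is also -1 whenever versionNumber is -1:

```java
import java.util.TreeMap;

/** Hypothetical holder for all stored versions of one page id. */
class PageVersions {
    // versionNumber -> (ajaxVersionNumber -> serialized page), both levels sorted
    private final TreeMap<Integer, TreeMap<Integer, byte[]>> versions = new TreeMap<>();
    private byte[] lastTouched; // most recently saved version, whatever its numbers

    void put(int versionNumber, int ajaxVersionNumber, byte[] data) {
        TreeMap<Integer, byte[]> ajaxVersions = versions.get(versionNumber);
        if (ajaxVersions == null) {
            ajaxVersions = new TreeMap<>();
            versions.put(versionNumber, ajaxVersions);
        }
        ajaxVersions.put(ajaxVersionNumber, data);
        lastTouched = data;
    }

    byte[] get(int versionNumber, int ajaxVersionNumber) {
        if (versionNumber == -1) {
            return lastTouched; // both -1: last touched (saved) page version
        }
        TreeMap<Integer, byte[]> ajaxVersions = versions.get(versionNumber);
        if (ajaxVersions == null) {
            return null;
        }
        if (ajaxVersionNumber == -1) {
            return ajaxVersions.lastEntry().getValue(); // highest ajax version
        }
        return ajaxVersions.get(ajaxVersionNumber);
    }
}
```

With sorted maps the "highest ajax version" lookup is a single lastEntry() call, so the wildcard cases cost about the same as an exact match.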
Thoughts?
Richard
|

|
Re: Terracotta integration
Ok,
I have adapted your code in the following ways:
1)
2) There is a configurable limit to the number of pages per page map and page maps are stored separately, this is to combat the problem I found in point 2.
3) I have removed the pagemap from PageKey class.
4) Adapted getPage to fit the api doc.
5) Moved serialization / de-serialization higher up so that there is no need to store a transient TerracottaPageStore.
What do you think? I'll add some debug output code, test it in a clustered environment and report back.
OurTerracottaPageStore.java
I'm still not entirely sure about the containsPage method either; it might require iteration through the map because I'm not sure the DEFAULT_AJAX_VERSION_NUMBER approach will work.
Richard
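The per-pagemap limit from point 2 could look roughly like the sketch below, which uses LinkedHashMap's removeEldestEntry hook. It is not taken from OurTerracottaPageStore.java; BoundedPages and PerPageMapStore are hypothetical names, and the sketch is not thread-safe on its own:

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Insertion-ordered map that drops its oldest entry once maxPages is exceeded. */
class BoundedPages<K> extends LinkedHashMap<K, byte[]> {
    private final int maxPages;

    BoundedPages(int maxPages) {
        this.maxPages = maxPages;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, byte[]> eldest) {
        return size() > maxPages; // evict the oldest inserted page
    }
}

/** One bounded map per page map, so a busy page map cannot evict another one's pages. */
class PerPageMapStore {
    private final Map<String, BoundedPages<Integer>> pageMaps = new HashMap<>();
    private final int maxPagesPerPageMap;

    PerPageMapStore(int maxPagesPerPageMap) {
        this.maxPagesPerPageMap = maxPagesPerPageMap;
    }

    void put(String pageMapName, int pageId, byte[] data) {
        BoundedPages<Integer> pages = pageMaps.get(pageMapName);
        if (pages == null) {
            pages = new BoundedPages<Integer>(maxPagesPerPageMap);
            pageMaps.put(pageMapName, pages);
        }
        pages.put(pageId, data);
    }

    boolean contains(String pageMapName, int pageId) {
        BoundedPages<Integer> pages = pageMaps.get(pageMapName);
        return pages != null && pages.containsKey(pageId);
    }
}
```

Because each page map has its own bound, heavy use of one page map only evicts that page map's oldest pages.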
Hi Stefan,
Looking through your code I see a couple of issues:
1) There is no limit on the number of pages stored in the pagemap, pages could get added forever. I feel there needs to be a way to limit the number of pages stored, with oldest ones discarded first. This is how DiskPageStore works.
2) Following on from point 1, a HashMap does not keep insertion order so it is not possible to remove the oldest ones easily. A simple change to LinkedHashMap would solve this and make point 1 easy to implement. However storing all the pagemaps together does mean that the most recent pages from one pagemap could get removed due to high use of another pagemap. In this case when the user goes back to the other pagemap he/she will encounter an exception.
3) Your getPage code is not general enough; from the javadocs for getPage in IPageStore:
* If ajaxVersionNumber is -1 and versionNumber is specified, the page store must return the page with highest ajax version.
* If both versionNumber and ajaxVersioNumber are -1, the pagestore must return last touched (saved) page version with given id.
Your method of constructing a key object wouldn't work in these situations, as it would only find exact matches, and so getPage would require iterating through the entire HashMap and looking at every entry.
This issue is the reason why I went for the nested structure I used. I do agree that a single storage map would ideally be better, especially as it makes it easier to manage the number of pages stored, but I'm not sure it is the most efficient storage method for the complex getPage requirements. By efficient I mean execution time rather than memory usage.
Thoughts?
Richard
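Points 1 and 2 above (a page limit with oldest-first eviction) map quite directly onto LinkedHashMap's removeEldestEntry hook. A minimal sketch, assuming one such map per pagemap; the class name BoundedPageMap and the maxPages parameter are made up for illustration and do not come from either attached file:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical insertion-ordered map that discards its oldest entry once full. */
final class BoundedPageMap<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int maxPages;

    BoundedPageMap(int maxPages) {
        // accessOrder = false keeps insertion order, so "eldest" is the first page stored
        super(16, 0.75f, false);
        this.maxPages = maxPages;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by put() after each insertion; returning true drops the eldest entry
        return size() > maxPages;
    }
}
```

Keeping one bounded map per pagemap avoids the problem described in point 2, where heavy use of one pagemap pushes out the recent pages of another.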
|

|
Re: Terracotta integration
1+2) Well, it will only add pages as long as the session is alive. If a page isn't used frequently, it will be moved to and later persisted by the TC server, and finally GCed together with its session. Therefore I don't think deleting old pages is necessary. Or do you have a special use case where this could be problematic? Maybe a bot crawling thousands of pages could generate tons of serialized pages, but is this really a problem?
3) Okay, I didn't see that little piece of JavaDoc. I think an extra structure keeping track of the most recent versions of pageIds could help to make these searches efficient.
I changed my code:
- one store per PageMapName, making deletes more efficient
- version info stored for all pageIds (HashMap<Integer,VersionInfo>) where VersionInfo has a pointer to the most recent page and highest ajaxVersionNumber
Comments?
New file: (untested!) MyTerracottaPageStore.java
|

|
Re: Terracotta integration
I'm still not sure about not limiting the number of pages kept in the session. Even DiskPageStore has some sort of limit; IMO not having a limit exposes us to the possibility of a single malicious user grinding the system to a halt. Yes, Terracotta will persist it to disk if need be, but if that session is in active use it will be paging to and from disk all the time.
I would like to get the opinion of some other people about this.
Also, I don't see how the -1 ajax version can work: the disk-based store treats the -1 the same as in getPage, where it just looks for the highest version, whereas in our case it will construct a key with the -1 value in it, i.e. it will only find the page whose ajax version number is -1. Since this can't happen, containsPage won't work. We could probably use the helper structure to simplify this, though.
Richard
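The containsPage concern can be shown with a small, self-contained sketch. The key layout (a List of pageId, version and ajax version) and the method names are assumptions for illustration only; they are not taken from the attached code or from the IPageStore interface.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ContainsPageSketch {
    /** Exact-key lookup: a key built with -1 only matches a stored key that also holds -1. */
    static boolean containsExact(Map<List<Integer>, byte[]> pages,
                                 int pageId, int version, int ajax) {
        return pages.containsKey(Arrays.asList(pageId, version, ajax));
    }

    /** Wildcard lookup: -1 matches any value, at the cost of scanning every key. */
    static boolean containsWildcard(Map<List<Integer>, byte[]> pages,
                                    int pageId, int version, int ajax) {
        for (List<Integer> key : pages.keySet()) {
            boolean idMatches = key.get(0) == pageId;
            boolean versionMatches = version == -1 || key.get(1) == version;
            boolean ajaxMatches = ajax == -1 || key.get(2) == ajax;
            if (idMatches && versionMatches && ajaxMatches) {
                return true;
            }
        }
        return false;
    }
}
```

With a page stored under (7, 2, 3), containsExact(pages, 7, 2, -1) returns false while containsWildcard(pages, 7, 2, -1) returns true. A helper index, as suggested above, would avoid the full scan.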
|

|
Re: Terracotta integration
Ok, I now use a LinkedHashMap and a limit of 1000 pages per PageMap. This should give sufficient protection, and the limit should rarely be hit.
You were right about the -1 ajaxVersionNumber. I fixed that.
I also fixed the reference to the highestAjaxVersion, as there needs to be such a reference for each version, not only for each pageId. There is now an additional HashMap, so this implementation finally requires two HashMaps and a LinkedHashMap per PageMap.
For this implementation, I assumed that pages are inserted in order (according to their versions). Could somebody confirm that? Otherwise, the map pointing to the highest ajaxVersion would need to be updated when the currently highest ajaxVersion is deleted due to an exceeded max pages limit (one would have to search for a lower ajaxVersion and point to that page). Apart from that, I'd say we are quite close to the DiskPageStore implementation (except for not being asynchronous and not implementing ISerializationAwarePageStore, which is only used for Wicket's session clustering, right?).
regards
MyTerracottaPageStore.java
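As a rough picture of the "two HashMaps and a LinkedHashMap per PageMap" layout described above, here is a hedged sketch. It is not the attached file: the class name PageMapStore, the List-based keys and all method names are invented for this example. It relies on the same assumption stated in the message, namely that pages arrive in version order, and it leaves locking to the caller.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical store for a single pagemap; callers are expected to do their own locking. */
final class PageMapStore {
    private final int maxPages;
    // [pageId, version, ajaxVersion] -> serialized page, oldest entry first
    private final LinkedHashMap<List<Integer>, byte[]> pages = new LinkedHashMap<>();
    // pageId -> key of the last saved page with that id
    private final Map<Integer, List<Integer>> lastSaved = new HashMap<>();
    // [pageId, version] -> key of the page with the highest ajax version
    private final Map<List<Integer>, List<Integer>> highestAjax = new HashMap<>();

    PageMapStore(int maxPages) {
        this.maxPages = maxPages;
    }

    void store(int pageId, int version, int ajax, byte[] data) {
        List<Integer> key = Arrays.asList(pageId, version, ajax);
        pages.remove(key);          // re-saving moves the page to the newest position
        pages.put(key, data);
        lastSaved.put(pageId, key);
        List<Integer> pageVersion = Arrays.asList(pageId, version);
        List<Integer> best = highestAjax.get(pageVersion);
        if (best == null || best.get(2) <= ajax) {
            highestAjax.put(pageVersion, key);
        }
        if (pages.size() > maxPages) {
            evictOldest();
        }
    }

    private void evictOldest() {
        List<Integer> oldest = pages.keySet().iterator().next();
        pages.remove(oldest);
        // Only drop index entries that still point at the evicted page. With
        // in-order insertion no older candidate can be left to re-point to.
        lastSaved.remove(oldest.get(0), oldest);
        highestAjax.remove(Arrays.asList(oldest.get(0), oldest.get(1)), oldest);
    }

    /** -1 acts as a wildcard, following the getPage rules quoted earlier in the thread. */
    byte[] get(int pageId, int version, int ajax) {
        List<Integer> key;
        if (version == -1 && ajax == -1) {
            key = lastSaved.get(pageId);
        } else if (ajax == -1) {
            key = highestAjax.get(Arrays.asList(pageId, version));
        } else {
            key = Arrays.asList(pageId, version, ajax);
        }
        return key == null ? null : pages.get(key);
    }

    int size() {
        return pages.size();
    }
}
```

If pages could arrive out of order, evictOldest would have to search for the next-lower ajax version instead of simply dropping the index entry, which is exactly the open question raised above.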
|