|
View:
New views
7 Messages
—
Rating Filter:
Alert me
|
|
|
Performance questionHello everyone,
I have a quick question about how to optimize some Glazedlists code. I have reduced a much more complex network of lists from my real application to this simple example, that has a source list, wrapped in a grouping list that puts values in buckets, and a function list to calculate the average of each bucket: EventList<Double> sourceList = new BasicEventList<Double>(); GroupingList<Double> groupingList = new GroupingList<Double>(sourceList, new BucketComparator()); FunctionList<List<Double>, Double> functionList = new FunctionList<List<Double>, Double>(groupingList, new AverageFunction()); A contrived example but it shows my problem, which is that I am incurring the O(n) overhead of calculating the average price on each insert, where as there is obviously an O(1) update function for an average that I should be using. The question is, how can I rewrite this for better performance? For reference, I have pasted in the full text of a class that exercises my problem below. Thanks in advance for any help. graham /////////// Sample code import java.util.Comparator; import java.util.List; import ca.odell.glazedlists.BasicEventList; import ca.odell.glazedlists.EventList; import ca.odell.glazedlists.FunctionList; import ca.odell.glazedlists.GroupingList; import ca.odell.glazedlists.FunctionList.Function; public class FunctionListTest { /** * @param args */ public static void main(String[] args) { EventList<Double> sourceList = new BasicEventList<Double>(); GroupingList<Double> groupingList = new GroupingList<Double>(sourceList, new BucketComparator()); FunctionList<List<Double>, Double> functionList = new FunctionList<List<Double>, Double>(groupingList, new AverageFunction()); long now = System.currentTimeMillis(); for (int i = 0; i < 100000; i++){ sourceList.add((double)i); if (i % 1000 ==0){ for (Double double1 : functionList) { System.out.println(i+": "+double1); } System.out.println("Took: "+((System.currentTimeMillis()-now)/1000.0)); now = System.currentTimeMillis(); } } } static class BucketComparator implements Comparator<Double>{ int MAX = 25; int BUCKET_SIZE = 5; public int compare(Double arg0, Double arg1) { if (arg0 > MAX && arg1 > MAX){ return 0; } else { return (((int)(double)arg0)/BUCKET_SIZE) - (((int)(double)arg1)/BUCKET_SIZE); } } } public static class AverageFunction implements Function<List<Double>, Double> { public Double evaluate(List<Double> arg0) { int size = arg0.size(); if (size == 0){ return 0.0; } else{ double total = 0.0; for (Double double1 : arg0) { total += double1; } return total/size; } } } } |
|
|
Re: Performance questionGraham,
The crux of your slowness is that GroupingList doesn't create EventLists as its buckets, but regular Lists. We've known about this "problem" for some time, but haven't really considered a solution much. (GroupingList is already too complicated, and needs to be rethought). What I *can* tell you is this: there is a new extension within GL called "calculations" found in the "ca.odell.glazedlists.calculation" package. It contains work which computes calculations or statistics about an EventList<? extends Number>. At the moment it includes sums, counts, division, and mean average. It will be extended, as needed by the community, to include any other useful calculations. (perhaps mode and median averages as well, for example). I don't know how much you actually need the GroupingList, becuase I don't know your expected number of buckets in the real world. If it's small, I'd suggest using several FilterLists to compute the buckets rather than a single GroupingList. The drawback is, of course, that there are more EventLists and more processor time spent evaluating ListEvents, but the FilterLists will preserve the fine-grained ListEvents so that you can compute Calculations efficiently. You use it like this: final EventList<Float> source = new BasicEventList<Float>(); source.add(1f); final Calculation<Float> mean = Calculations.meanFloats(source); mean.getValue (); // returns 1.0f source.add(2f) mean.getValue(); // returns 1.5f If need be, you can combine Calculations to produce more complex calculations using a subclass of AbstractCompositeCalculation. Division is, for example, a Calculation composed of one Calculation representing the numerator, and another for the denominator. When either of those Calculations reports a value change, the Division Calculation is also recomputed using the new values. If you absolutely need to use a GroupingList to create a large number of buckets, (say anywhere north of 10ish), then you probably need to consider computing the averages "outside" of the GroupingList using your own ListEventListener where you can take advantage of the fine-grainedness of ListEvents. Take a look at how any of the Calculations I mentioned above actually work. You'll see how they recompute their Calculations by only consider the actual List deltas. Don't be afraid to ask for more help if you need it. And if you can using the Calculation stuff as is, even better... you can ask for new Calculations that are currently missing and that make sense in the GL core. James On Jan 23, 2008 4:23 PM, gm-mrktc <gmiller@...> wrote:
|
|
|
Re: Performance questionJames,
Thanks. I don't think the filter list approach is going to work for me, because in my real application, the concept of buckets is much more dynamic, and based on attributes of the list elements themselves. The calculations stuff is cool, and potentially useful to me in other places. However, I think that you are correct in assessing that the limitations on GroupingList is going to be our biggest problem. If I may make one observation about the Calculations package. I don't see any reason why AbstractCalculation needs to parameterize on "N extends Number", rather than just N. This may be true about the two subclasses as well, tho it's less clear. Not restricting the parameter would make it more flexible, specifically I can think of at least one situation in which I would want to use AbstractCalculation for something other than a number. Thanks, and let me know if you have any other thoughts. graham
|
|
|
Re: Performance questionGraham,
Can you explain what you mean by "concept of buckets is much more dynamic, and based on attributes of the list elements themselves"... perhaps there is an opportunity to add a new TransformedList into the GL core that solves your problem? I'm intrigued, as typically, a programmer or user effectively decides how filtering or grouping works... I haven't encountered a scenario when filters or groups change based on the data itself. Also, I guess we could relax <N extends Number> to just <N>... I hadn't considered that level of abstraction, but I think it makes sense. James On Jan 24, 2008 1:30 PM, gm-mrktc <
gmiller@...> wrote:
|
|
|
Re: Performance questionJames,
Sure, I'd be happy to explain. My list elements are essentially maps (key-value pair objects). Looking at the two of the values in the map (symbol and side), is how I form the groups. So, for example all of the "BUY" orders for "IBM" would go in the same bucket. All I meant by "more dynamic" was that the buckets are not pre-determined, in that I don't know ahead of time the values of all of the possible symbols. Thanks. graham
|
|
|
Re: Performance questionIn the end, what I ended up doing was to remove the grouping list entirely from the system.
I then created a custom subclass of AbstractEventList that implements ListEventListener, and stores the buckets explicitly in a hashtable. I can't say I'm terribly satisfied with it from a stylistic point of view. But it definitely appears to perform better, and all of my unit tests pass, so it's what I'm going with for now. Thanks for your help James. graham
|
|
|
Re: Performance questionGraham,
Two things that might make your implementation better: 1) a TransformedList is exactly an AbstractEventList that implements ListEventListener so TransformedList would probably be a better base class for you. 2) If you're storing things into buckets yourself, you might be able to leverage the code behind GlazedLists.syncEventListToMultiMap(...). Even if you can't use it directly out of the box, you might be able to lift part of the implementation and customize it for your own purposes. James On Jan 29, 2008 6:54 PM, gm-mrktc <gmiller@...> wrote:
|
| Free embeddable forum powered by Nabble | Forum Help |