« Return to Thread: [scala] Collections performance

Re: [scala] Collections performance

by David MacIver :: Rate this Message:

Reply to Author | View in Thread

On Wed, Jul 30, 2008 at 4:30 AM, Sean McDirmid <sean.mcdirmid@...> wrote:
> I did similar benchmarking last year and reported results on the internal
> dev list. JCL is definitely faster (those pesky Sun programmers), and fills
> in some of the wholes of the current collection library (views, subsets,
> before recently linked sets and maps). Whereas the Scala collection library
> is biased to immutable APIs and implementations, the JCL is biased to the
> mutable case and so it should be faster.

Right. The fact that the immutable collections library is slower than
the mutable one doesn't surprise me. But the mutable case also seems
substantially slower.

I have just noticed that the fix replacing Array[Entry] with an
Array[AnyRef] in the HashMap implementation hasn't made it into the
version I'm testing against, so it's possible the numbers will be a
little happier for free in the next version.

> On the other hand, JCL is not portable to .NET, so I'm trying to move my
> code away from it, and I would definitely love to see a faster Scala

I wonder if it would be worth filling in the gaps in the standard
mutable collection library so that it supports the same interface as
the JCL collections and then using the JCL collections behind the
scenes when running on the JVM?

> collection library eventually (although what we have now definitely works,
> its just very minimal). Maybe we could we port open source collection
> libraries to Scala? I use C5 in .NET land, and it has a great feature set
> that seems pretty efficient.

Could do. One of the problems with that, at least if we start from a
.NET or Java library, is that it will most likely enrich the mutable
side of the of the collections API while leaving the immutable side to
languish.

One good source of immutable collections to steal is Haskell, and the
stuff on Hackage. :-) e.g. the IntMap implementation I've used is a
direct port of Data.IntMap in Haskell.

> Anyways, I'm all for a scalax collection library, I'd also be willing to
> contribute.

One idea I've toyed with is at some point doing a "Data structures
week". Set up a wiki to coordinate and suggest ideas, and have
everyone who's interested pick a cool/useful data structure and put
together as solid an implementation of it in Scala as they can.
Ideally with a set of scalacheck tests and benchmarks for them to work
against. Participants would be encouraged to submit them to scalax or
similar afterwards, but requiring the CLA up front would probably
limit participation.

 « Return to Thread: [scala] Collections performance