Adding Bagwell's Trie to the collection: benchmarks

View: New views
7 Messages — Rating Filter:   Alert me  

Adding Bagwell's Trie to the collection: benchmarks

by Michel Salim-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Last year, Martin mentioned the possibility of adding Phil Bagwell's
Hash Array Mapped Trie to the Scala collection[1]:

> Ironically, Clojure's maps are based on work which Phil
> Bagwell did as an associate of my group at EPFL.
> We never put that into the Scala collection libraries
> (Phil being a C++ person), but now we have new
> motivation to do so.

Is there work being done in this direction for Scala 2.8 (or beyond)?

As a motivation, here are some benchmark numbers from my Scala
implementation of Mini Kanren.

Timing results obtained by computing 6-digit palindromes that are the
result of multiplying two 3-digit numbers (first set of results only).
The implementation of substitutions is changed for each run:

- substitution = association list (Scala list of (Key,Value) pairs):
114562 111629 109939

- substitution = chained case classes
47833 44813 44277 (I've gotten low 43 thousands too)

- substitution = Scala immutable map
OOM: GC overhead limit exceeded

- substitution = Clojure PersistentHashMap[2]
17955 15586 13482 (average 15674)

As a comparison, using Petite Chez Scheme on the same machines, and
the built-in Scheme association lists:

39312 38853 39207

[1] http://www.nabble.com/Re%3A-Tail-calls-via-trampolining-and-an-explicit-instruction-p20706577.html

[2] Does not work out of the box with Scala, because Clojure's
IMapEntry interface uses val() as its getter, clashing with the
Scala keyword. Get patched source from

  http://github.com/hircus/clojure/tree/rename-val

Thanks,

--
Michel Alexandre Salim

Re: Adding Bagwell's Trie to the collection: benchmarks

by Ken Bloom-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Sun, 25 Oct 2009 15:30:02 -0400, Michel Alexandre Salim wrote:

> - substitution = Clojure PersistentHashMap[2] 17955 15586 13482 (average
> 15674)

> [2] Does not work out of the box with Scala, because Clojure's IMapEntry
> interface uses val() as its getter, clashing with the Scala keyword. Get
> patched source from
>
>   http://github.com/hircus/clojure/tree/rename-val
>
> Thanks,

Or just escape it using backticks.



--
Chanoch (Ken) Bloom. PhD candidate. Linguistic Cognition Laboratory.
Department of Computer Science. Illinois Institute of Technology.
http://www.iit.edu/~kbloom1/


Parent Message unknown Re: Adding Bagwell's Trie to the collection: benchmarks

by Michel Salim-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Mon, Oct 26, 2009 at 3:47 AM, Florian Hars <fh+scala@...> wrote:
> Michel Alexandre Salim schrieb:
>> - substitution = Scala immutable map
>> OOM: GC overhead limit exceeded
>
> Have you used just Map (which uses a HashMap-with-deltas approach
> optimized for non-persistent use of persistent maps), or did you use
> a TreeMap directly?
>
I used both Map and TreeHashMap.. Map/HashMap gave an OOM due to GC
limits being reached; TreeHashMap (which looked the closest to what
Clojure used) gave an OOM due to heap exhaustion, but took longer to
get there -- longer than the solution with Clojure maps took to
finish.

TreeMap uses a Red,Black tree inside, from what I see -- it's not even
an option; the keys in my map are not meant to be sortable (they are
"logic variables", i.e. gensym-ed symbols.

Regards,
--
Michel Alexandre Salim

Re: Adding Bagwell's Trie to the collection: benchmarks

by Ismael Juma :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Michel Alexandre Salim <michael.silvanus <at> gmail.com> writes:
> Is there work being done in this direction for Scala 2.8 (or beyond)?

Yes, there is. Last time Tiark posted about it, he said that the immutable
vector would be there before the beta, but not the map(s). The immutable vector
landed recently, so seems like things are going according to plan so far.

Best,
Ismael


Re: Re: Adding Bagwell's Trie to the collection: benchmarks

by Michel Salim-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Mon, Oct 26, 2009 at 7:28 AM, Ismael Juma <mlists@...> wrote:
> Michel Alexandre Salim <michael.silvanus <at> gmail.com> writes:
>> Is there work being done in this direction for Scala 2.8 (or beyond)?
>
> Yes, there is. Last time Tiark posted about it, he said that the immutable
> vector would be there before the beta, but not the map(s). The immutable vector
> landed recently, so seems like things are going according to plan so far.
>
Great to hear, thanks! I'll try and track the 2.8 tree and test it
once it lands.

I suppose we can't use the Clojure code, since it's more strictly
licensed (EPL rather than BSD) -- and, alas, the internal interfaces
are different anyway.

Thanks,

--
Michel Alexandre Salim

Re: Adding Bagwell's Trie to the collection: benchmarks

by Ismael Juma :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Michel Alexandre Salim <michael.silvanus <at> gmail.com> writes:
> I suppose we can't use the Clojure code, since it's more strictly
> licensed (EPL rather than BSD) -- and, alas, the internal interfaces
> are different anyway.

As I understand, Tiark is working with Phil on improved versions so that's
another reason too.

Best,
Ismael


Re: Adding Bagwell's Trie to the collection: benchmarks

by Ben Jackman :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

In the meantime before 2.8 stabilized, I know that Daniel Spiewak reimplemented the Clojure PersistentHashMap in Scala, this might be useful to you for the time being:

http://github.com/djspiewak/scala-collections

Caveat Emptor: I haven't tried it out so I am not sure it works correctly / what the performance is like.

Ben

Ismael Juma wrote:
Michel Alexandre Salim <michael.silvanus <at> gmail.com> writes:
> I suppose we can't use the Clojure code, since it's more strictly
> licensed (EPL rather than BSD) -- and, alas, the internal interfaces
> are different anyway.

As I understand, Tiark is working with Phil on improved versions so that's
another reason too.

Best,
Ismael