David MacIver <david.maciver <at> gmail.com> writes:
> It's about 20% slower on gets
> than the jcl.HashMap implementation and appears to be fractionally
> faster on updates. It's in dictionary.scala in the same repo as the
> rest of the code (
http://www.drmaciver.com/repos/misc , in
> scala/maps)
>
> It's remarkably simple given how well it does. I'm a little surprised.
It makes sense to me. java.util.HashMap is quite simple itself, so simplicity
seems to make the JIT's job easier resulting in better performance. Also,
there's room for improvement in the java.util.HashMap from JDK6. For example,
the modcount field used to throw ConcurrentModificationException is volatile for
no good reason. This has been fixed in the mercurial repo for OpenJDK7. And if
one didn't care for ConcurrentModificationException, then the logic could be
simplified further. The fact that it caches hashCodes does not help if the
hashCode is cheap to compute or is cached in the object itself (so this won't be
an advantage for the current benchmarks that use Strings) and finally it
includes some code (e.g. recordAccess, recordRemoval) that is only there to make
the implementation of LinkedHashMap easier.
Even though I haven't had the time to analyse in detail, it seems to me that
mutable.HashMap is paying the price for too much abstraction. If I had to guess,
I would say that the JIT is failing to inline some calls causing the performance
disparity.
Of course, it doesn't help that an update in mutable.HashMap involves calling
HashTable.findEntry followed by HashTable.addEntry where things like the
hashCode are computed twice, but I don't think should affect the current
benchmarks much (since String caches the hashCode). It seems like an easy
improvement to make though.
I was also curious how much of an effect rehashings were having in the update
scores. So, I set the initial size to be large enough to avoid any rehashings.
For jcl.HashMap, I just passed it as the constructor parameter and it got a nice
improvement (from 375ms to 272 ms). For mutable.HashMap I had to create an
anonymous inner class and override initialSize. To my surprise it was
substantially slower (832ms to 1307ms). Unless there's a bug somewhere, it seems
like the creation of the anonymous inner class is causing even more missed
optimisations. Maybe it's causing some call site to become bimorphic or
megamorphic, but haven't looked at the full benchmark in detail.
Ismael