[scala] Collections performance
So as part of my project to write a replacement immutable HashMap
implementation I've been benchmarking like crazy. I had previously
been comparing to the standard immutable.HashMap implementation but
earlier during a bored moment when I couldn't face any more SVG I
decided to widen the scope of my benchmarks and include a bunch of
other map implementations.
The attached are a set of these bench marks, testing a few common
operations on some standard map implementations. These are mainly
interesting for their comparison of the status quo rather than my
implementation (although I'm reasonably pleased with the numbers on
mine). In particular I compared the standard immutable hash map, the
standard mutable hash map, the standard TreeMap and the JCL HashMap
for mapping Strings to Strings.
The numbers are a little sad. They can be summarised as follows:
- TreeMap is slooow (this might be because string comparison is
relatively expensive, but in some other tests I ran earlier on integer
keys it did pretty terribly as well).
- jcl.HashMap basically blows all the others out of the water -
typically by a factor of about 5.
So all the collections we typically use are substantially slower than
their Java equivalents. This is a little unfortunate.
One other point of interest (not really in this context, as it's
dramatically not what maps are supposed to be for) is that the scala
variants hold their own much more in foreach performance - a lot of
the Scala collections have internal iteration methods which don't use
an iterator, and these tend to be faster (indeed my HashMap's foreach
method is the only instance of the JCL collection losing).
So, basically we're typically using collections that are a lot slower
than they need to be.
This isn't anyone's fault, particularly not the EPFL team's. The
java.util collections have had a lot of time spent on optimising them
over the last N years, and having a blazingly fast collections API has
quite correctly not been the top priority for Scala. But at the same
time, it would be nice to have it fixed. :-) The best way to do this
is probably community involvement (likely via scalax. I should really
get around to sending in a CLA for that...). In the mean time, maybe
we should think about using the jcl collections a bit more (at least
when we need high performance mutable collections) - they're not
really significantly worse to use than the standard mutable
collections API, thanks to the nice higher order wrapper the package
provides around the underlying java.util stuff.
That being said, I'm not really proposing anything concrete in this
email. It's mostly just a heads up to let people know what I've found.
My HashMap: update Normal HashMap: update Mutable HashMap: update TreeMap: update JCL Hashmap: update
--------------------------------------------------------------------------------------------------------------
2443 1526 1622 9104 566
2324 1501 1568 8909 536
--------------------------------------------------------------------------------------------------------------
2436 1485 1560 9106 523
2256 1439 1553 8819 522
2245 1443 1532 8769 519
2236 1453 1540 8787 519
2242 1458 1557 8780 563
2283 1464 1559 8850 543
2300 1465 1557 8893 525
2272 1461 1562 8872 529
2252 1461 1551 8834 522
2488 1623 1552 8881 553
--------------------------------------------------------------------------------------------------------------
2301 ms 1475 ms 1552 ms 8859 ms 532 ms
0 M/s 0 M/s 0 M/s 0 M/s 0 M/s
100 % 155 % 148 % 25 % 432 %
--------------------------------------------------------------------------------------------------------------
My HashMap: get Normal HashMap: get Mutable HashMap: get TreeMap: get JCL HashMap: get
-----------------------------------------------------------------------------------------------
1980 2138 2413 7012 448
1936 2305 2315 6969 408
-----------------------------------------------------------------------------------------------
1937 2275 2396 6945 407
1925 2269 2376 6953 404
1922 2268 2375 6904 402
1908 2263 2379 6905 420
1951 2248 2374 6879 405
1917 2258 2388 6895 404
1918 2250 2367 6917 403
1916 2255 2376 6940 403
1917 2274 2394 6916 412
1939 2250 2384 6881 405
-----------------------------------------------------------------------------------------------
1925 ms 2261 ms 2381 ms 6913 ms 406 ms
0 M/s 0 M/s 0 M/s 0 M/s 0 M/s
100 % 85 % 80 % 27 % 473 %
-----------------------------------------------------------------------------------------------
My HashMap: apply Normal HashMap: apply TreeMap: apply Mutable HashMap: apply JCL HashMap: apply
---------------------------------------------------------------------------------------------------------
1018 1244 3179 1235 257
942 1233 3173 1218 233
---------------------------------------------------------------------------------------------------------
945 1237 3145 1250 241
945 1232 3144 1218 233
939 1232 3143 1223 235
944 1235 3171 1234 233
941 1225 3153 1216 233
941 1258 3139 1232 261
964 1225 3167 1220 232
934 1230 3178 1226 234
951 1246 3147 1232 234
939 1236 3163 1219 232
---------------------------------------------------------------------------------------------------------
944 ms 1236 ms 3155 ms 1227 ms 237 ms
0 M/s 0 M/s 0 M/s 0 M/s 0 M/s
100 % 76 % 29 % 76 % 397 %
---------------------------------------------------------------------------------------------------------
My HashMap: foreach Normal HashMap: foreach Mutable HashMap: foreach JCL HashMap: foreach TreeMap: foreach
-------------------------------------------------------------------------------------------------------------------
258 884 879 360 371
233 841 839 329 348
-------------------------------------------------------------------------------------------------------------------
235 841 843 330 349
234 842 841 332 348
233 850 841 330 349
235 844 839 331 366
252 857 841 327 348
233 839 837 329 349
235 842 837 328 348
232 837 838 329 354
235 780 801 327 346
232 802 800 328 354
-------------------------------------------------------------------------------------------------------------------
236 ms 833 ms 832 ms 329 ms 351 ms
0 M/s 0 M/s 0 M/s 0 M/s 0 M/s
100 % 28 % 28 % 71 % 67 %
-------------------------------------------------------------------------------------------------------------------
My HashMap: elements.foreach Normal HashMap: elements.foreach TreeMap: elements.foreach Mutable HashMap: elements.foreach JCL HashMap: elements.foreach
----------------------------------------------------------------------------------------------------------------------------------------------------------------
460 780 1650 804 338
408 777 1588 803 266
----------------------------------------------------------------------------------------------------------------------------------------------------------------
407 780 1607 808 265
408 781 1580 808 264
406 776 1580 820 265
408 777 1579 808 266
408 784 1578 807 264
408 777 1584 807 265
407 776 1577 812 266
406 778 1576 806 265
407 778 1582 809 264
408 777 1564 805 264
----------------------------------------------------------------------------------------------------------------------------------------------------------------
407 ms 778 ms 1581 ms 809 ms 265 ms
0 M/s 0 M/s 0 M/s 0 M/s 0 M/s
100 % 52 % 25 % 50 % 153 %
----------------------------------------------------------------------------------------------------------------------------------------------------------------