Are the faster versions of HashMap and BigInteger going into jdk7?

View: New views
10 Messages — Rating Filter:   Alert me  

Are the faster versions of HashMap and BigInteger going into jdk7?

by i30817 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Just a question.

Re: Are the faster versions of HashMap and BigInteger going into jdk7?

by gnu_andrew :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

2009/10/23 Paulo Levi <i30817@...>:
> Just a question.
>

The only relevant changes I see are:

changeset:   1246:8d2efec31d78
user:        xlu
date:        Sun May 24 16:29:57 2009 -0700
summary:     6622432: RFE: Performance improvements to java.math.BigDecimal

http://hg.openjdk.java.net/jdk7/jdk7/jdk/rev/8d2efec31d78

and

changeset:   48:0487ce0465d6
user:        martin
date:        Mon Mar 10 23:23:48 2008 -0700
summary:     6625725: (coll) modCount should not be volatile

http://hg.openjdk.java.net/jdk7/jdk7/jdk/rev/0487ce0465d6

I presume there are more...?
--
Andrew :-)

Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

Support Free Java!
Contribute to GNU Classpath and the OpenJDK
http://www.gnu.org/software/classpath
http://openjdk.java.net

PGP Key: 94EFD9D8 (http://subkeys.pgp.net)
Fingerprint: F8EF F1EA 401E 2E60 15FA  7927 142C 2591 94EF D9D8

Re: Are the faster versions of HashMap and BigInteger going into jdk7?

by i30817 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Re: Are the faster versions of HashMap and BigInteger going into jdk7?

by gnu_andrew :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

2009/10/23 Paulo Levi <i30817@...>:
> Yes. Different implementations with better performance.
>
> http://gcc.gnu.org/ml/java/2009-06/msg00019.html
>

6622432 is the one of the ones I just pointed to i.e. it's in JDK7.
If Alan has a further patch and hasn't even submitted it for
inclusion, it's obviously not in.

> http://www.mail-archive.com/core-libs-dev@.../msg02147.html

I believe Doug still has a big merge of stuff coming up (including
concurrency updates):

http://openjdk.java.net/projects/jdk7/milestones/

>



--
Andrew :-)

Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

Support Free Java!
Contribute to GNU Classpath and the OpenJDK
http://www.gnu.org/software/classpath
http://openjdk.java.net

PGP Key: 94EFD9D8 (http://subkeys.pgp.net)
Fingerprint: F8EF F1EA 401E 2E60 15FA  7927 142C 2591 94EF D9D8

Re: Are the faster versions of HashMap and BigInteger going into jdk7?

by i30817 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

I don't think it's the same. 6622432 appears to be programming
optimizations for the existing BigInteger. Alan Eliasen version is one
that uses lower complexity algorithms for the larger numbers.

If hashmap is replaced by a more efficient version, that would be
probably a massive win in some real world programs. The entries
apparently are a hotspot for the gc in some utilizations of hashmap.

Re: Are the faster versions of HashMap and BigInteger going into jdk7?

by Ismael Juma :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi Paulo,

Paulo Levi <i30817@...> writes:
> http://www.mail-archive.com/core-libs-dev-0nJqbsLSQw0FDOXUYO6UHQ <at>
public.gmane.org/msg02147.html
 
>From the email you mention:

"I'm not convinced that they are enough better to commit."

To me that sounds like it's not going in at all.

Ismael


Re: Are the faster versions of HashMap and BigInteger going into jdk7?

by Doug Lea :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Ismael Juma wrote:
> [... HashMaps...]
>
>>From the email you mention:
>
> "I'm not convinced that they are enough better to commit."
>
> To me that sounds like it's not going in at all.
>


Although at the end of that mail
(http://mail.openjdk.java.net/pipermail/core-libs-dev/2009-July/002168.html)
I mentioned that there are several less dramatic
changes to the current implementation that seem worthwhile.
In case anyone is interested, a preliminary version with those
changes, that I put together shortly afterward is at
http://gee.cs.oswego.edu/dl/wwwtmp/HashMapV6.java
One of these days/weeks/months I hope to re-check for further
possible improvements and consider committing.

-Doug

Re: Are the faster versions of HashMap and BigInteger going into jdk7?

by Alex Yakovlev :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Doug,

On Sat, Oct 24, 2009 at 2:43 AM, Doug Lea <dl@...> wrote:
> Although at the end of that mail
> (http://mail.openjdk.java.net/pipermail/core-libs-dev/2009-July/002168.html)
> I mentioned that there are several less dramatic
> changes to the current implementation that seem worthwhile.
> In case anyone is interested, a preliminary version with those
> changes, that I put together shortly afterward is at
> http://gee.cs.oswego.edu/dl/wwwtmp/HashMapV6.java
> One of these days/weeks/months I hope to re-check for further
> possible improvements and consider committing.

I have an idea how to improve both yours and mine previous efforts.
I've got no code to show but the idea is to have 3 arrays:

Object[] keyValueTable with key/value mappings.
It's size is 2*capacity so it's based on open hash approach,
and it's lazy-initialized.

int[] indexTable with complex indices. Lowest bits contains next index,
in the middle are hashCode bits, and there's one bit flag indicating that this
cell is from another hash chain. We don't need 'end of list' bit flag -
it can be encoded by having next index bits equal to this cell index.

Entry<K,V>[] overflowTable with HashEntry-based linked list for overflow.
It will be used only for very large maps, or maybe for small too with
load factor >=1,
what do you think? This will be faster (less memory lookups) than storing
all overflow data in a single trie.

In get method we first look into keyValueTable. If it's key part is
null - return null,
if it's referentially equals to given key - return stored value, thus
we'll have only one
random memory lookup. Else we look indexTable, if it's 'another chain'
bit is set -
return null, else compare hash bits, if equals - compare key calling equals.
If they are not equal - check next index until it's equal to current
indicating end-of-list.
If overflowTable is not null also check it (it's like table current HashMap).

By storing next index in open hash approach we'll get linked hash performance
(no 'false' lookups) on gets (but put operation will be slower than in
pure linked hash).
We can use adjacent cells which gives several percent gain as I wrote before.
Defragmentation trick cannot be used in open map approach (and it results
in too complex code) but we can use not only next free index but also previous,
or maybe several steps forward/backward that could be prefetched. This should
be tested on different platforms to figure out optimal behavior.

In put operation we may need to move out other entries from other hash chains,
but we have no space to store previous index to have double linked list,
so we'll need to check possible previous indices - either adjacent or using
some large step or some inversible function - here I'll strongle need
your advice.

Then to insert new key we first check adjacent cell, fill it if it's empty,
it now - take a big step, either fixed or some function-based,
fill it if empry, take another step if not, etc. We can also limit number
of possible steps after which we'll use overflowTable. And if current hash load
is high enough or arrays sizes is at maximum we'll use overflowTable anyway.

To iterate over elements we first iterate over keyValueTable then
overflowTable if it's not null.

To have HashSet use less memory we can have a flag indicating that keyValueTable
contains only keys, but probably we should check if it does not hurt
performance.

For Linked map we'll add two double-element arrrays, one with
next/previous index,
another (initialized only with overflowTable) with pointer to
next/prev HashEntries.
Also in Entry object we'll need to add two index fields and two pointers to link
either to main arrays or to overflow Entry object.

Do you think it's worth trying? This way we'll have the best of your
open-hash-based
previous approach and mine complex indices + using adjacent cells for
prefetching,
having no problem with big load factors and very large maps with
better speed than
single trie for everything needing ~30 random memory lookups (one for each bit).

Alex

Re: Are the faster versions of HashMap and BigInteger going into jdk7?

by Alan Eliasen :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On 10/23/2009 11:01 AM, Andrew John Hughes wrote:
> 6622432 is the one of the ones I just pointed to i.e. it's in JDK7.
> If Alan has a further patch and hasn't even submitted it for
> inclusion, it's obviously not in.

    I had just queried Joe Darcy at Sun about reviewing my patches for
Karatsuba and Toom-Cook multiplication in BigInteger again last week.
His reply was:

    "Yes, I apologize again for the repeated delays; I want to get your
Karatsuba work reviewed and integrated within this JDK 7 milestone (the
next few weeks)."

     So I'm hopeful.  So far, I haven't heard of any substantive issues
with the patches.  They were mostly stylistic.

     As I've stated before, I also have patches that greatly improve the
performance of pow() and toString() (by orders of magnitude) that I have
been holding on to for a long time, and hoping to submit after the
multiplication patches are approved.  (I was asked by Joe Darcy to
submit in smaller patches.)  Hopefully these might get in JDK 7 too.

--
   Alan Eliasen
   eliasen@...
   http://futureboy.us/

Re: Are the faster versions of HashMap and BigInteger going into jdk7?

by gnu_andrew :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

2009/10/27 Alan Eliasen <eliasen@...>:

> On 10/23/2009 11:01 AM, Andrew John Hughes wrote:
>>
>> 6622432 is the one of the ones I just pointed to i.e. it's in JDK7.
>> If Alan has a further patch and hasn't even submitted it for
>> inclusion, it's obviously not in.
>
>   I had just queried Joe Darcy at Sun about reviewing my patches for
> Karatsuba and Toom-Cook multiplication in BigInteger again last week. His
> reply was:
>
>   "Yes, I apologize again for the repeated delays; I want to get your
> Karatsuba work reviewed and integrated within this JDK 7 milestone (the next
> few weeks)."
>
>    So I'm hopeful.  So far, I haven't heard of any substantive issues with
> the patches.  They were mostly stylistic.
>
>    As I've stated before, I also have patches that greatly improve the
> performance of pow() and toString() (by orders of magnitude) that I have
> been holding on to for a long time, and hoping to submit after the
> multiplication patches are approved.  (I was asked by Joe Darcy to submit in
> smaller patches.)  Hopefully these might get in JDK 7 too.
>
> --
>  Alan Eliasen
>  eliasen@...
>  http://futureboy.us/
>

Alan,

I presume this was in private e-mail.  Why was it not just discussed
on this list?  For one thing, you wouldn't then have to reiterate the
conversation!
--
Andrew :-)

Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

Support Free Java!
Contribute to GNU Classpath and the OpenJDK
http://www.gnu.org/software/classpath
http://openjdk.java.net

PGP Key: 94EFD9D8 (http://subkeys.pgp.net)
Fingerprint: F8EF F1EA 401E 2E60 15FA  7927 142C 2591 94EF D9D8