Proposal for improving performance of TreeMap and others

View: New views
10 Messages — Rating Filter:   Alert me  

Proposal for improving performance of TreeMap and others

by cowwoc :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

I noticed that TreeMap (and maybe other classes) require a user to either pass in a Comparator or ensure that all keys must implement Comparable. The TreeMap code then uses a utility method whenever it needs to compare two keys:


/**
 * Compares two keys using the correct comparison method for this TreeMap.
 */
final int compare(Object k1, Object k2) {
  return comparator == null ? ((Comparable<? super  K>) k1)
  .compareTo((K) k2) : comparator.compare((K) k1, (K) k2);
}

The problem with the above method is that it checks whether comparator is null once per comparison instead of once when the TreeMap is constructed. Instead I propose that this check only take place once in the constructors and the rest of the code assume that a comparator exists. If a comparator is not provided then you can simply define one as follows:

comparator = new Comparator<K>()
    {
      @SuppressWarnings("unchecked")
      public int compare(K first, K second)
      {
        return ((Comparable<K>) first).compareTo(second);
      }
    });

This solution should be backwards compatible while improving performance. At least, that's my guess. There is always the chance that the JIT is smart enough to optimize away this comparison but I'd rather not rely on JIT implementation details. I also believe the resulting code is more readable.

What do you think?

Gili

Re: Proposal for improving performance of TreeMap and others

by Bugzilla from linuxhippy@gmail.com :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

> This solution should be backwards compatible while improving performance. At
> least, that's my guess. There is always the chance that the JIT is smart
> enough to optimize away this comparison but I'd rather not rely on JIT
> implementation details. I also believe the resulting code is more readable.
>
> What do you think?

>From the performance-overview theres no (real) difference I guess, but
I have to agree that the code is more readable and cleaner.
On the other hand its one more class that has to be shipped and loaded
at startup.
I like your approach, I just don't know the real pros and cons...

lg Clemens

Re: Proposal for improving performance of TreeMap and others

by Thomas Hawtin-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

cowwoc wrote:
> I noticed that TreeMap (and maybe other classes) require a user to either
> pass in a Comparator or ensure that all keys must implement Comparable. The
> TreeMap code then uses a utility method whenever it needs to compare two
> keys:

I'm not going to comment about performance, but there is a problem with
serialisation.

TreeMap.comparator is final (and non-transient).

TreeMaps serialised with earlier versions will be deserialised with null
comparator. So, comparator would either need to be made non-final or
sun.misc.Unsafe used.

For the serialisation case, it would be necessary to change writeObject
to use putFields rather than defaultWriteObject (not very nice, but not
half as nasty as I originally thought).

Tom Hawtin

Re: Proposal for improving performance of TreeMap and others

by Martin Buchholz :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

The authors of TreeMap have thought about
eliding comparator null checks:


    /**
     * Version of getEntry using comparator. Split off from getEntry
     * for performance. (This is not worth doing for most methods,
     * that are less dependent on comparator performance, but is
     * worthwhile here.)
     */
    final Entry<K,V> getEntryUsingComparator(Object key) {
        K k = (K) key;
        Comparator<? super K> cpr = comparator;
        if (cpr != null) {
            Entry<K,V> p = root;
            while (p != null) {
                int cmp = cpr.compare(k, p.key);
                if (cmp < 0)
                    p = p.left;
                else if (cmp > 0)
                    p = p.right;
                else
                    return p;
            }
        }
        return null;
    }

As to whether using an explicit Comparator for the "natural ordering"
is a performance improvement, this depends very much on
the implementation of the JIT and the degree of polymorphism of
the call site, and on the prevalance of TreeMaps using "natural
ordering".  At the very least, a null check is very cheap, so it is
unlikely that the proposed change will be a significant performance
improvement, while, on the other hand, there is a good chance that
it will decrease performance for TreeMaps using "natural ordering".

Aside: It's probably a good idea for the comparator for
"natural ordering" to be available via some static method.

Martin


cowwoc wrote:

> I noticed that TreeMap (and maybe other classes) require a user to either
> pass in a Comparator or ensure that all keys must implement Comparable. The
> TreeMap code then uses a utility method whenever it needs to compare two
> keys:
>
>
> /**
>  * Compares two keys using the correct comparison method for this TreeMap.
>  */
> final int compare(Object k1, Object k2) {
>   return comparator == null ? ((Comparable<? super  K>) k1)
>   .compareTo((K) k2) : comparator.compare((K) k1, (K) k2);
> }
>
> The problem with the above method is that it checks whether comparator is
> null once per comparison instead of once when the TreeMap is constructed.
> Instead I propose that this check only take place once in the constructors
> and the rest of the code assume that a comparator exists. If a comparator is
> not provided then you can simply define one as follows:
>
> comparator = new Comparator<K>()
>     {
>       @SuppressWarnings("unchecked")
>       public int compare(K first, K second)
>       {
>         return ((Comparable<K>) first).compareTo(second);
>       }
>     });
>
> This solution should be backwards compatible while improving performance. At
> least, that's my guess. There is always the chance that the JIT is smart
> enough to optimize away this comparison but I'd rather not rely on JIT
> implementation details. I also believe the resulting code is more readable.
>
> What do you think?

Re: Proposal for improving performance of TreeMap and others

by cowwoc :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

I guess you're right. It is probably as likely that the JIT will optimize away the null check as it is that it will optimize away the NullPointerException check. One exception, though, is when production systems run using -Xverify:none. In such a case, wouldn't my approach run faster?

I still think that my proposed code is somehow more consistent/cleaner on a design-level but I guess that's just me :)

As an aside, are there standard benchmarks for testing the impact of this change? I'd love to know whether it actually produces any performance difference in practice.

Gili


Martin Buchholz wrote:
The authors of TreeMap have thought about
eliding comparator null checks:


    /**
     * Version of getEntry using comparator. Split off from getEntry
     * for performance. (This is not worth doing for most methods,
     * that are less dependent on comparator performance, but is
     * worthwhile here.)
     */
    final Entry<K,V> getEntryUsingComparator(Object key) {
        K k = (K) key;
        Comparator<? super K> cpr = comparator;
        if (cpr != null) {
            Entry<K,V> p = root;
            while (p != null) {
                int cmp = cpr.compare(k, p.key);
                if (cmp < 0)
                    p = p.left;
                else if (cmp > 0)
                    p = p.right;
                else
                    return p;
            }
        }
        return null;
    }

As to whether using an explicit Comparator for the "natural ordering"
is a performance improvement, this depends very much on
the implementation of the JIT and the degree of polymorphism of
the call site, and on the prevalance of TreeMaps using "natural
ordering".  At the very least, a null check is very cheap, so it is
unlikely that the proposed change will be a significant performance
improvement, while, on the other hand, there is a good chance that
it will decrease performance for TreeMaps using "natural ordering".

Aside: It's probably a good idea for the comparator for
"natural ordering" to be available via some static method.

Martin


cowwoc wrote:
> I noticed that TreeMap (and maybe other classes) require a user to either
> pass in a Comparator or ensure that all keys must implement Comparable. The
> TreeMap code then uses a utility method whenever it needs to compare two
> keys:
>
>
> /**
>  * Compares two keys using the correct comparison method for this TreeMap.
>  */
> final int compare(Object k1, Object k2) {
>   return comparator == null ? ((Comparable<? super  K>) k1)
>   .compareTo((K) k2) : comparator.compare((K) k1, (K) k2);
> }
>
> The problem with the above method is that it checks whether comparator is
> null once per comparison instead of once when the TreeMap is constructed.
> Instead I propose that this check only take place once in the constructors
> and the rest of the code assume that a comparator exists. If a comparator is
> not provided then you can simply define one as follows:
>
> comparator = new Comparator<K>()
>     {
>       @SuppressWarnings("unchecked")
>       public int compare(K first, K second)
>       {
>         return ((Comparable<K>) first).compareTo(second);
>       }
>     });
>
> This solution should be backwards compatible while improving performance. At
> least, that's my guess. There is always the chance that the JIT is smart
> enough to optimize away this comparison but I'd rather not rely on JIT
> implementation details. I also believe the resulting code is more readable.
>
> What do you think?

Re: Proposal for improving performance of TreeMap and others

by Bugzilla from linuxhippy@gmail.com :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi cowwoc,

> I guess you're right. It is probably as likely that the JIT will optimize
> away the null check as it is that it will optimize away the
> NullPointerException check. One exception, though, is when production
> systems run using -Xverify:none. In such a case, wouldn't my approach run
> faster?
I don't think it will optimize the null-check away, however it is so
cheap that it most likely will not weight at all, compared to all the
other operations happening there. Its maybe 5 instructions compared to
thousands or even more.
-Xverify:none only disables bytecode verification at class-loading
time and has no influence (as far as I know) on the performance of the
generated code.

> I still think that my proposed code is somehow more consistent/cleaner on a
> design-level but I guess that's just me :)
I also like it more, its cleaner in my opinion :)

> As an aside, are there standard benchmarks for testing the impact of this
> change? I'd love to know whether it actually produces any performance
> difference in practice.
>From my experience i would rather guess that you won't notice the
change, noise will be higher.

lg Clemens

Re: Proposal for improving performance of TreeMap and others

by Remi Forax :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Clemens Eisserer a écrit :

> Hi cowwoc,
>
>  
>> I guess you're right. It is probably as likely that the JIT will optimize
>> away the null check as it is that it will optimize away the
>> NullPointerException check. One exception, though, is when production
>> systems run using -Xverify:none. In such a case, wouldn't my approach run
>> faster?
>>    
> I don't think it will optimize the null-check away,
Hotspot  removes nullcheck and install a signal handler since its v2
(around 2000/01 If my memory serves me well).

> however it is so
> cheap that it most likely will not weight at all, compared to all the
> other operations happening there. Its maybe 5 instructions compared to
> thousands or even more.
> -Xverify:none only disables bytecode verification at class-loading
> time and has no influence (as far as I know) on the performance of the
> generated code.
>  
yes, and there is an option to remove nullcheck that is only available
on debug VM.
...
>
> lg Clemens
>  
Rémi

Re: Proposal for improving performance of TreeMap and others

by cowwoc :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Something very weird is going on. I tried profiling a minimal testcase and there is a considerable amount of "missing time". I am using a dev build of Netbeans 6.1 and it says:

MyComparator.compare(Object, Object) 19670ms
\-> MyComparator.compare(Integer, Integer) 10229ms
  \-> Self Time 3001ms
  \-> Integer.compareTo(Integer) 1575ms
\-> Self Time 3788ms

I spot at least three problems:

1) The individual item times do not add up to the total (but they do for other stack-traces).
2) Comparator.compare() self-time consumes more CPU than Integer.compareTo() even though it only invokes a method while the latter does actual computation.
3) Why is extra time consumed moving from MyComparator.compare(Object, Object) to (Integer, Integer)? It looks like Generics is doing something at runtime which consumes a large amount of cpu.

Gili

Clemens Eisserer wrote:
Hi cowwoc,

> I guess you're right. It is probably as likely that the JIT will optimize
> away the null check as it is that it will optimize away the
> NullPointerException check. One exception, though, is when production
> systems run using -Xverify:none. In such a case, wouldn't my approach run
> faster?
I don't think it will optimize the null-check away, however it is so
cheap that it most likely will not weight at all, compared to all the
other operations happening there. Its maybe 5 instructions compared to
thousands or even more.
-Xverify:none only disables bytecode verification at class-loading
time and has no influence (as far as I know) on the performance of the
generated code.

> I still think that my proposed code is somehow more consistent/cleaner on a
> design-level but I guess that's just me :)
I also like it more, its cleaner in my opinion :)

> As an aside, are there standard benchmarks for testing the impact of this
> change? I'd love to know whether it actually produces any performance
> difference in practice.
>From my experience i would rather guess that you won't notice the
change, noise will be higher.

lg Clemens

Re: Proposal for improving performance of TreeMap and others

by charlie hunt-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

It's likely what you are observing in #2 & #3 and possibly in #1 also is
an artifact of inlining and possibly other JIT (dynamic) compiler
optimizations.

You might consider re-running your experiment with inlining disabled,
-XX:-Inlining.

Or, alternatively try running your experiment (with inlining enabled)
with Sun Studio Collector / Analyzer.  Then, when viewing the results in
the Analyzer, filter (View > Filter Data), the samples so that you are
looking at a portion of samples after the code is warmed up.  And, also
look at the results in machine view mode (View > Set Data Presentation >
Formats > View Mode > Machine).   NOTE: In machine mode you can also
view the generated assembly code for each method.  So, you can really
get down to the specifics of what's being executed.

Fwiw, I did a comparison run of a TreeMap with your suggested changes
including removing "if (comparator == null)" checks with one of our
favorite SPEC benchmarks which does a pretty good job at exercising
TreeMap.compare().  Even with 18 degrees of freedom I found the changes
to have no significant improvement. I didn't look at, or compare the
generated assembly code for both versions TreeMap.compare().  Though
that might be kind of interesting.

So, from a performance perspective, it appears this SPEC benchmark shows
no change in performance.

hths,

charlie ...

cowwoc wrote:

> Something very weird is going on. I tried profiling a minimal testcase and
> there is a considerable amount of "missing time". I am using a dev build of
> Netbeans 6.1 and it says:
>
> MyComparator.compare(Object, Object) 19670ms
> \-> MyComparator.compare(Integer, Integer) 10229ms
>   \-> Self Time 3001ms
>   \-> Integer.compareTo(Integer) 1575ms
> \-> Self Time 3788ms
>
> I spot at least three problems:
>
> 1) The individual item times do not add up to the total (but they do for
> other stack-traces).
> 2) Comparator.compare() self-time consumes more CPU than Integer.compareTo()
> even though it only invokes a method while the latter does actual
> computation.
> 3) Why is extra time consumed moving from MyComparator.compare(Object,
> Object) to (Integer, Integer)? It looks like Generics is doing something at
> runtime which consumes a large amount of cpu.
>
> Gili
>
>
> Clemens Eisserer wrote:
>  
>> Hi cowwoc,
>>
>>    
>>> I guess you're right. It is probably as likely that the JIT will optimize
>>> away the null check as it is that it will optimize away the
>>> NullPointerException check. One exception, though, is when production
>>> systems run using -Xverify:none. In such a case, wouldn't my approach run
>>> faster?
>>>      
>> I don't think it will optimize the null-check away, however it is so
>> cheap that it most likely will not weight at all, compared to all the
>> other operations happening there. Its maybe 5 instructions compared to
>> thousands or even more.
>> -Xverify:none only disables bytecode verification at class-loading
>> time and has no influence (as far as I know) on the performance of the
>> generated code.
>>
>>    
>>> I still think that my proposed code is somehow more consistent/cleaner on
>>> a
>>> design-level but I guess that's just me :)
>>>      
>> I also like it more, its cleaner in my opinion :)
>>
>>    
>>> As an aside, are there standard benchmarks for testing the impact of this
>>> change? I'd love to know whether it actually produces any performance
>>> difference in practice.
>>>      
>> >From my experience i would rather guess that you won't notice the
>> change, noise will be higher.
>>
>> lg Clemens
>>
>>
>>    
>
>  


Re: Proposal for improving performance of TreeMap and others

by cowwoc :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


        That's good news, I guess ;) because in my minimal testcase that had
nothing to do with TreeMap it looked like using a Comparator to wrap
natural ordering degraded performance by an order of magnitude... which
is really bad :)

        If the same isn't true for the actual TreeMap this change might be
worth considering for its code-cleanup potential.

Gili

charlie hunt wrote:

> It's likely what you are observing in #2 & #3 and possibly in #1 also is
> an artifact of inlining and possibly other JIT (dynamic) compiler
> optimizations.
>
> You might consider re-running your experiment with inlining disabled,
> -XX:-Inlining.
>
> Or, alternatively try running your experiment (with inlining enabled)
> with Sun Studio Collector / Analyzer.  Then, when viewing the results in
> the Analyzer, filter (View > Filter Data), the samples so that you are
> looking at a portion of samples after the code is warmed up.  And, also
> look at the results in machine view mode (View > Set Data Presentation >
> Formats > View Mode > Machine).   NOTE: In machine mode you can also
> view the generated assembly code for each method.  So, you can really
> get down to the specifics of what's being executed.
>
> Fwiw, I did a comparison run of a TreeMap with your suggested changes
> including removing "if (comparator == null)" checks with one of our
> favorite SPEC benchmarks which does a pretty good job at exercising
> TreeMap.compare().  Even with 18 degrees of freedom I found the changes
> to have no significant improvement. I didn't look at, or compare the
> generated assembly code for both versions TreeMap.compare().  Though
> that might be kind of interesting.
>
> So, from a performance perspective, it appears this SPEC benchmark shows
> no change in performance.
>
> hths,
>
> charlie ...
>
> cowwoc wrote:
>> Something very weird is going on. I tried profiling a minimal testcase
>> and
>> there is a considerable amount of "missing time". I am using a dev
>> build of
>> Netbeans 6.1 and it says:
>>
>> MyComparator.compare(Object, Object) 19670ms
>> \-> MyComparator.compare(Integer, Integer) 10229ms
>>   \-> Self Time 3001ms
>>   \-> Integer.compareTo(Integer) 1575ms
>> \-> Self Time 3788ms
>>
>> I spot at least three problems:
>>
>> 1) The individual item times do not add up to the total (but they do for
>> other stack-traces).
>> 2) Comparator.compare() self-time consumes more CPU than
>> Integer.compareTo()
>> even though it only invokes a method while the latter does actual
>> computation.
>> 3) Why is extra time consumed moving from MyComparator.compare(Object,
>> Object) to (Integer, Integer)? It looks like Generics is doing
>> something at
>> runtime which consumes a large amount of cpu.
>>
>> Gili
>>
>>
>> Clemens Eisserer wrote:
>>  
>>> Hi cowwoc,
>>>
>>>    
>>>> I guess you're right. It is probably as likely that the JIT will
>>>> optimize
>>>> away the null check as it is that it will optimize away the
>>>> NullPointerException check. One exception, though, is when production
>>>> systems run using -Xverify:none. In such a case, wouldn't my
>>>> approach run
>>>> faster?
>>>>      
>>> I don't think it will optimize the null-check away, however it is so
>>> cheap that it most likely will not weight at all, compared to all the
>>> other operations happening there. Its maybe 5 instructions compared to
>>> thousands or even more.
>>> -Xverify:none only disables bytecode verification at class-loading
>>> time and has no influence (as far as I know) on the performance of the
>>> generated code.
>>>
>>>    
>>>> I still think that my proposed code is somehow more
>>>> consistent/cleaner on
>>>> a
>>>> design-level but I guess that's just me :)
>>>>      
>>> I also like it more, its cleaner in my opinion :)
>>>
>>>    
>>>> As an aside, are there standard benchmarks for testing the impact of
>>>> this
>>>> change? I'd love to know whether it actually produces any performance
>>>> difference in practice.
>>>>      
>>> >From my experience i would rather guess that you won't notice the
>>> change, noise will be higher.
>>>
>>> lg Clemens
>>>
>>>
>>>    
>>
>>