equals in terms of equalHashValues?

View: New views
3 Messages — Rating Filter:   Alert me  

equals in terms of equalHashValues?

by Landei :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi!

I found this comment in the new Hashable trait in the trunk:

* If you define equals in terms of equalHashValues then your hashCode and equals
* methods will never be out of sync.  Something like:
*
* override def equals(other: Any) = other match {
*    case x: YourClass => this equalHashValues x
*    case _            => false
* }

While I think it's a good idea to have convenient hash code support, I think this advice is misleadiung and dangerous: Of course your equals and your hashCode is "in synch" if you follow this advice, but this is certainly not what you want (in 99% of the cases). I mean if you have a large number of objects, there is a realistic chance for a hash-clash: There are only 2^32 Ints. And this could become a really ugly bug to hunt.

This is not only a theoretical question, as Range already *uses* an equals method as described. So I wrote a little test to count the clashes for Range (please correct me if I didn't understood the algorithm correctly):

def hash_code(values:List[Int]): Int = values.foldLeft(0)((x, y) => x * 41 + y)

def main(args:Array[String]) {
    var codes = Map[Int, (Int,Int,Int)]()
    var count = 0
    for(upper <- 2 to 50; lower <- 1 to upper-1; step <- 1 to 5) {
            val code = hash_code(List(lower,upper,step))
            codes.get(code) match {
               case None => codes += code -> ((lower,upper,step))
               case Some(x) => println("Equal hashCode " + code + " for " + x +
                                       " and " + ((lower,upper,step)))
                   count += 1
           }
    }
    println("" + count + " clashes")
}

For the given tiny range 1 <= lower < upper <= 50, 1 <= step <= 5 I got 140 clashes:
Equal hashCode 3486 for (2,3,1) and (1,44,1)
Equal hashCode 3487 for (2,3,2) and (1,44,2)
Equal hashCode 3488 for (2,3,3) and (1,44,3)
...

I wouldn't call Range "hashable" (as in: "Suited as key for a HashMap") anymore.

Cheers,
Daniel

Re: equals in terms of equalHashValues?

by Paul Phillips-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Thu, Jun 11, 2009 at 01:03:59PM -0700, Landei wrote:
> This is not only a theoretical question, as Range already *uses* an
> equals method as described. So I wrote a little test to count the
> clashes for Range (please correct me if I didn't understood the
> algorithm correctly):

For sure that trait is highly experimental (I just have way too much
code in local branches and am attempting to work some of it into trunk.)

That said, I'm not sure what you think is happening in there, but who
cares if unequal ranges have equal hashcodes? Unequal objects have equal
hashcodes all the time.

Equality is defined in terms of the list of values you provide.  This is
exactly like it's done in case classes and oh roughly everything.

  // you define this
  protected def hashValues: List[Any]

And then equals is defined as: hashValues == other.hashValues.

Also, that isn't Range, it's GenericRange.

--
Paul Phillips      | It is hard to believe that a man is
In Theory          | telling the truth when you know that you
Empiricist         | would lie if you were in his place.
slap pi uphill!    |     -- H. L. Mencken

Re: equals in terms of equalHashValues?

by Landei :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Paul Phillips wrote:
On Thu, Jun 11, 2009 at 01:03:59PM -0700, Landei wrote:
> This is not only a theoretical question, as Range already *uses* an
> equals method as described. So I wrote a little test to count the
> clashes for Range (please correct me if I didn't understood the
> algorithm correctly):

For sure that trait is highly experimental (I just have way too much
code in local branches and am attempting to work some of it into trunk.)

That said, I'm not sure what you think is happening in there, but who
cares if unequal ranges have equal hashcodes? Unequal objects have equal
hashcodes all the time.

Equality is defined in terms of the list of values you provide.  This is
exactly like it's done in case classes and oh roughly everything.

  // you define this
  protected def hashValues: List[Any]

And then equals is defined as: hashValues == other.hashValues.

Also, that isn't Range, it's GenericRange.

--
Paul Phillips      | It is hard to believe that a man is
In Theory          | telling the truth when you know that you
Empiricist         | would lie if you were in his place.
slap pi uphill!    |     -- H. L. Mencken
Hello Paul,

I'm sorry, it took a while for me to understand where I was off. I looked at the code in Hashable, saw the hashValues list, and then I saw how this list was compared in equalsHashValues. Somehow I came to the wrong conclusion it would just compare *hashCodes*. Of course it just compares the same *values* that are also used for calculating the hashCode.  

Sorry for the noise.

Cheers,
Daniel