poor implementation of hashCode of LinkedHashMap #2733
Comments
Thank you for your finding and the minimal example! What do you mean by "Java implementation is right"? Do you suggest using the hash algorithm used by Java's LinkedHashMap?
I mean that Java's native hashCode implementation for HashMap returns different hash codes for that example. The problem I faced is that I had a grid modeled with a LinkedHashMap. The grid always had the same elements, but in different positions. The current implementation always returns the same hashCode, no matter the positions of the elements. Since I was storing the grids in a Set and all of them had the same hash code, the performance was really bad...
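To illustrate the comparison with the JDK, here is a minimal sketch using java.util.HashMap (not vavr) on the same two maps. The class name `JavaMapHashDemo` and the helper `mapOf` are made up for this example. The JDK documents a map's hash code as the sum of its entries' hash codes, and an entry's hash code as the key hash XOR the value hash, so swapping the values changes the result while swapping the insertion order does not.

```java
import java.util.HashMap;
import java.util.Map;

public class JavaMapHashDemo {
    // Hypothetical helper: builds a java.util.HashMap with two entries.
    static Map<String, Integer> mapOf(String k1, int v1, String k2, int v2) {
        Map<String, Integer> m = new HashMap<>();
        m.put(k1, v1);
        m.put(k2, v2);
        return m;
    }

    public static void main(String[] args) {
        Map<String, Integer> first = mapOf("a", 1, "b", 2);
        Map<String, Integer> swappedValues = mapOf("a", 2, "b", 1);
        Map<String, Integer> swappedOrder = mapOf("b", 2, "a", 1);

        // ("a" ^ 1) + ("b" ^ 2) = 96 + 96
        System.out.println(first.hashCode());         // 192
        // ("a" ^ 2) + ("b" ^ 1) = 99 + 99
        System.out.println(swappedValues.hashCode()); // 198
        // Same entries as `first`, inserted in a different order.
        System.out.println(swappedOrder.hashCode());  // 192
    }
}
```

So the JDK map hash is also independent of iteration order; what distinguishes it in this example is only that the key-value pairing affects the result.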
I took a stab at this, but I'm not sure if it's the right way to "fix it".
@jarlah You cannot change hashCode without changing equals. The results of these two methods must agree. The contract for these two methods is defined in java.lang.Object. So always change them both, and you are good. I believe there is a contradiction between the API of the interface Traversable and the classes that implement it and their unit tests. (?) The interface Traversable states that hashCode and equals behave differently for collections with predictable iteration order and for collections with arbitrary iteration order. However, TreeMap and LinkedHashMap implement hashCode and equals for unordered collections: Maybe I am just confused about the Javadoc in Traversable? TreeMap.isOrdered() returns true, TreeMap.isSequential() returns false. I believe this makes TreeMap a collection with predictable iteration order. LinkedHashMap.isOrdered() returns false, LinkedHashMap.isSequential() returns true. I believe this makes LinkedHashMap a collection with predictable iteration order.
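The distinction between order-sensitive and order-insensitive equals/hashCode pairs can be seen in the JDK's own collections. The sketch below uses java.util.List and java.util.Set (Java 9+), not vavr; the class name `OrderContractDemo` and the helper `consistent` are made up for illustration. It only checks the java.lang.Object rule that equal objects must report equal hash codes.

```java
import java.util.List;
import java.util.Set;

public class OrderContractDemo {
    // The java.lang.Object contract: if a.equals(b), the hash codes must match.
    static boolean consistent(Object a, Object b) {
        return !a.equals(b) || a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // Sequences: equals and hashCode both depend on element order.
        List<Integer> l1 = List.of(1, 2);
        List<Integer> l2 = List.of(2, 1);
        System.out.println(l1.equals(l2));                  // false
        System.out.println(l1.hashCode() == l2.hashCode()); // false (994 vs 1024)

        // Sets: equals and hashCode both ignore element order.
        Set<Integer> s1 = Set.of(1, 2);
        Set<Integer> s2 = Set.of(2, 1);
        System.out.println(s1.equals(s2));                  // true
        System.out.println(s1.hashCode() == s2.hashCode()); // true (both 3)

        // Either pairing is valid as long as the two methods agree.
        System.out.println(consistent(l1, l2) && consistent(s1, s2)); // true
    }
}
```

Making hashCode order-sensitive while leaving equals order-insensitive would break this agreement, which is why both methods have to change together.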
Yes, of course. What I did was just make a PR highlighting how easy it was to change the behaviour. But it is definitely not ready.
@jarlah What do you think about the Javadoc in Traversable.hashCode and Traversable.equals?
I think vavr is doing itself a disservice by redefining such concepts. But I guess it's OK, because vavr's collections will only be used with and compared to other vavr collections. I see your point, @wrandelshofer, that the doc talks about predictable iteration order and arbitrary iteration order. But I cannot say what it means, or whether you are right. What I can say, however, is that to fix equals in my PR I would need to sort both collections and compare the two sorted collections. I don't think that's very efficient, and I'm worried about the performance implications.
Yes, I cannot tell whether it is correct or incorrect. I would err on the side of the implementation, and assume that hashCode/equals depends on the collection type: Set, Seq, or Map. I personally would not go in the direction of your fix. It is convenient to be able to check sets and maps for equality/hashCode regardless of their iteration order. But I am not the designer. So it is a viable design direction, of course.
For me, ideally vavr collections can be swapped in and out with java.util collections. The only difference being that the vavr collections have persistent mutability, and have no API methods that can throw UnsupportedOperationException.
I understand that the hash is computed via the underlying state, but to me the confusion arises in how those two hash maps, {"a": 1, "b": 2} and {"a": 2, "b": 1}, have the same underlying state. Wouldn't the keys and the values be associated with each other, regardless of any ordering concerns? In that case, why wouldn't the hash code calculation take the key-value relation into consideration? Playing with this a little because I'm bored and stalling on my "real" work, I can confirm that the only thing that seems to matter for the hash code is that all the keys and values are included; the ordering of the keys doesn't matter. So {"b": 1, "a": 2} would have the same hash code as the other examples. This also isn't an edge case for a 2-pair map; I wasn't expecting that it was, but I tried it with 3 pairs with the same results. It also doesn't appear to be a fluke where those permutations just happen to generate the same hash code, which would be unlikely, but there's no reason it would be impossible. It's acceptable, and even expected, that random states for a given object will have the same hash code, even if they're not equal; the important thing is that they have the same hash code when they are equal. So you could always return 1 as your hash code and you'd meet the contract requirements. You'd just have horrible performance if you used it as a key in a hash map. (And there are probably other places that use the hash code, but that's the obvious one at the moment.) From @wrandelshofer:
It's a little more subtle than this; in this particular case, where it appears that As an aside, while
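The remark about always returning 1 as the hash code can be sketched as follows. The `Key` class and `distinctCount` helper are hypothetical, written only for this illustration. The constant hash satisfies the equals/hashCode contract and the set still behaves correctly; the cost is that every element collides, so lookups have to fall back on equals comparisons within a single bucket.

```java
import java.util.HashSet;
import java.util.Set;

public class ConstantHashDemo {
    // Hypothetical key type whose hashCode is legal but maximally colliding.
    static final class Key {
        final int id;

        Key(int id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).id == id;
        }

        @Override
        public int hashCode() {
            return 1; // Equal keys trivially share this hash, so the contract holds.
        }
    }

    // Adds each of n distinct keys twice and reports how many the set kept.
    static int distinctCount(int n) {
        Set<Key> keys = new HashSet<>();
        for (int i = 0; i < n; i++) {
            keys.add(new Key(i));
            keys.add(new Key(i)); // Duplicate is rejected via equals, not via the hash.
        }
        return keys.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount(1000)); // 1000
    }
}
```

Correctness is unaffected; only the time per operation degrades as the number of colliding elements grows, which matches the slow Set of grids described earlier in the thread.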
System.out.println(LinkedHashMap.of("a", 1, "b", 2).hashCode());
System.out.println(LinkedHashMap.of("a", 2, "b", 1).hashCode());
Both maps have the same hashCode, and this is wrong. I found this bug while participating in Advent of Code; an algorithm was extremely slow because of it. Java's implementation is right.
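One way such a collision can arise is sketched below. This is an assumption made for illustration, not a confirmed description of vavr's code: if an order-insensitive map hash sums per-entry hashes, and each entry hash is a linear combination of the key hash and value hash (here the hypothetical `linearEntryHash`, 31 * key + value), then permuting the values over the same keys leaves the sum unchanged. The JDK-style XOR entry hash (`xorEntryHash`) separates the two maps in this example, although it is not collision-free in general. The maps are java.util.LinkedHashMap instances, not vavr ones.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToIntFunction;

public class EntryHashSketch {
    // Hypothetical entry hash that is linear in the key and value hashes.
    static int linearEntryHash(Map.Entry<String, Integer> e) {
        return 31 * e.getKey().hashCode() + e.getValue().hashCode();
    }

    // Entry hash as documented for java.util.Map.Entry: key hash XOR value hash.
    static int xorEntryHash(Map.Entry<String, Integer> e) {
        return e.getKey().hashCode() ^ e.getValue().hashCode();
    }

    // Order-insensitive map hash: the sum of the entry hashes.
    static int unorderedHash(Map<String, Integer> m,
                             ToIntFunction<Map.Entry<String, Integer>> entryHash) {
        int sum = 0;
        for (Map.Entry<String, Integer> e : m.entrySet()) {
            sum += entryHash.applyAsInt(e);
        }
        return sum;
    }

    // Hypothetical helper: builds a two-entry java.util.LinkedHashMap.
    static Map<String, Integer> mapOf(String k1, int v1, String k2, int v2) {
        Map<String, Integer> m = new LinkedHashMap<>();
        m.put(k1, v1);
        m.put(k2, v2);
        return m;
    }

    public static void main(String[] args) {
        Map<String, Integer> a = mapOf("a", 1, "b", 2);
        Map<String, Integer> b = mapOf("a", 2, "b", 1);

        // Linear entry hash: 31 * (97 + 98) + (1 + 2) for both maps.
        System.out.println(unorderedHash(a, EntryHashSketch::linearEntryHash)); // 6048
        System.out.println(unorderedHash(b, EntryHashSketch::linearEntryHash)); // 6048

        // XOR entry hash: the pairing of key and value changes the result.
        System.out.println(unorderedHash(a, EntryHashSketch::xorEntryHash));    // 192
        System.out.println(unorderedHash(b, EntryHashSketch::xorEntryHash));    // 198
    }
}
```

With the linear variant, every permutation of the same values over the same keys hashes identically, which would explain a Set of rearranged grids collapsing into one bucket.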