Monday 13 May 2013

Solutions for mixed type Object equals() in Java

Equals have OOP inheritance have an uneasy relationship. Joshua Bloch famously stated that "there is no way to extend an instantiable class and add a value component while preserving the equals contract" (page 38, Effective Java, item 8). He suggests as the only solution always using Composition over Inheritance. But what if you have to stick Inheritance?
I am going to introduce you 5 (!) different solution for convenient comparison. To demonstrate the principles notorious classes Color and ColorPoint are used. You find them in Joshua Bloch's book, in official Java Language Specification or in Martin Odersky's blog so you should be familiar with them. Full implementation source code of all solutions in ready to use examples and unit tests can be found in my Bitbucket Git repo.
Where is the problem with inheritance and equals()? It is difficult to reconcile mixed type equality comparisons where subclass adds extra key fields. Problem is Point does not know about extra field in ColorPoint, so from its perspective they are equal, but from ColorPoint they are not - broken symmetry.
To demonstrate the problem I have prepared this two examples:
  • A1EqualsAndInheritanceUntreatedCodeBreaksSymmetry. This example presents implementation of equals() based on instaceof class comparison and test it on mixed type comparisons of Point and ColorPoint. This breaks symmetry when extra field
  • A2EqualsAndInheritanceUntreatedCodeBreaksLSP. Presents implementation of equals() provided by generators from famous IDEs like Eclipse or IntelliJ Idea based on strict class comparison – this.getClass().equals(other.getClass()). While it solves problem with broken symmetry and adheres fully to equals() contract, it blocks any mixed type comparisons, even between compatible classes like Point and PointAdapter. This breaks Liskov Substitution Principle!
For most people, even experienced developers, even for me it ended here – not possible, use composition. However on recent project, Hibernate based, we had several discussions about equals() in general and more importantly we were stuck with Entities heavily dependent on inheriting common fields. I did some research, most valuable appeared to be this StackOverflow discussion which set my direction. I have conveniently prepared all distinct solutions I found to discuss them with you.

Restricted mixed type equals() solutions

Solutions for Mixed Type equals() where Point genuinely cannot be equal to ColorPoint (extra field), but can be equal to PointAdapter (no extra fields).
  • Tal Cohen's blindlyEquals solution. It is included mainly because is often cited. It is the least effective solution but it is most intuitive and therefore good to start with.
    It is named after a required helper method – blindlyEquals. This method contains what equals() normally contains but it not bound with equals() contract. The equality is then established when both parties agree that they are blindly equal to each other. It is a sort of two phase commit. This obviously ensures symmetry. The drawbacks are obvious too. The equals() is in fact called twice, all fields has to be checked twice, for every direction.
    See A3EqualsAndInheritanceTalCohenSolution.
  • Rene Smith's getEquivalenceClass solution - as a reaction to Tal Cohen. Every class specify by overriding or not overriding helper method getEquivalenceClass() what class is relevant type comparison, what is the minimal required type. For Point it is Point, for ColorPoint it is ColorPoint, so far boring routine but here it comes, for ColorPointDecorator is is still ColorPoint because it does not override the helper method. This way equality between ColorPoint and Point is blocked, symmetrically blocked, but allowed for decorator class bringing no extra key fields.
    See A4EqualsAndInheritanceReneSmitSolution.
  • Martin Odersky's canEqual solution - where canEquals() is a helper method. It is in fact just a variant of previous example. This time helper method canEquals() does the whole class type comparison inside, it specifies minimal required class type internally, returning result as boolean.
    See A5EqualsAndInheritanceMartinOderskySolution.
  • Simple Reversing solution. My own implementation, based on suggestions in Angelika Langer article. I strongly suggest this solution. It does not require any helper method as do previously mentioned solutions. It is basically as simple as adding just 2 short lines to a standard equals() implementation everyone is familiar with.
    See A6EqualsAndInheritanceSimpleReversingSolution.

    Simple Reversing solution goes in two variants. Second variant uses repeated reversion blocking flag. Then we can afford revert direction even for same classes, dropping need for that extra check. It makes some thins simpler, and perhaps is a bit more performant, but it requires helper method - flagged version of equals().
    See A6EqualsAndInheritanceSimpleReversingSolutionAlt.

Unrestricted mixed type equals() solution

Solutions for mixed type equals() where Point genuinely can be equal to ColorPoint. In this case Point is considered a point with unspecified color, that is default color, let's say it is colorless by default, and therefore it can be compared to ColorPoint.
This approach assume every key field has a specified default value. If only one field cannot be assigned a default value then types are deemed incompatible for mixed type comparison.
There can be more than one default value per field but let's not complicate it for now.

Angelika Langer's Traverse Hierarchy solution

It is the only solution of all listed here, capable of comparing classes across different inheritance branches. You can compare Point3D with ColorPoint3D with ColorPoint or with root Point. This solution is fully and unarguably and 100% Liskov Substitution Principle compliant. It is also the most complex and difficult to understand solution.
Key points of Angelica Langer's solution are:
  • Every key field has defined default value
  • Hierarchy traversal is recursive process.
  • Hierarchy is traversed from the most specific subcass to more general class, that is bottom up principle. Equals is then conducted only in one direction – guarantees symmetry.
  • Order of comparison is reversed if (more specific) subclass is detected to achieve bottom up principle.
  • Comparison can continue up in the hierarchy only when all specific fields have default values.
  • 2 helper methods are required: navigateClassHierarchy(other, direction) and compareFields(other). Equals method can be defined at the root class, subclasses can safely inherit it. Helper methods have to be implemented in every class having extra key fields.
  • Equal ensures that both compared instances are part of same class hierarchy, that is they both extend Point, and fails early when they do not. Otherwise equals() is just a kick starter of hierarchy navigation and plays no further role.
  • If compared classes are on different hierarchy branches, comparison goes up (subclass to superclass) comparing fields on the way against the defaults, until common class root is found, then order is reversed to the direction from compared object class, it goes up the hierarchy again, comparing fields against the defaults, until it reaches common root again, it continues up again, comparing fields against each other, because we are on the same base class finally; until it reaches absolute root – Point in our case. If it reached this point, all the way down, from possibly two hierarchy branch leaves, we can finally certify them as equal.
  • The flag reverseOrder prevents traversal from sliding back to the first branch again when reaching common root from second branch. This would resulting in endless recursion. Flag reverseOrder can change value just once. From false to true, or not at all.
For the dear reader's convenience I have split Angelika Langer's implementation into two files, two gradual steps. First A7EqualsAndInheritanceAngelikaLangerSolution contains full implementation with same set of test as in previous examples, for easy comparison. Attached unit tests show that now ColorPoint can be equal to Point if the point has default color (that it is colorless) and symmetry is not broken.
But to demonstrate full power of Angelika Langer's hierarchy traversing solution I have to add extra sample classes and couple of new tests. The new classes are Point3D and ColorPoint3D; Point3D extends Point directly and ColorPoint3D, as the name suggests, extends ColorPoint. Two separate hierarchy branches formed here. Despite this extra complication we are able to match them equal when common fields are equal and specific fields having default values. That is ColorPoint3D(2, 3, 0, Color.TRANSPARENT) is equal ColorPoint(2, 3, Color.TRANSPARENT) or Point(2,3) or Point3D(2,3,0).

If you had the patience...

If you had the patience to read through this you are welcome. I spent some time researching on the subject, challenging paradigms is uplifting task. I prepared all 4 solutions I was able to find, plus one of my own provenance, for easy comparison, and I am happy to share the with you.
Inheritance issue with object equals() is often overlooked problem and even if you don't use any
If you thing you know better distinct solution, don't keep it for yourself and let other know.
If you are undecided which solution to use I recommend to stick with restricted mixed type equals(). In most cases you can rather argue that ColotPoint is not the same as Point and they should never equal. And from the restricted mixed type equals() I recommend Simple Reversing solution.
It is pretty much about copy and pasting 2 short lines to your existing code.

Side Notes

  • Unit testing in my examples is by no means extensive, its main purpose is to demonstrate the presented principle and provide ready to debug code for relevant cases.
  • For demonstration purposes I include whole code for discussed case into one file. Tests are mixed with tested implementation. This is not recommended for regular unit test an real projects. It is not my normal way of unit testing!
  • Java instanceof operator checks for null automatically, explicit testing is unnecessary. Use if (!(other instanceof Point)) is NPE safe. Expression null instanceof Point returns false. Yet I commonly see unnecessary constructs like:
    if (other == null) return false;if (!(other instanceof Point)) return false;

No comments:

Post a Comment