Misuse of polymorphism

Early use of C++ led to a frequent anti-pattern: monstrous class hierarchies that could be understood only after being diagrammed in UML. They were over-engineered, hard to use, and hard to modify. We found ourselves forced to extend concrete classes by cutting and pasting the same boilerplate examples. Yet the justification for this mess was "code reuse."

It took about a decade, and better examples in Java, for almost everyone to discover that containment and interfaces lead to much better code reuse than concrete inheritance.

But I still see the spaghetti UML anti-pattern often enough that I think it is worth identifying the cause and how it might be avoided.

Many students first learn two things about object-oriented design: objects allow inheritance, and objects allow greater code reuse. It is easy to imagine that one directly leads to the other. Polymorphism takes a bit longer to grasp.

Code, in its essence, is a collection of data-structures and operations on those data-structures. It is tempting to think of objects as data-structures that also contain their supported operations. In that case, you would reuse operations by inheriting them from another class. This "mix-in" pattern can easily lead to bloated classes with many methods implemented just in case. To reduce the methods introduced by any single class, hierarchies grow taller and wider, in the hope that intermediate classes will be useful by themselves.

But polymorphism is about more than inheritance.

Polymorphism allows operations to apply to a wide variety of data-structures using only the features those data-structures have in common. Algorithms and services have classes independent of the data implementation. Data-structures only need to expose a minimal number of methods required by those external services. (Functional programmers call such methods the object "protocol.") These methods can be captured in a pure abstract base class, i.e., an interface.
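As a minimal sketch of this idea in C++, assume a hypothetical service that averages a sequence of numbers. The names Samples, mean, StoredSamples, and RampSamples are invented for illustration. The service depends only on a two-method pure abstract base class, and the two data-structures share no implementation at all.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical interface: the only features the service needs.
class Samples {
public:
    virtual ~Samples() = default;
    virtual std::size_t size() const = 0;
    virtual double at(std::size_t i) const = 0;
};

// The service: it knows nothing about how the samples are stored.
double mean(const Samples& s) {
    assert(s.size() > 0 && "mean of an empty sequence is undefined");
    double sum = 0.0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        sum += s.at(i);
    }
    return sum / static_cast<double>(s.size());
}

// One data-structure: values held in memory.
class StoredSamples final : public Samples {
public:
    explicit StoredSamples(std::vector<double> values)
        : values_(std::move(values)) {}
    std::size_t size() const override { return values_.size(); }
    double at(std::size_t i) const override { return values_[i]; }
private:
    std::vector<double> values_;
};

// Another data-structure: a linear ramp computed on demand, with no storage.
class RampSamples final : public Samples {
public:
    RampSamples(double first, double step, std::size_t count)
        : first_(first), step_(step), count_(count) {}
    std::size_t size() const override { return count_; }
    double at(std::size_t i) const override {
        return first_ + step_ * static_cast<double>(i);
    }
private:
    double first_;
    double step_;
    std::size_t count_;
};
```

Neither data-structure inherits anything from the other, and the hierarchy is exactly one level deep. A third implementation could be added later without touching mean.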

So, pure polymorphism only requires the use of interfaces. Polymorphism alone should not encourage a class hierarchy of any depth at all.

If we want to reuse a particular implementation of a data-structure, then inheritance is a more reasonable solution, though object containment still provides better encapsulation. If the goal is reuse of behavior, then no client should be forced to reuse the data-structures as well.
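The difference in encapsulation can be sketched with a small stack built on std::vector. IntStack and LeakyStack are hypothetical names chosen for this illustration. Both reuse the same implementation, but only the containing version keeps the vector's other operations away from clients.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reuse by containment: the vector is a private detail.
class IntStack {
public:
    void push(int value) { items_.push_back(value); }
    int pop() {
        assert(!items_.empty() && "pop on an empty stack");
        const int top = items_.back();
        items_.pop_back();
        return top;
    }
    bool empty() const { return items_.empty(); }
    std::size_t size() const { return items_.size(); }
private:
    std::vector<int> items_;  // reused implementation, hidden from clients
};

// Reuse by public inheritance: every vector operation (insert, erase,
// operator[], ...) becomes part of the stack's public interface.
class LeakyStack : public std::vector<int> {
public:
    void push(int value) { push_back(value); }
    int pop() {
        assert(!empty() && "pop on an empty stack");
        const int top = back();
        pop_back();
        return top;
    }
};
```

A client of LeakyStack can call insert at the bottom of the stack and silently break the last-in, first-out discipline. A client of IntStack has no way to do so, and the contained vector could be swapped for another container without changing any client code.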

A rigid inherited hierarchy forces a client to create and manipulate objects in a certain way. A derived class must remain aware of everything inherited; otherwise, internal data-structures may be left in an improper state. The copying of boilerplate template code is the telltale symptom of this problem. With proper polymorphism, a client can implement a minimal orthogonal interface, and the service will use that interface in the appropriate way.
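To make the contrast concrete, here is a hedged sketch built around a hypothetical ledger that caches a running total. All class names (Ledger, RoundedLedger, AmountPolicy, RoundToWhole, PolicyLedger) are invented for this example. The first half shows a derived class that overrides one method without knowing about inherited state; the second half shows the same customization expressed through a one-method interface.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Rigid hierarchy: the base class caches a total that every override of
// add() is silently expected to maintain.
class Ledger {
public:
    virtual ~Ledger() = default;
    virtual void add(double amount) {
        entries_.push_back(amount);
        total_ += amount;  // invariant: total_ equals the sum of entries_
    }
    double total() const { return total_; }
    std::size_t count() const { return entries_.size(); }
protected:
    std::vector<double> entries_;
    double total_ = 0.0;
};

// A derived class that only wants to round amounts. Its author did not
// know about total_, so the cached total is left stale.
class RoundedLedger : public Ledger {
public:
    void add(double amount) override {
        entries_.push_back(std::round(amount));
    }
};

// Alternative: the client implements a minimal interface...
class AmountPolicy {
public:
    virtual ~AmountPolicy() = default;
    virtual double adjust(double amount) const = 0;
};

class RoundToWhole final : public AmountPolicy {
public:
    double adjust(double amount) const override { return std::round(amount); }
};

// ...and the service owns all of its own state and invariants.
class PolicyLedger {
public:
    explicit PolicyLedger(const AmountPolicy& policy) : policy_(policy) {}
    void add(double amount) {
        const double adjusted = policy_.adjust(amount);
        assert(std::isfinite(adjusted) && "policy must return a finite amount");
        entries_.push_back(adjusted);
        total_ += adjusted;
    }
    double total() const { return total_; }
    std::size_t count() const { return entries_.size(); }
private:
    const AmountPolicy& policy_;  // must outlive this ledger
    std::vector<double> entries_;
    double total_ = 0.0;
};
```

After two calls to add, a RoundedLedger holds two entries but still reports a total of zero. The usual repair is to copy the base class's bookkeeping into the override, which is exactly the boilerplate described above. With PolicyLedger, the client cannot corrupt the total because it never sees it.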

Bill Harlan, February 2009
