Primitive Obsession
Primitive obsession is a fairly common code smell. It happens when you use primitive data types, such as integers, strings, or arrays, to represent concepts from your domain, e.g. using a float to represent money.
The fix is seemingly simple: create a new class to represent the concept. James Shore supports this idea. Even if you end up creating small classes, the argument goes, they will eventually grow.
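To make the money example concrete, here is a minimal sketch in Python. The Money class, its fields, and the choice of Decimal are assumptions made for illustration; they are not taken from any of the authors cited in this post.

```python
from dataclasses import dataclass
from decimal import Decimal

# The smell: a bare float stands in for money.
# Binary floats cannot represent 0.10 exactly, so the sum drifts.
price_as_float = 0.10 + 0.20  # not exactly 0.30


@dataclass(frozen=True)
class Money:
    """Hypothetical value object: an amount paired with its currency."""
    amount: Decimal
    currency: str

    def __add__(self, other: "Money") -> "Money":
        if not isinstance(other, Money):
            return NotImplemented
        # The currency rule lives here, not at every call site.
        if self.currency != other.currency:
            raise ValueError("cannot add amounts in different currencies")
        return Money(self.amount + other.amount, self.currency)


total = Money(Decimal("0.10"), "USD") + Money(Decimal("0.20"), "USD")
```

Whether a class this small is worth having is exactly the judgment call discussed next.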
I believe creating classes on the assumption that they will grow is just BDUF (Big Design Up Front). Dealing with primitive obsession is much more than just creating new classes. It is based on:
- a deep comprehension of your domain, so that you create meaningful classes that represent key concepts and make the code more communicative;
- removing duplication and increasing cohesion, leading to more expressive and maintainable code.
J.B. Rainsberger (jbrains) shared his thoughts in this post and in this great video with Corey Haines. It is only 14 minutes long and really worth watching.
I like jbrains’ thinking about complex ideas and contexts. These concepts relate, respectively, to creating meaningful classes and to the domain understanding mentioned above.
Jeff Atwood also takes complexity into account when coping with primitive obsession. He suggests: “If your data type is sufficiently complex, write a class to represent it.”
Duplicate code and low cohesion go hand in hand. Low cohesion spreads related ideas throughout the codebase, which leads to duplicate code to handle those values. These symptoms generally come from overused primitive types, and they signal that a new class is begging to be created.
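As a rough illustration of how duplication and low cohesion show up together, consider the sketch below. The e-mail scenario and every name in it (register_user, invite_friend, EmailAddress) are hypothetical, and the validation rule is deliberately naive.

```python
# Smell: the rules for a "plain string" e-mail are re-implemented
# wherever the value is used.
def register_user(email: str) -> str:
    if "@" not in email:
        raise ValueError("invalid e-mail")
    return email.strip().lower()


def invite_friend(email: str) -> str:
    if "@" not in email:              # duplicated check
        raise ValueError("invalid e-mail")
    return email.strip().lower()      # duplicated normalization


# One possible fix: the rules live in a single, cohesive class.
class EmailAddress:
    def __init__(self, raw: str) -> None:
        if "@" not in raw:
            raise ValueError("invalid e-mail")
        self.value = raw.strip().lower()
```

Once the class exists, both functions could accept an EmailAddress and drop their own checks.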
In practice, I would not create an Age class for user profiles in video rental software. Using a primitive integer in this context works just fine. On the other hand, I would probably create one for biology laboratory software, in which an age measured in months or even days can have a great impact on the system.
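For the laboratory case, an Age class might look something like the sketch below. Storing the age in whole days and offering a weeks conversion are assumptions made here for illustration; a real system would take its units and rules from the domain experts.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Age:
    """Hypothetical age of a specimen, stored in whole days."""
    days: int

    def __post_init__(self) -> None:
        # A negative age is meaningless, so reject it at construction.
        if self.days < 0:
            raise ValueError("age cannot be negative")

    @classmethod
    def from_weeks(cls, weeks: int) -> "Age":
        return cls(weeks * 7)

    @property
    def whole_weeks(self) -> int:
        return self.days // 7


# Comparisons are unit-safe because every Age is normalized to days.
is_younger = Age(10) < Age.from_weeks(2)
```

In the video rental case none of this behavior is needed, which is why a plain integer is enough there.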
A class can be worthwhile even if a concept is not that essential to a domain. Say we want to store the location of restaurants for an online booking service. I would be tempted to create a GeoLocation class, because [lat, lon] pairs have very low cohesion, are unexpressive, and are complex enough to deserve a class of their own. This avoids duplication and provides better encapsulation and a slimmer API.
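A minimal sketch of such a GeoLocation class follows. The field names, the range checks, and the distance_km method are assumptions added for illustration; the distance uses the haversine formula on a spherical Earth with a mean radius of 6371 km, which is an approximation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class GeoLocation:
    """Hypothetical value object wrapping a latitude/longitude pair in degrees."""
    lat: float
    lon: float

    def __post_init__(self) -> None:
        # Validation lives in one place instead of at every call site.
        if not -90.0 <= self.lat <= 90.0:
            raise ValueError(f"latitude out of range: {self.lat}")
        if not -180.0 <= self.lon <= 180.0:
            raise ValueError(f"longitude out of range: {self.lon}")

    def distance_km(self, other: "GeoLocation") -> float:
        """Great-circle distance via the haversine formula (spherical Earth)."""
        earth_radius_km = 6371.0
        phi1, phi2 = math.radians(self.lat), math.radians(other.lat)
        dphi = phi2 - phi1
        dlmb = math.radians(other.lon - self.lon)
        a = (math.sin(dphi / 2) ** 2
             + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
        # Clamp to guard against rounding pushing the value above 1.
        return 2 * earth_radius_km * math.asin(math.sqrt(min(1.0, a)))
```

Callers now pass a single GeoLocation around instead of a loose [lat, lon] pair, so they cannot swap the two values by accident after construction, and the distance logic is written once.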