Do Repeat Yourself. Avoid the Wrong Abstraction.
Software engineers trap themselves when they follow the DRY principle too closely. Sandi Metz accurately describes how the wrong abstraction is worse than having duplicate code.
I'm a big fan of Sandi Metz's blog post The Wrong Abstraction (https://www.sandimetz.com/blog/2016/1/20/the-wrong-abstraction). I strongly encourage you to read it, because it is a thoughtful challenge to DRY (Don't Repeat Yourself), the piece of software orthodoxy that broadly forbids duplicate code.
This blog post spelled out something that had bothered me for years about codebases I worked on: the code adhered strongly to DRY, but the architecture was a jumble. For example, I once worked on a codebase that closely fit the following anti-pattern Sandi Metz describes in the post:
- Programmer A sees duplication.
- Programmer A extracts duplication and gives it a name. This creates a new abstraction. It could be a new method, or perhaps even a new class.
- Programmer A replaces the duplication with the new abstraction. Ah, the code is perfect. Programmer A trots happily away.
- Time passes.
- A new requirement appears for which the current abstraction is almost perfect.
- Programmer B gets tasked to implement this requirement. Programmer B feels honor-bound to retain the existing abstraction, but since it isn't exactly the same for every case, they alter the code to take a parameter, and then add logic to conditionally do the right thing based on the value of that parameter. What was once a universal abstraction now behaves differently for different cases.
- Another new requirement arrives. Programmer X. Another additional parameter. Another new conditional. Loop until code becomes incomprehensible.
- You appear in the story about here, and your life takes a dramatic turn for the worse.
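To make the steps above concrete, here is a minimal Python sketch of what the end state can look like. Everything in it is hypothetical and invented for illustration (the `format_name` helper, its flags, and the three caller functions); it is not code from Sandi Metz's post or from the codebase I worked on. The first function is the "shared" abstraction after a few rounds of new requirements. The functions below it show the alternative she recommends: put the code back into each caller and keep only what that caller needs.

```python
# Hypothetical shared helper: each new requirement added one more flag.
def format_name(first, last, *, last_first=False, upper=False, initial_only=False):
    if initial_only:
        first = first[0] + "."
    if last_first:
        name = f"{last}, {first}"
    else:
        name = f"{first} {last}"
    if upper:
        name = name.upper()
    return name


# The same behavior re-inlined into hypothetical callers.
# Each one is duplicated, but trivially easy to read and change.
def badge_name(first, last):
    return f"{first} {last}".upper()


def directory_name(first, last):
    return f"{last}, {first}"


def citation_name(first, last):
    return f"{last}, {first[0]}."
```

The three small functions repeat a little string formatting, but changing how badges are printed can no longer break citations, and nobody has to work out which combinations of flags are actually in use.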
We had a stack of up to 5 classes in Python that, in the worst case, went "Base"->"BaseThing"->"BaseAPIThing"->"HABaseThing"->"The Actual Thing under development". The 4 base classes lived in a library of helper files that multiple "Actual Things under development" depended on.
In exchange for not duplicating code, we could not update the shared code without potentially breaking 60-70 other libraries that all listed it as a dependency, whether or not they needed everything the abstraction provided. And when a change was serious enough that we had to make it, we then had to test all of the other libraries to make sure they did not break. Making changes became incredibly expensive and difficult.
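As a rough sketch of the shape of that hierarchy: the class names below come from the chain described above, but every method and return value is a hypothetical placeholder, not the real code.

```python
# Class names follow the chain described above; all methods are
# hypothetical placeholders invented for this sketch.
class Base:
    def log(self, msg):
        return f"[log] {msg}"


class BaseThing(Base):
    def configure(self):
        return {"retries": 3}


class BaseAPIThing(BaseThing):
    def request(self, path):
        return f"GET {path}"


class HABaseThing(BaseAPIThing):
    def failover(self):
        return "switching to standby"


class ActualThing(HABaseThing):
    """The class actually under development."""

    def run(self):
        # Only log() and request() are needed here, yet configure() and
        # failover() are inherited too, along with every future change to them.
        return self.log(self.request("/status"))


# Every class a reader must understand to follow ActualThing.run().
chain = [cls.__name__ for cls in ActualThing.__mro__ if cls is not object]
print(" -> ".join(reversed(chain)))
```

Even in this toy version, a change to any of the four base classes can affect `ActualThing`, and anything else that subclasses them, whether or not it uses the method that changed.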
Sandi Metz says we should accept duplicate code because it preserves our ability to change code easily and quickly, and I now agree with her 100%. Choosing the wrong abstraction is much worse than having extra code here and there that you might have to change later. I think waiting until the third, fourth, or fifth time you duplicate code is a better way to make sure you are making the right abstraction.