The importance of documenting things
A few weeks ago, a colleague asked me about a piece of code I had written. Unfortunately, I couldn’t answer any of his questions. The code was something we call “self-documenting code.” There were meaningful function and variable names, I did not use any clever tricks, there were tests. I think the assertions and test names were readable and could be used to figure out what I was trying to accomplish.
But it was not enough.
The problematic code was an actor (if you do not know what it is, read about actor systems). Let me explain why it was problematic.
I used the word “synchronous” in the actor name
Actors are synchronous. An actor never performs two actions at the same time. You can create more instances of the same actor or use it to start processing something in a different thread, but one actor is doing only one thing at a time. So why did I call it “synchronous”? It did not make sense. I did not remember why I used that name.
The actor was creating a JSON object and sending a HTTP request, nothing fancy
Why did I need a separate actor implementation for something like that? What was so special about it?
It was using a non-blocking HTTP client
It did not need to wait for the response. The actor could start sending another request even if the previous had not been completed yet. That’s another reason why calling it a “synchronous actor” did not make sense.
The behavior of the actor depended on previously processed messages
In Akka (and probably other actor systems), there are become/unbecome methods that allow you to change the actor’s implementation depending on a received message. My actor behaves in two different ways:
- When it receives a message, it sends an HTTP request and switches to a state in which it only awaits the HTTP response and cannot process new messages.
- When it receives an HTTP response (or an error), processes the response and returns to the state in which it accepts new messages.
It was stateful, and it was pretending to be synchronous. It had a horrible bug that could cause problems if the server never responded. Yet, nothing in the code explained why I chose to do it in that way.
Why it was a problem?
Nothing was explaining the reasoning behind the code. It was not mentioned in the documentation. There were no comments in the source code. When the colleague asked me about it, I could not explain it, because I had already forgotten everything.
It seemed I had seriously over-engineered the code for no reason. Maybe, I was trying to prevent the actor from sending too many requests to the server at the same time.
Could the server handle more requests? Was there a real reason, or was I too careful? I have no idea.
We could assume that there is a valid reason for keeping the code and do not touch it. We could be afraid to remove it because “something may brake and it is going to be our fault.” Fortunately, we know we can restore the previous version of the service with a single click of a button.
There was no reason to keep the code, nobody knew why it is there. Hence we removed it and… nothing happened.
I am absolutely sure there was a reason to write that code when I was writing it. Now I know that we no longer need it. I hope so… Maybe one day something is going to fail because of that change.
Why you need comments?
It should not happen, I should not need to experiment with the software to check why the code was written. Decisions should be documented. Can we start writing comments instead of pretending that the self-documenting code exists?
The comments should not explain what your code is doing or how it is doing it. You have the code and its tests for that.
What I need in comments is an explanation of why I should not remove that code. If you have made a decision that affects the application’s architecture, please tell me about it in the comments. If you have tried a few different approaches before writing the code, tell me about that. So I do not waste time thinking, “maybe we should have done it differently.” If you don’t want to clutter the source code with such information, write an Architecture Decision Record in a separate text file.
Did you enjoy reading this article?
Would you like to learn more about software craft in data engineering and MLOps?
Subscribe to the newsletter or add this blog to your RSS reader (does anyone still use them?) to get a notification when I publish a new essay!
You may also like
- Data/MLOps engineer by day
- DevRel/copywriter by night
- Python and data engineering trainer
- Conference speaker
- Contributed a chapter to the book "97 Things Every Data Engineer Should Know"
- Twitter: @mikulskibartosz