How writing can improve your programming skills
Do you want to be a better programmer? You should write more. I’m not talking about writing code. Instead of code, you should write more texts for humans. It doesn’t matter what you write, as long as the text is long enough to cover one idea and the required argumentation. Let me tell you why writing makes you a better programmer. In this text, I’ll focus on the similarities between programming and writing non-fiction, although, in my opinion, fiction writers follow similar rules.
Writing is rewriting
Every news article, blog post, or book is edited multiple times before the reader has a chance to see the content. The first draft is never published. Somehow, programmers tend to create a pull request as soon as the code more-or-less works as expected. Such a willingness to gather early feedback is noteworthy. Still, posting the pull request is not the end of the work!
Don’t send the pull request to your colleagues right away! Not yet! You should be the first reviewer of your code. Read it. Read it to find parts to simplify and clarify.
Getting the code to execute and work was the first part. Everyone who has access to StackOverflow will eventually produce working code. Working code isn’t enough. At some point, every useful program must be modified. That’s why we call it software. Code is never done. Being modified is the purpose of code! Because of that, the result must be easy to understand and change. People read the code!
Non-fiction writers may have diverse goals. They may want to sell you a product or persuade you of their opinion. Non-fiction authors may wish to teach you something or share their point of view. Our job is easier. We should have only one goal - helping others understand and change the code.
What should you do after publishing a pull request? Take a short break from looking at your code. (You can review other people’s PRs during the break.) After the break, take another look at the pull request. Does it still make sense? Can you simplify or clarify something? Is it at the right level of abstraction? Is it clear how to use the functions you created? Can someone misuse them without noticing the problem?
The purpose of code review
After the self-review and the first rewriting, you can send the PR to someone else. Why would they review the code? What is the goal?
Recently, I saw a bizarre opinion: “If you hire only good senior programmers, you don’t need code reviews anymore because nobody needs to check whether they made a mistake.” So many things are wrong with this statement, but let’s focus on one. Checking whether the code works isn’t the reason why we do code reviews!
A code review isn’t a seal of approval saying, “I agree to have this code in my repository.” A reviewer isn’t a teacher checking your homework. An approved PR isn’t a passing grade! Code reviews are not about gatekeeping or enforcing your perspective on the “proper” way to code.
We review the code for two reasons. First, to check whether another person can understand the code written by the author. It is an opportunity to ask for clarification. Some of the answers may end up in the code, either as a comment, improved naming, or a more explicit structure. The second reason is knowledge sharing. A good code review is a tremendous learning opportunity for everyone: the author, the reviewer, and everyone who reads the PR later.
Of course, in open source projects, code reviewers must also verify whether the author adds a backdoor to the software or makes any harmful changes. However, those are extra tasks, not the only purpose of code reviews. The primary goal remains the same - making sure that others can work with the new code.
In the case of programming, we have a safety net - automated testing. If I make a breaking change in the code, I will notice it when some tests fail. I can’t run tests for my articles. Unfortunately, no automation will tell me whether I make a compelling argument, use a fitting analogy, or preserve the train of thought.
Therefore, we have no excuses when programming. Making the code more reader-friendly may be time-consuming, but it isn’t difficult, unless you are one of the programmers who don’t write tests before the implementation. In this case, changing the code without changing the tests at the same time may be impossible. It happens because you (probably) overmocked everything. Your tests check whether you call five methods with the correct arguments instead of verifying the observable behavior.
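A minimal sketch of the difference, assuming a hypothetical `issue_invoice` function and an in-memory store (both names are made up for illustration). The first test pins down how the function talks to its collaborator, so it breaks when the calls are rearranged. The second one checks only what ends up stored.

```python
from unittest.mock import Mock


class InMemoryInvoiceStore:
    """Hypothetical stand-in for a database table."""

    def __init__(self):
        self.saved = []

    def save(self, invoice):
        self.saved.append(invoice)


def issue_invoice(store, net_cents, tax_percent):
    """Hypothetical feature: compute the gross amount and persist the invoice."""
    gross_cents = net_cents + net_cents * tax_percent // 100
    invoice = {"net": net_cents, "gross": gross_cents}
    store.save(invoice)
    return invoice


def test_overmocked():
    # Brittle: asserts HOW the function calls its collaborator.
    store = Mock()
    issue_invoice(store, 10000, 23)
    store.save.assert_called_once_with({"net": 10000, "gross": 12300})


def test_observable_behavior():
    # Robust: asserts WHAT is stored, no matter which calls produced it.
    store = InMemoryInvoiceStore()
    issue_invoice(store, 10000, 23)
    assert store.saved == [{"net": 10000, "gross": 12300}]
```

If `issue_invoice` later saved the invoice through a different method or in two steps, only the first test would need rewriting, even though the behavior stayed the same.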
In writing, we have two ways of explaining complex ideas. Both of them depend on the abstraction ladder. The only difference is whether we go up or down the ladder.
The abstraction ladder is a technique of teaching a topic by moving step by step between the general idea and its details. We can start with general statements and analogies, then explain the same thing using more technical language, and go even deeper by explaining the technical concepts. The other method starts from first principles and gradually derives the concepts by going up from the details to the general idea.
The same thing happens in programming. When we implement a feature, we start with a general function called “do whatever the feature does.” For example, we want to issue an invoice. In the function body, we go one level deeper and explain the steps of the business process. In our example, this is the place where we calculate the taxes, validate the tax id numbers, and fill in the contact information. Afterward, we code the first technical layer, which implements the steps of a feature. In our invoicing application, this layer coordinates retrieving data from databases, sending events, or passing arguments to a PDF templating library. One level deeper, we create a layer handling the connections with databases, message queues, etc.
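The invoicing ladder could look roughly like the sketch below. Every name here is hypothetical, the ten-digit tax id rule is an assumption made only to have something to validate, and an in-memory class stands in for real database and queue connections.

```python
from dataclasses import dataclass, field


# Top of the ladder: the feature itself.
def issue_invoice(gateway, tax_id, net_cents, tax_percent):
    validate_tax_id(tax_id)
    invoice = build_invoice(tax_id, net_cents, tax_percent)
    deliver_invoice(gateway, invoice)
    return invoice


# One level deeper: the steps of the business process.
def validate_tax_id(tax_id):
    # Assumed rule for illustration: exactly ten digits.
    if not (tax_id.isdigit() and len(tax_id) == 10):
        raise ValueError(f"invalid tax id: {tax_id}")


def calculate_tax(net_cents, tax_percent):
    return net_cents * tax_percent // 100


def build_invoice(tax_id, net_cents, tax_percent):
    tax_cents = calculate_tax(net_cents, tax_percent)
    return {"tax_id": tax_id, "net": net_cents, "tax": tax_cents}


# Technical layer: coordinates storage and events.
def deliver_invoice(gateway, invoice):
    gateway.insert(invoice)
    gateway.publish({"type": "invoice_issued", "tax_id": invoice["tax_id"]})


# Deepest layer: stand-in for database and message queue connections.
@dataclass
class InMemoryGateway:
    rows: list = field(default_factory=list)
    events: list = field(default_factory=list)

    def insert(self, row):
        self.rows.append(row)

    def publish(self, event):
        self.events.append(event)
```

Reading the file from top to bottom walks down the ladder: each function speaks only in terms of the level directly below it.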
What happens when the author mixes abstraction levels? As soon as the programmer decides to put an event sending code in the middle of the business logic, the abstraction ladder collapses. It doesn’t matter whether it is one line of code. We should not mix abstraction levels even if adding another level of abstraction introduces a pass-through function.
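As a rough illustration, here is a pass-through function that exists only to keep the business logic at one level. The `EventBus` class, the topic name, and the payload format are all hypothetical.

```python
class EventBus:
    """Hypothetical messaging layer; a real one would wrap a queue client."""

    def __init__(self):
        self.sent = []

    def send(self, topic, payload):
        self.sent.append((topic, payload))


def announce_invoice_issued(bus, invoice_number):
    # Pass-through on purpose: gives the low-level call a business-level name.
    bus.send("invoices", {"event": "issued", "number": invoice_number})


def close_billing_period(bus, invoice_numbers):
    # Business logic: no topic names or payload formats leak into this level.
    for number in invoice_numbers:
        announce_invoice_issued(bus, number)
    return len(invoice_numbers)
```

Calling `bus.send(...)` directly inside `close_billing_period` would save three lines, but the reader would have to switch from business terms to messaging details in the middle of the function.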
Testing the abstraction ladder
Writing about the abstraction ladder allows me to share a hint about testing. How do we test such code? I suggest writing two kinds of tests.
The first testing approach goes through all abstraction layers. We start by calling the feature function and verify the observable behavior in the deepest abstraction layer, such as values in the databases, messages sent to a message queue, etc. Of course, we can mock those external resources to speed up the tests.
Some call this end-to-end testing. Some say it is integration testing. Choose whatever name you like. What I suggest, however, is defining such tests using Behavior Driven Development.
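A sketch of such a test in a Given/When/Then shape, written in plain Python without any BDD library. The fakes and the `issue_invoice` function are hypothetical and compress the whole ladder into a few lines so the example stays self-contained.

```python
class FakeDatabase:
    """Replaces the real database so the test stays fast."""

    def __init__(self):
        self.invoices = []


class FakeQueue:
    """Replaces the real message queue."""

    def __init__(self):
        self.messages = []


def issue_invoice(database, queue, tax_id, net_cents, tax_percent):
    """Hypothetical feature function standing in for all abstraction layers."""
    invoice = {
        "tax_id": tax_id,
        "net": net_cents,
        "tax": net_cents * tax_percent // 100,
    }
    database.invoices.append(invoice)
    queue.messages.append(("invoice_issued", tax_id))
    return invoice


def test_issuing_an_invoice_stores_it_and_notifies():
    # Given empty external resources
    database, queue = FakeDatabase(), FakeQueue()

    # When the feature is called through its top-level function
    issue_invoice(database, queue, "1234567890", 10000, 23)

    # Then the deepest layer holds the observable result
    assert database.invoices == [
        {"tax_id": "1234567890", "net": 10000, "tax": 2300}
    ]
    assert queue.messages == [("invoice_issued", "1234567890")]
```

The test never mentions the intermediate functions, so they can be renamed, split, or merged without touching it.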
The second kind of test goes only one level deeper in the abstraction ladder. I recommend not overdoing such tests because they lead to overmocking and prevent any refactoring. It is a supplementary testing method. We use it when it is hard to test a part of the code using end-to-end testing. Such tests are helpful when we extract a complicated function, and testing it through all abstraction layers is tedious.
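For instance, an extracted tax calculation with several branches is easier to check directly than through the whole feature. The function, the categories, and the rates below are hypothetical.

```python
# Hypothetical tax rates, in percent, per product category.
TAX_PERCENT = {"standard": 23, "reduced": 8, "exempt": 0}


def calculate_tax(net_cents, category):
    """Extracted function with enough cases to deserve its own test."""
    if net_cents < 0:
        raise ValueError("net amount must not be negative")
    return net_cents * TAX_PERCENT[category] // 100


def test_calculate_tax_directly():
    cases = [
        (10000, "standard", 2300),
        (10000, "reduced", 800),
        (10000, "exempt", 0),
        (0, "standard", 0),
    ]
    for net_cents, category, expected in cases:
        assert calculate_tax(net_cents, category) == expected
```

The test calls the function with plain values and needs no mocks, which keeps it from getting in the way of refactoring.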
Is this unit testing? I don’t like the name unit testing because programmers have a hard time understanding the meaning of the word “unit.” Unfortunately, for many, it means testing a single method/function.
For me, “unit” has two meanings. Unit means that I can run this test in isolation as a single unit, and my test verifies only one unit of behavior. What is a unit of behavior? If we write the use case down in the form of a tree data structure where every non-leaf node is a conditional statement - a unit of behavior is a single path from the tree’s root to a leaf node. We can write tests for units of behavior going through all abstraction ladder levels or going only one level deeper.
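To make the tree idea concrete, here is a small sketch that lists every root-to-leaf path of a made-up invoicing use case. The tuple encoding of the tree and the conditions are assumptions chosen for illustration.

```python
# Non-leaf node: (condition, branch_if_true, branch_if_false).
# Leaf node: a string describing the outcome.
use_case = (
    "tax id is valid",
    ("buyer is tax exempt", "issue invoice without tax", "issue invoice with tax"),
    "reject the request",
)


def units_of_behavior(node, path=()):
    """Yield every path from the root to a leaf as a tuple of steps."""
    if isinstance(node, str):
        yield path + (node,)
        return
    condition, if_true, if_false = node
    yield from units_of_behavior(if_true, path + (condition + ": yes",))
    yield from units_of_behavior(if_false, path + (condition + ": no",))


for unit in units_of_behavior(use_case):
    print(" -> ".join(unit))
```

This tree has three leaves, so it has three units of behavior, and each of them is a candidate for one test.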
Many non-fiction styles exist
Programmers like consistent style. It is both a curse and a virtue. However, you don’t need to write all code in the same style. You don’t need consistent style even within a single application!
A common problem with keeping the style consistent even when it makes no sense is the MVC-style application, where every use case requires going through all layers: the model, the view, and the controller. In some cases, the controller is useless, usually in read-only use cases where we retrieve a single value by id. It is just pass-through code sending all of the received arguments one level further to the database-access code. The code is redundant, but it exists. It exists because we write an MVC-style application, so we think we need all three layers every time.
Not true! If some code is useless and exists only to keep the style consistent, we can remove it and use the deeper layer directly. It looks like breaking a law, a bad practice. Well, I admit it is dangerous, and it may quickly turn into bad practice. As soon as we start putting business logic into the view layer in MVC-style apps, we should extract the code to a controller. I told you to avoid mixing abstraction levels, but it doesn’t mean you can’t skip redundant ones.
Does creating the controller prepare the code for future modifications? Maybe it does, but only if this part of the code ever gets modified. Programmers are terrible at predicting business needs. Usually, we waste a lot of time generalizing code to prepare it for some imaginary future requirements that never come. Programmers committed many, many code crimes in the name of generalizing the code and being ready for the future.
We don’t need pass-through code. Most IDEs have a keyboard shortcut for an “extract method” feature if the need ever arises. You can create the pass-through layer in three seconds when you need it.
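A sketch of what skipping the redundant layer could look like for a read-only lookup. The repository, the controller, and the rendering functions are hypothetical, and a dict stands in for the database.

```python
class InvoiceRepository:
    """Hypothetical database-access layer backed by a dict."""

    def __init__(self, rows):
        self._rows = rows

    def find_by_id(self, invoice_id):
        return self._rows[invoice_id]


class InvoiceLookupController:
    """Redundant layer: it only forwards its argument one level deeper."""

    def __init__(self, repository):
        self._repository = repository

    def get(self, invoice_id):
        return self._repository.find_by_id(invoice_id)


def render_invoice_via_controller(controller, invoice_id):
    invoice = controller.get(invoice_id)
    return f"Invoice {invoice_id}: {invoice['total']}"


def render_invoice(repository, invoice_id):
    # Same result with one layer fewer: the view asks the repository directly.
    invoice = repository.find_by_id(invoice_id)
    return f"Invoice {invoice_id}: {invoice['total']}"
```

Both rendering functions return the same text. Once the lookup needs any business rule, such as a permission check, that rule belongs in a controller and not in `render_invoice`.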
Many literary forms exist because we need styles suitable for different occasions. Writing every application, every module, every file in the same style is like writing an essay whenever you want to communicate with someone. Sometimes a short email is sufficient, or a tweet, or just three words.
- Data/MLOps engineer by day
- DevRel/copywriter by night
- Python and data engineering trainer
- Conference speaker
- Contributed a chapter to the book "97 Things Every Data Engineer Should Know"
- Twitter: @mikulskibartosz