How to debug code
Debugging is difficult. Your code doesn’t work correctly. You go through the instructions step by step while observing the variables and trying to spot the place where you get an incorrect value. Perhaps you don’t even know which values are correct. After all, if you knew, you would have written automated tests instead of meandering through the code.
Can we make debugging easier?
The best approach: automated testing
If you can, write automated tests for your code. I say, “if you can,” because your code may be an untestable mess, or you may not know what you expect to see.
You can always refactor the mess and make the code structure more test-friendly. I have written lots of articles about it. Take a look at:
- How to add tests to existing code in data transformation pipelines
- The secret of working with legacy code on a software team
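To make "test-friendly" a bit more concrete, here is a minimal sketch of the kind of refactoring I mean. The function names and the deadline example are hypothetical, made up only for illustration. The first version reads the clock and prints the result, so a test has nothing to assert on. The second version receives the current date as a parameter and returns a value, so a test can pin both the input and the expected output.

```python
from datetime import date


# Hard to test: depends on the real clock and prints instead of returning.
def print_days_until_deadline(deadline: date) -> None:
    print((deadline - date.today()).days)


# Easier to test: "today" is an explicit input and the result is returned.
# (Hypothetical example function, not taken from any real project.)
def days_until_deadline(deadline: date, today: date) -> int:
    return (deadline - today).days


def test_days_until_deadline() -> None:
    today = date(2024, 3, 1)
    assert days_until_deadline(date(2024, 3, 10), today) == 9
    assert days_until_deadline(date(2024, 3, 1), today) == 0
    # A deadline in the past gives a negative number of days.
    assert days_until_deadline(date(2024, 2, 27), today) == -3


if __name__ == "__main__":
    test_days_until_deadline()
    print("all checks passed")
```

The test function follows the naming convention that pytest discovers, but the sketch also runs as a plain script.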
“I don’t know what I expect to see, but I will recognize it when I get it” is probably the worst situation you can get into as a programmer. How will you proceed when you don’t know the desired outcome?
Take a step back. Read the task description, documentation, your notes, emails, and everything that may help you figure out what you are trying to accomplish. Ask around. Someone must know the information you need. Don’t count on having a pleasant conversation. It’s difficult to admit a lack of knowledge when you have been coding for a long time without learning anything about the business domain. But you are to blame. You should have asked your questions earlier.
The second best approach to debugging
What if you know more or less what you expect but don’t understand interactions between modules? You are stuck with a debugger. You have no way to write useful tests. How do you solve the problem?
Are you familiar with the etymology of the term “debugging” in programming? If not, please read the Wikipedia article first. If you are familiar, you probably remember the famous piece of paper with a moth taped to it. What about the rest of the page? Do you remember what they were doing?
They wrote down everything they were doing, when they did it, and what happened next. We should mimic that practice!
Every time you debug, you should take a piece of paper or open a text editor and write down the date and what you are trying to accomplish.
Now, start debugging.
Write down the time and the input parameters you use. Can you explain why you use those parameters and what you expect to see? Document your assumptions. If you spend many hours debugging without solving the issue, you will need to question your assumptions, and by then you won’t remember them anymore. That’s why you need a log of your assumptions, especially the “obvious” ones: the ones you think you don’t need to write down because everyone knows them.
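A piece of paper or a plain text file is enough for this. If you prefer to keep the log next to your code, a tiny helper can do the timestamping for you. The sketch below is only one possible shape of such a log. The class name, the entry kinds, and the example values (customer id, month, the refund assumption) are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class DebugLog:
    """A timestamped debugging log: one goal, many short entries."""

    goal: str
    started: datetime = field(default_factory=datetime.now)
    entries: List[str] = field(default_factory=list)

    def note(self, kind: str, text: str) -> None:
        # kind is a free-form label such as "input", "assumption", "observed".
        timestamp = datetime.now().strftime("%H:%M")
        self.entries.append(f"{timestamp} [{kind}] {text}")

    def render(self) -> str:
        # First line: the date and the goal. Then one line per entry.
        header = f"{self.started:%Y-%m-%d} GOAL: {self.goal}"
        return "\n".join([header, *self.entries])


if __name__ == "__main__":
    # Hypothetical debugging session, for illustration only.
    log = DebugLog(goal="Find out why the monthly total is too high")
    log.note("input", "customer_id=42, month='2024-02'")
    log.note("assumption", "Refunds are stored as negative amounts")
    log.note("expected", "The total is lower than the sum of all purchases")
    print(log.render())
```

Note that the "obvious" assumption about refunds gets its own line. That is the kind of entry you will want to reread after a few hours without progress.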
Go through the code in the debugger and write down what you see and whether it’s something you expect to see. When you see something odd and have a hypothesis regarding the required code change, stop. Don’t make the change yet. First, think about automating the check. Can you write a test? You know the input values. You know your assumptions and the results you expect. In addition, you more or less know what part of the code you must edit. Can you make this part testable and write the test? Give it a try. Automated tests are way better than manual debugging.
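Here is a minimal sketch of turning a log entry into an automated check, assuming a made-up scenario: the log contains the input amounts in cents, the assumption that refunds are negative, and the expected total. Both functions are hypothetical. The "before" version stands for the code with the suspected bug, and the "after" version for the code with the change applied. The check is written first and run against both.

```python
from typing import Callable, List

# Copied from the (hypothetical) debugging log.
LOGGED_INPUT: List[int] = [10_000, -3_000, 5_000]  # amounts in cents
EXPECTED_TOTAL = 12_000  # refunds should reduce the total


def monthly_total_before(amounts: List[int]) -> int:
    # Suspected bug: abs() turns refunds into income.
    return sum(abs(amount) for amount in amounts)


def monthly_total_after(amounts: List[int]) -> int:
    # Proposed change: keep the sign of every amount.
    return sum(amounts)


def refunds_reduce_total(total: Callable[[List[int]], int]) -> bool:
    """The check derived from the log: logged input, expected result."""
    return total(LOGGED_INPUT) == EXPECTED_TOTAL


if __name__ == "__main__":
    # The check fails on the old code, which confirms the hypothesis...
    assert not refunds_reduce_total(monthly_total_before)
    # ...and passes after the change.
    assert refunds_reduce_total(monthly_total_after)
    print("hypothesis confirmed, fix verified")
```

Seeing the check fail on the old code matters as much as seeing it pass on the new code. A check that passes in both cases tells you nothing about your hypothesis.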
Now, change the code and see what happens. You can look at the test results or debug again.
If you haven’t solved the problem, write down the results, think about the next step, and repeat the process.
Why is a written log important while debugging?
When you have a written log, you avoid running in circles. You won’t try the same thing twice because you can see you have already done it. If you have noted your assumptions, you can show the log to someone else and ask whether you were right. Without the log, you will forget what you checked, which input parameters you used, and what results you saw. Even worse, without the written log, you may have a “false memory.” You may recall something that never happened because you tried so many cases that you got confused.
What should you do with the log after fixing the problem? I usually add the log as a comment in the Jira ticket. Of course, nobody will ever look at it again. But storing the record feels better than removing the file or throwing the paper into a rubbish bin.
- Data/MLOps engineer by day
- DevRel/copywriter by night
- Python and data engineering trainer
- Conference speaker
- Contributed a chapter to the book "97 Things Every Data Engineer Should Know"
- Twitter: @mikulskibartosz