Precision vs. recall - explanation

Why couldn’t I remember the difference between precision and recall?

First of all, I had a problem with the confusion matrix. The order of cells in textbooks differs from the order in the output of the tools we use, and that was particularly problematic.

What is a column? Is it the actual value or the predicted value? Is the true positive in the upper left cell or the lower right? I knew that both precision and recall are just ratios between the confusion matrix cells, but which ones? It was kind of confusing ;)


Fortunately, it is easy to understand precision and recall. Note I wrote: “understand.” I care about the intuition and understanding, not the calculation. You can always google the equation or just use your favourite tool to calculate that.
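For reference, here is a minimal sketch of those ratios. It sidesteps the "which cell is where" problem by counting the four cells by name instead of by position. The function names and the sample labels are my own, made up for illustration. Returning 0.0 when a ratio is undefined is just a convention chosen for this sketch.

```python
def confusion_cells(actual, predicted):
    """Count the four confusion-matrix cells by name instead of by position."""
    cells = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for is_positive, said_positive in zip(actual, predicted):
        if said_positive:
            cells["tp" if is_positive else "fp"] += 1
        else:
            cells["fn" if is_positive else "tn"] += 1
    return cells


def precision(cells):
    # Of everything reported as positive, how much really was positive?
    reported = cells["tp"] + cells["fp"]
    return cells["tp"] / reported if reported else 0.0


def recall(cells):
    # Of everything that really was positive, how much did we report?
    existing = cells["tp"] + cells["fn"]
    return cells["tp"] / existing if existing else 0.0


# Made-up labels: True means "positive".
actual = [True, True, True, False, False, False]
predicted = [True, True, False, True, False, False]

cells = confusion_cells(actual, predicted)
print(cells, precision(cells), recall(cells))
```

Precision looks only at what the classifier reported. Recall looks only at what really exists. The true negatives appear in neither ratio.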

Imagine that a radar is a classifier. A classifier that classifies a point in space and returns one of two verdicts: “aircraft” or “empty.”

Let’s assume that we work for the army and we are supposed to build a radar (a classifier) to detect those airplanes.

Aircraft to be detected by the classifier

If we build a classifier that finds all airplanes and never classifies an empty point as an aircraft, we get the following output.

It is a perfect classifier: precision = 1, recall = 1. We have found all airplanes and we have no false positives.

Perfect precision and recall

On the other hand, if we have an output which looks like this:

Perfect precision: all green dots are airplanes. Not so good recall: there are more airplanes.

We have perfect precision once again. All points reported as an airplane are in fact airplanes. The only problem is the terrible recall. We have not found all airplanes. If those are enemy aircraft, we have a huge problem.

There is one more mistake we can make. Is a classifier good enough if it finds all airplanes, but also reports a lot of empty spots as airplanes?

We detected all enemy aircraft. The only problem is, we waste a lot of jet fuel because we scramble our fighter aircraft way too often and they chase a non-existing enemy.

Perfect recall — the classifier detected all airplanes. Terrible precision — a lot of false positives.
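The three radar outputs above can be sketched in a few lines. The coordinates are invented and the helper name is hypothetical. The only point is to show how the two numbers move in each scenario.

```python
def precision_recall(airplanes, reported):
    """Both arguments are sets of points in space (made-up coordinates)."""
    true_positives = len(airplanes & reported)
    # Precision: share of reported points that really are airplanes.
    precision = true_positives / len(reported) if reported else 0.0
    # Recall: share of real airplanes that were reported.
    recall = true_positives / len(airplanes) if airplanes else 0.0
    return precision, recall


airplanes = {(1, 4), (2, 7), (5, 5), (8, 2)}

scenarios = {
    "finds everything, nothing else": set(airplanes),
    "misses half of the airplanes": {(1, 4), (2, 7)},
    "reports empty spots too": airplanes | {(0, 0), (3, 3), (6, 1), (9, 9)},
}

for name, reported in scenarios.items():
    p, r = precision_recall(airplanes, reported)
    print(f"{name}: precision={p:.2f}, recall={r:.2f}")
# finds everything, nothing else: precision=1.00, recall=1.00
# misses half of the airplanes: precision=1.00, recall=0.50
# reports empty spots too: precision=0.50, recall=1.00
```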

What is more important? Precision or recall?

There are two correct answers to that question:

  • both

  • it depends

In the case of the radar, I assume we need to detect all airplanes, and we can accept some false positives. It is better to chase after a non-existing enemy than to let one of their aircraft get through without being intercepted.


To prove that the tradeoff between precision and recall is, in fact, a business decision, let’s look at an example of a product that needs both precision and recall to be average.

What if we were developing a dating website? Suppose that after registration your clients fill out a survey that you use to train a classifier. The classifier predicts whether another person is a good or bad match for that user. (Yes, I know. It is not the way such websites work. It doesn’t matter. I need an example ;) )

We don’t want perfect precision. If we had such a classifier, the user would see only a few results (the ideal matches), find Mr. or Ms. Right immediately, and never come back to your page. We don’t want that.

We can’t have recall that is too good either, because it is always a tradeoff. Perfect recall would most likely require allowing many false positives (terrible precision). If you display many irrelevant results, the user will be disappointed and will run away to your competition.
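One way to see the tradeoff is to assume the classifier returns a match score and that we show only the candidates above some threshold. That is my assumption for the sake of the sketch, not a description of any real dating website. The scores, labels, and function name below are all made up.

```python
# (match score, is it really a good match?) for eight hypothetical candidates.
scored = [
    (0.95, True), (0.90, True), (0.80, False), (0.70, True),
    (0.60, False), (0.40, True), (0.30, False), (0.10, False),
]


def precision_recall_at(threshold, scored):
    """Precision and recall when we show every candidate scoring >= threshold."""
    shown = [good for score, good in scored if score >= threshold]
    true_positives = sum(shown)
    all_good = sum(good for _, good in scored)
    # Precision is undefined when nothing is shown; 0.0 is this sketch's convention.
    precision = true_positives / len(shown) if shown else 0.0
    recall = true_positives / all_good if all_good else 0.0
    return precision, recall


for threshold in (0.9, 0.65, 0.2):
    p, r = precision_recall_at(threshold, scored)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
# threshold=0.9: precision=1.00, recall=0.50
# threshold=0.65: precision=0.75, recall=0.75
# threshold=0.2: precision=0.57, recall=1.00
```

In this toy data, lowering the threshold shows more candidates, so recall goes up while precision goes down. Where to stop is the business decision.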

What we want is a precision/recall balance that gives the user some hope, so they return to your page every day and use it for as long as possible. Now you also know why dating websites are a scam ;)

If you want to contact me, send me a message on LinkedIn or Twitter.

Bartosz Mikulski * MLOps Engineer / data engineer * conference speaker * co-founder of Software Craft Poznan & Poznan Scala User Group