Recommendations vs. raw data: which is better?

Is it better to show users recommendations or raw data? On the one hand, raw data allows them to draw their own conclusions. It gives them control. It may even give people the satisfaction of discovering some insights, even if those insights are very obvious. On the other hand, what if you want them to take a specific action and they misinterpret the data?

In the “Data Nerdism at Large” episode of the DataFramed podcast, Mara Averick says that data visualization can be helpful and misleading at the same time.


She talks about a study of the toxicity of chemicals which polluted one community.

After the study, the researchers showed the people a scatter plot with a line marking the threshold at which the substances become carcinogenic. On the same chart, there were points representing other people in the study and one point representing the household of the person who was looking at the chart.

Surprisingly often, somebody who was way, way above the threshold did not worry about it as long as they saw that they were below other houses in their neighborhood.


What problem does this show? There are two ways of looking at this situation. Which one you choose is, in my opinion, strongly correlated with your place on the political spectrum, but I cannot prove that.

What if the people understood the significance of that information, but they ignored it because that was the more comfortable approach? They saw that other people were in a worse situation. That justified their lack of action, so they anchored to that information and neglected everything else.

Maybe we have a problem with understanding that if we are dying, it does not matter that other people are dying faster.

That is one explanation. Let’s look at the second one.

What if the people who participated in the study understood the data, knew that they had a huge problem, but for some reason were unable to deal with it, so they were looking for something that gave them hope?

What if the only recommendation they could get was “move somewhere else”? If they lived in such a polluted area, we can assume that moving out was beyond their means. What if they ignored the advice because they could not afford to take action?

Recommendation vs. raw data

That brings us back to the original question. Is it better to show raw data or to give recommendations?

In my opinion, we should offer a summary of the data (perhaps in the form of a recommendation), the raw data, and some explanation of the process we followed to get the recommendation. It seems that we should emphasize the importance of explainability not only in machine learning models, but also in data visualizations.
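As a rough illustration of that idea, here is a minimal Python sketch that returns all three parts together: the recommendation, the raw numbers, and a short explanation of how the recommendation was derived. Everything in it is an assumption made for the example: the `Report` class, the `build_report` function, the values, and the threshold are hypothetical and do not come from the study mentioned in the podcast.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Report:
    recommendation: str  # the summary the reader sees first
    raw_data: dict       # the numbers behind the summary
    explanation: str     # how the summary was derived


def build_report(own_value: float, neighbor_values: list[float], threshold: float) -> Report:
    """Compare a household to the safety threshold, not to its neighbors."""
    # Hypothetical rule: only the threshold decides the recommendation.
    if own_value > threshold:
        recommendation = "Above the safety threshold: action recommended."
    else:
        recommendation = "At or below the safety threshold."

    raw_data = {
        "own_value": own_value,
        "threshold": threshold,
        "neighbor_median": median(neighbor_values),
    }
    explanation = (
        "The recommendation depends only on whether your value exceeds the "
        "threshold. Neighbor values are shown for context and do not change it."
    )
    return Report(recommendation, raw_data, explanation)


# Made-up numbers: a household below its neighbors but above the threshold.
report = build_report(own_value=8.0, neighbor_values=[9.5, 12.0, 15.0], threshold=5.0)
print(report.recommendation)
print(report.raw_data)
print(report.explanation)
```

The point of the design is that the comparison with neighbors is still visible in the raw data, but the explanation states plainly that it plays no role in the recommendation.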

Most importantly, we should never judge the consumers of our recommendations. They are doing what is best for them, even if their definition of “best” is entirely different from ours.

Bartosz Mikulski: MLOps Engineer / data engineer, conference speaker, co-founder of Software Craft Poznan & Poznan Scala User Group
