Re: DataOps Principles: How Startups Do Data The Right Way

Recently, I found an article about DataOps written by Brad Ito. In it, Brad suggests adopting a new approach to data, which he calls “DataOps.” It is supposed to empower individuals, and it relies heavily on automation.

DevOps

The idea seems good. The only thing that concerns me is how badly we failed at adopting DevOps. In many companies, it was done by renaming the role of system administrator to “DevOps.” That is a failure because DevOps was supposed to be a new organizational culture, not a fancy new name for an existing position.

DevOps was supposed to break silos within organizations. It should no longer be required to have a separate role for people who set up the production or development environments. There should not be designated database administrators.

Moreover, there should be no developers and testers. All of those people form one team. Every one of them should be able to do the work of someone else. They may specialize, but in general, the group as a whole should be able to operate even when the specialist is not available.

DataOps

I would expect something similar from DataOps, but it does not seem that anything like this is going to happen any time soon. We keep dividing the data teams.

First, we had data scientists and data engineers. At some point, the role of machine learning engineer emerged. I have no idea why it exists. To me, it seems that some data engineers wanted to do machine learning but were not allowed to, so they made up a new role. I have even seen “AI engineer” job offers.

If we genuinely want to build something like DataOps, this is the wrong approach. We should not be splitting data roles into small, extremely specialized positions that cannot do anything on their own because they need the support of four different specialists to deliver value to the users.

That is completely wrong. People should specialize, but at the same time, we must make sure that nobody becomes a bottleneck.

The real DevOps/DataOps

For me, that is the true spirit of DataOps. We should build teams that operate like soldiers. In the military, people specialize too. They have snipers, medics, machine gun operators, radio operators, drivers, commanders, explosives specialists, and so on, but every one of them can do the job of everyone else.

All of the soldiers can shoot or throw a grenade. If something happens to the medic, they won’t bleed out because everyone else knows a lot about first aid. All of them know how to use the radio. There is a chain of command, so they can get the job done even if the team leader gets killed. Medics are certainly not going to be great snipers, but they may be good enough.

We need software teams that work similarly. It should be unacceptable to have a team that cannot make any progress because the front-end developer is sick. It should be unacceptable to have data scientists sitting idle while they wait for the data engineers to import data from a database.
