MLOps engineer, you will need those three books every day!
Do you struggle with MLOps? Does everything seem complicated and weird? It doesn’t have to be like this.
Fortunately, MLOps is a mix of DevOps, data engineering, and ML. If we apply patterns and principles from those three areas, our lives will get much easier.
How do you learn those principles? I recommend reading three books.
Ok… not only reading them. I recommend keeping those books on your desk and referring to them several times a week. You’ll find 80% of the answers you need in them. For the remaining 20%, you’ll need the “learning by doing” approach.
I won’t summarize those books. First of all, I would have to copy their entire content. Second, it would be challenging to pick the important parts because all of them can be important at the right time.
Anyway, let’s get started. What is the first book?
97 Things Every Data Engineer Should Know
We’ll start with data engineering because data matters most.
If you can’t get clean data to train the ML models, nothing else matters.
If you don’t know how your training data was produced, you can’t trust the model.
If you don’t know how to get updated data to retrain the model, you’ll be stuck with a slowly decaying model.
If your data pipeline isn’t properly tested, then to me it’s indistinguishable from a random data generator.
If you fail to identify biases in your data, your model will hurt people.
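To make the testing point concrete, here is a minimal sketch of a test for one pipeline step. The `clean_rows` function, the `price` field, and the sample records are all hypothetical, made up for illustration; a real pipeline would have its own steps and rules.

```python
import math


def clean_rows(rows):
    """Hypothetical pipeline step: keep only rows with a finite, non-negative price."""
    cleaned = []
    for row in rows:
        price = row.get("price")
        if price is None:
            continue  # missing value
        price = float(price)
        if not math.isfinite(price) or price < 0:
            continue  # NaN, infinity, or a negative price
        cleaned.append({**row, "price": price})
    return cleaned


def test_clean_rows_drops_bad_records():
    # Made-up input covering one valid row and three kinds of bad rows.
    raw = [
        {"id": 1, "price": "10.5"},
        {"id": 2, "price": None},
        {"id": 3, "price": float("nan")},
        {"id": 4, "price": -1},
    ]
    result = clean_rows(raw)
    assert [r["id"] for r in result] == [1]
    assert result[0]["price"] == 10.5


test_clean_rows_drops_bad_records()
```

Even a test this small pins down what “clean” means for one step, so a change that starts letting bad rows through fails loudly instead of silently corrupting the training data.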
I recommend the book not only because I contributed one chapter. That said, in my not-so-humble opinion, you should start by reading my chapter. It’ll put everything else you learn in the proper perspective.
Proper data engineering isn’t a nice-to-have. It’s crucial!
If you can’t build a trustworthy data pipeline to get the training data, do the world a favor; remove your model and go home. I’m dead serious. The time of doing machine learning for fun has passed. Today, ML affects people’s lives.
I can’t pinpoint one area covered in this book because it covers a bit of everything related to data engineering. That’s why I recommend it: with so many areas covered, you’ll most likely find something applicable to your situation.
ML Design Patterns
If you ever wonder, “How do I do X in ML?” you should look at the “Machine Learning Design Patterns” book.
To be honest, it’s not revolutionary. You can find all of the information online. So why should you have this book?
It comes in handy when you don’t even know what you’re looking for. Also, it’s helpful when you want to fish out the information written by experts from the sea of texts authored by people who have no clue about ML and repeat the same old cliches.
When I started with MLOps engineering, I was thankful for all of the information from the “Design Patterns for Resilient Serving” section. This part teaches you what qualities you should expect from the ML deployment platform. From the “Reproducibility Design Patterns” chapter, I learned about feature stores and model versioning.
Overall, the book shows you how to recognize a good-quality ML solution. It’s useful when you have to evaluate a few approaches to a problem and figure out which one is the best.
We don’t have new problems in MLOps. Many of the things we struggle with were already solved by backend engineers when the DevOps movement became popular.
Let’s not reinvent the wheel because, as Daniel Molnar says, we may reinvent a flat tire.
We should build on top of existing good practices for distributed software systems, the principles of creating stable software, and building a reliability culture.
Yes, the culture. Because all of those books are useless if you treat them as a checklist or a source of backlog tasks. We need a permanent culture shift in the MLOps teams.
Not so long ago, it was acceptable to get data as a CSV file, copy-paste pieces of code from tutorials, and glue them together in a Jupyter Notebook.
We aren’t here to get the model from ML engineers, hack a semi-stable runtime environment, deploy the model, and hope for the best.
MLOps is all about creating proper software engineering practices around machine learning.