What is machine learning operations (MLOps), and why is it important?
MLOps (or ML Ops) is a set of practices and tools that aims to make productionizing, deploying, and maintaining machine learning models reliable and efficient.
MLOps can make a big difference for you if you recognize at least one of the following situations:
- Moving ML applications from sandbox to production is complex, or even nearly impossible
- The time to market for ML products is unsatisfactory
- ML products often break and teams are reluctant to update them
- You often encounter problems that you did not anticipate when moving ML to production or supporting it there
In our experience, applying MLOps to ML projects significantly increases their success rate and drastically shortens both time to market and time to first value. And the best part: no surprises!
What is a machine learning operations canvas?
A machine learning operations (MLOps) canvas is a template used for describing and visualizing production systems with machine learning. It covers a wide range of aspects that have to be taken into account to sustainably run machine learning applications in production.
Product owners, managers of data science and ML engineering teams, analytics translators, and other business-oriented professionals can use the MLOps canvas to assess the current state of a machine learning project, and to plan efforts for operating or bringing it to production.
Data solutions architects, lead data scientists, lead ML engineers, and other senior tech specialists can use the canvas to refine requirements for ML applications and to communicate needs and decisions to the rest of the organization.
There are quite a few machine learning canvases and similar instruments out there. However, we felt that they lacked a focus on machine learning operations. We made sure to add this focus to our MLOps canvas, and we augmented it with a number of questions that help create a full picture of the current or future state of a machine learning system.
Xomnia’s MLOps canvas
We have created a canvas focused on machine learning operations. Although every aspect of a machine learning project deserves its own document, this canvas concentrates on the details related to running ML in production.
We recommend that business and technical users fill in the canvas together. With the canvas, we aim to help both sides gain a broader perspective, ask the right questions, look ahead, and design lean systems.
When you download Xomnia’s MLOps canvas, you will find two versions: the first contains a rather broad list of questions that aim to spark thoughts and curiosity, and the second is more suitable for filling out. Naturally, some of the questions will not apply to certain situations or to certain stages of the ML product lifecycle, so do not expect to fill in every section for every machine learning project.
About the authors
This canvas was created by Xomnia's “Architects”, a group of machine learning experts and analytics translators led by Andrey Kudryavets, lead data engineer at Xomnia. Members of the group have strong backgrounds in strategic management, MLOps, software development, and data science, as well as years of experience in bringing machine learning systems to production and supporting them there for a broad range of clients (government; healthcare; defense, safety, and security; education; retail; and aviation). The canvas distills the group's practical experience and hard-won lessons.