First Principles for the MLOps Engineer
June 27, 2022

Taimur Rashid
Redis

Launching an airplane from an aircraft carrier is a systematic and well-coordinated process that involves reliable systems, high-performance catapults, precise navigation, and above all, a specialized crew with distinct roles and responsibilities for managing air operations. This crew, the flight deck crew, wears colored jerseys to visually distinguish its functions; everyone on the flight deck has a specific job. By analogy, launching machine learning (ML) models into production is not entirely different, except that instead of launching a 45,000-pound plane into the air, ML teams are launching trained ML models into production to serve predictions.

Several terms have been used to describe this function of taking trained ML models and launching them into production. One of them is MLOps engineering, which can be defined as the technical systems and processes associated with the stages of the ML lifecycle (also referred to as the MLOps cycle), from data preparation and model building through production deployment and management.

While MLOps engineering entails the provisioning, deployment, and management of the infrastructure that enables model building, data labeling, and model inference, it can go much deeper than that; it can also entail developing algorithms.

Mature IT functions like data engineering, data preparation, and data quality all have corresponding personas that perform specific tasks, or in the frequently mentioned parlance, "Jobs to Be Done."

ML engineering also has a specific persona, and that is the MLOps Engineer. What do MLOps Engineers do?

For the sake of simplicity: MLOps Engineers design, deploy, and operate the underlying systems (infrastructure) that allow data science teams to do their jobs, which include feature engineering, model training, model validation, and model refinement, to name a few. MLOps Engineers also automate the processes around those needs so that the work involved in launching ML models into production is streamlined, simplified, and instrumented.
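To make that concrete, here is a minimal, hypothetical sketch of the kind of automation involved: a single train-validate-promote step written with scikit-learn and joblib. The dataset, accuracy threshold, and artifact path are illustrative assumptions, not a prescription for any particular team or stack.

```python
# Hypothetical sketch of an automated train -> validate -> promote step.
# The dataset, threshold, and artifact path are placeholders for illustration.
from pathlib import Path

import joblib
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

ACCURACY_THRESHOLD = 0.90              # promotion gate (assumed value)
ARTIFACT_PATH = Path("model.joblib")   # stand-in for a model registry

def run_pipeline() -> bool:
    # 1. Data preparation: load and split a sample dataset.
    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.2, random_state=42
    )

    # 2. Model training: the data science team's model of choice.
    model = LogisticRegression(max_iter=5000)
    model.fit(X_train, y_train)

    # 3. Model validation: gate promotion on a held-out metric.
    accuracy = accuracy_score(y_val, model.predict(X_val))
    print(f"validation accuracy: {accuracy:.3f}")
    if accuracy < ACCURACY_THRESHOLD:
        print("below threshold; model not promoted")
        return False

    # 4. "Launch": persist the validated artifact for the serving layer.
    joblib.dump(model, ARTIFACT_PATH)
    print(f"model promoted to {ARTIFACT_PATH}")
    return True

if __name__ == "__main__":
    run_pipeline()
```

In practice a step like this would run inside a scheduler or CI/CD workflow, with the artifact going to a proper model registry rather than a local file, but the shape of the work is the same: the MLOps Engineer turns a manual notebook workflow into an automated, repeatable, instrumented step.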

Just like any other IT role, there is a broad spectrum of functional tasks MLOps Engineers can undertake. Fundamentally, an MLOps Engineer fuses software engineering expertise with knowledge of machine learning.

While the number of tools, frameworks, and approaches continues to expand and evolve, certain skill sets are needed that transcend the specific tools and frameworks. That's why it's important to ground the discussion on first principles. There is a core list of skill sets an MLOps Engineer needs to carry out these tasks, and while not all of them are required, the tasks an MLOps Engineer undertakes are a function of the existing composition, size, and maturity of the broader ML team.

Some of these first principles or core skill sets entail:

1. Programming experience

2. Data science knowledge

3. Familiarity with math and statistics

4. Problem-solving skills

5. Proficiency with machine learning and deep learning frameworks

6. Hands-on experience with prototyping

Related to these core skill sets are knowledge of and experience with programming languages, DevOps tools, and databases (relational, data warehousing, in-memory, etc.). There are a variety of online resources that unpack the details of these skill sets, and the list continues to evolve as more companies mainstream ML across their teams.
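As a companion to the pipeline sketch above, here is a small, hypothetical example of where those skill sets meet: loading the promoted artifact and serving predictions over HTTP. Flask is used purely for illustration; the endpoint name and artifact path are assumptions that match the earlier sketch, not a required layout.

```python
# Hypothetical serving sketch: expose the promoted model over HTTP.
# Flask is one common choice; the artifact path matches the pipeline sketch.
import joblib
from flask import Flask, jsonify, request

app = Flask(__name__)
model = joblib.load("model.joblib")  # artifact produced by the pipeline step

@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON body like {"features": [[...], [...]]}
    payload = request.get_json(force=True)
    predictions = model.predict(payload["features"]).tolist()
    return jsonify({"predictions": predictions})

if __name__ == "__main__":
    # Development server only; production serving would sit behind a
    # WSGI server, load balancer, monitoring, and model versioning.
    app.run(host="0.0.0.0", port=8080)
```

Wiring this kind of endpoint into deployment tooling, backing it with the right data stores, and keeping it observable is exactly where the programming, DevOps, and database experience listed above comes together.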

While definitions are important, the industry is still early in defining MLOps engineering and in characterizing the roles and responsibilities of an MLOps Engineer. In the journey towards understanding this domain, and the associated education and learning paths to becoming an MLOps Engineer, it's important not to be too dogmatic. By focusing on the Jobs to Be Done, and applying that focus to the context of the project, company processes, and maturity of teams, companies can better structure and define the MLOps engineering crew that can launch ML models into production.

Taimur Rashid is Chief Business Development Officer at Redis