DevOps and Kubernetes: We've Been Doing It Wrong
May 31, 2023

Tobi Knaup
D2iQ

Platform engineering as a replacement for DevOps has become a hot topic, with provocative critics stoking the controversy by pronouncing DevOps dead.

The underlying reason for these pronouncements is that the once-radical DevOps model is at odds with the cloud-native container management model to which it is now being applied. Let's take a closer look.

A Misapplied Model

Container orchestration platforms and DevOps rose in popularity around the same time. DevOps was born because old centralized platforms like Java EE no longer worked for developers looking to leverage newer languages and development frameworks. When developers wanted to try new programming languages like PHP and Ruby, they weren't able to run those applications on Java EE. This is the context that bred the "you build it, you run it" mentality that is the foundation of DevOps.

The original goals of DevOps were to shorten the innovation cycle, increase agility, and ship more software faster by using automation and removing the Dev-to-Ops wall. However, the emergence of container orchestration platforms turned this shared ownership model upside down.

DevOps encourages decentralization, while container orchestration platforms were designed to be centrally managed for maximum benefit. Container orchestration platforms have carefully designed APIs that separate the concerns of developers and operators.

The original idea behind container orchestration platforms was that a central team provides a secure and resilient platform that abstracts away the complexity of infrastructure so each product team could focus on shipping and improving their products as fast as possible, without having to worry about how to operate them reliably, securely, and efficiently.

This is completely at odds with DevOps. In a sense, DevOps is trying to achieve the opposite of what container orchestration platforms were designed to do. Companies that try to take a maximalist approach with DevOps and encourage every team to build and run their own infrastructure will struggle and won't get the full value these platforms have to offer. They're doing it wrong.

A Hotbed of Inefficiency

The mismatch between traditional DevOps and the new cloud-native container orchestration model breeds a host of problems:

Overwhelming complexity. DevOps teams can't keep up with the amount of work required to manage a secure cloud-native platform. Kubernetes itself is already complex, but creating a production-grade Kubernetes-based platform requires many add-ons to cover functionality such as security, observability, service mesh, applications, and more, each bringing its own complexity. Because they have to tend to a complex infrastructure, developers are left with little time to develop applications. Projects are delayed or never make it to production.

Duplicate and disjointed effort. Every DevOps team comes up with its own way to deploy and manage infrastructure, essentially reinventing the wheel. This lack of consistency and standardization wastes resources and undermines the goals of achieving efficiency and reducing costs. This siloed approach also prevents teams from learning from one another. For example, if one team fixes an important issue, other teams don't benefit from it.

Security and resiliency issues. Security and reliability engineering require specialized skills that many DevOps teams don't possess. This leads to insecure and unstable infrastructure.

Manual coding errors. DevOps teams spend a good deal of time building and maintaining brittle custom scripts for infrastructure management that leave lots of room for human error and are expensive to maintain.

In this environment, cloud-native projects stall or fail, a problem exacerbated by the complexity of multi-cloud, hybrid, and edge environments, and compounded by complex workloads like artificial intelligence (AI) and machine learning.

These problems exist even when you're using a managed cloud-provider Kubernetes service. These services provide mainly bare-bones Kubernetes and offer limited automation and fleet management capabilities. DevOps teams still need to add a multitude of Day-2 add-ons to create a production environment, and they must build their own automation for operational workflows.

Platform Engineering: A Better Path

Platform engineering is a new name for an old concept that has gained new relevance in the cloud-native era as a cure for cloud and cluster sprawl, wasted resources, and runaway costs.

Containers and WebAssembly (Wasm) provide a clean interface that lets developers choose any language and framework (unlike Java EE) while letting a central team set platform standards and governance.

A clean Kubernetes container-management interface separates the concerns of Dev and Ops, yielding more efficiency and productivity. I know what many developers are going to say: "Platform teams are just going to take our toys away again, take forever to give us something, and will just slow us down and make our lives miserable." But I think it's different this time around. Why? Because there's a simple contract, and the contract is that as long as it fits in a container, it can be deployed.
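To make the contract concrete, here is a minimal sketch in Python. It is an illustration only, not how Kubernetes or any particular platform implements admission: the names Workload, PlatformPolicy, and admit, the registry hostname, and the resource limits are all hypothetical. The point it tries to show is that the platform checks only container-level properties and never asks which language or framework is inside the image.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Workload:
    """What a product team hands to the platform: a container and its resource needs."""
    name: str
    image: str            # language and framework are invisible at this level
    cpu_millicores: int
    memory_mib: int


@dataclass(frozen=True)
class PlatformPolicy:
    """Standards set once by the central platform team (hypothetical fields)."""
    allowed_registry: str
    max_cpu_millicores: int
    max_memory_mib: int


def admit(workload: Workload, policy: PlatformPolicy) -> List[str]:
    """Return policy violations; an empty list means the workload is deployable."""
    violations = []
    if not workload.image.startswith(policy.allowed_registry + "/"):
        violations.append(f"{workload.name}: image must come from {policy.allowed_registry}")
    if workload.cpu_millicores > policy.max_cpu_millicores:
        violations.append(f"{workload.name}: CPU request exceeds {policy.max_cpu_millicores}m")
    if workload.memory_mib > policy.max_memory_mib:
        violations.append(f"{workload.name}: memory request exceeds {policy.max_memory_mib} MiB")
    return violations


# Hypothetical example values, chosen only for illustration.
policy = PlatformPolicy("registry.example.internal", 2000, 4096)
ruby_app = Workload("cart", "registry.example.internal/shop/cart-ruby:1.0", 500, 512)
php_app = Workload("blog", "registry.example.internal/web/blog-php:2.3", 250, 256)
outside = Workload("demo", "docker.io/someone/demo:latest", 4000, 512)

for w in (ruby_app, php_app, outside):
    problems = admit(w, policy)
    print(w.name, "admitted" if not problems else problems)
```

In this sketch the Ruby and PHP workloads are admitted on identical terms, while the third is rejected for reasons that have nothing to do with its contents: it comes from an unapproved registry and asks for too much CPU.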

Cloud-provider Kubernetes services were designed for the DevOps approach. Teams that want to provide an internal developer platform (IDP) for their entire organization need a Kubernetes management platform that provides fleet management capabilities for their entire Kubernetes fleet, whether those clusters are provided by a cloud service like Amazon EKS or Microsoft AKS or are running somewhere outside the cloud.

Platform Engineering Best Practices

To reap the full benefits of platform engineering, certain processes need to be centralized, while others should be decentralized.

The processes that should be centralized and standardized include:

Cluster lifecycle management. There is no value in having a dozen different ways to bring up a cluster. There is value in having a single way because it makes adding new infrastructure providers (like another cloud service) that much easier.

Security. Maintaining a secure Kubernetes environment requires a specialized skill set that is in short supply. Putting security experts on every team is not cost effective. Centralization is more efficient, enabling shared services like databases (database as a service, or DBaaS) to be secured and properly managed.

Governance/policy management. The point of policy is to be consistent across environments.

Observability. Ops teams need a "god view" of all their environments. This is critical for debugging and optimization. You want all your clusters to run like your best cluster.

Continuous delivery infrastructure. Best achieved through declarative APIs and GitOps.

Cost management. Best achieved through FinOps and integrated monitoring and management tools.
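The declarative, GitOps-style approach mentioned above for continuous delivery and cluster lifecycle management can be sketched as a reconciliation step: the desired state lives in version control, and a single central routine compares it with what is actually running. The Python below is a simplified illustration under stated assumptions. The reconcile function, the cluster names, and the version strings are hypothetical, and real GitOps tools do considerably more than this.

```python
from typing import Dict, List, Tuple

Action = Tuple[str, str, str]  # (verb, cluster name, Kubernetes version)


def reconcile(desired: Dict[str, str], observed: Dict[str, str]) -> List[Action]:
    """Compare desired cluster versions (as declared in Git) with observed ones.

    Returns the actions needed to converge, ordered by cluster name.
    """
    actions: List[Action] = []
    for name in sorted(desired.keys() | observed.keys()):
        want = desired.get(name)
        have = observed.get(name)
        if have is None:
            actions.append(("create", name, want))    # declared but not running
        elif want is None:
            actions.append(("delete", name, have))    # running but no longer declared
        elif want != have:
            actions.append(("update", name, want))    # running at the wrong version
    return actions


# Hypothetical fleet: what Git declares versus what is actually running.
desired_state = {"prod-eu": "1.27", "prod-us": "1.27"}
observed_state = {"prod-eu": "1.26", "staging": "1.27"}

for action in reconcile(desired_state, observed_state):
    print(action)
```

Because every cluster passes through the same routine, there is one way to bring up, change, or retire a cluster, which is the consistency argument made in the list above.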

Processes that are better off decentralized and left for developers to decide include:

■ Choice of programming language

■ Choice of development framework

■ Basically anything that goes into a container

A Golden Path to Innovation

Platform engineering provides the best of both worlds: a centralized platform combined with decentralized DevOps within product teams. The Kubernetes API and containers provide a robust interface that enables a division of labor and lets both sides focus on what they do best.

So is DevOps really dead? Not at all! DevOps concepts such as automation through continuous integration and continuous delivery (CI/CD), site reliability engineering (SRE), and DevSecOps are still considered best practices for product teams. But when it comes to providing a secure, resilient, and cost-effective platform on which multiple teams can deploy their apps, a platform engineering approach makes more sense than the shared ownership model that DevOps advocates.

Tobi Knaup is Co-Founder and CEO of D2iQ