DevOps and Kubernetes: We've Been Doing It Wrong
May 31, 2023

Tobi Knaup
D2iQ

Platform engineering as a replacement for DevOps has become a hot topic, with provocative critics stoking the controversy by pronouncing DevOps dead(link is external).

The underlying reason for these pronouncements is that the once-radical DevOps model is at odds with the new cloud-native container management model to which the now-obsolete DevOps model is being applied. Let's take a closer look.

A Misapplied Model

Container orchestration platforms and DevOps rose in popularity around the same time. DevOps was born because old centralized platforms like Java EE no longer worked for developers looking to leverage newer languages and development frameworks. When developers wanted to try new programming languages like PHP and Ruby, they weren't able to run those applications on Java EE. This is the context that bred the "you build it, you run it" mentality that is the foundation of DevOps.

The original goals of DevOps were to shorten the innovation cycle, increase agility, and ship more software faster by using automation and removing the Dev-to-Ops wall. However, the emergence of container orchestration platforms turned this shared ownership model upside down.

DevOps encourages decentralization, while container orchestration platforms were designed to be centrally managed for maximum benefit. Container orchestration platforms have carefully designed APIs that separate the concerns of developers and operators.

The original idea behind container orchestration platforms was that a central team provides a secure and resilient platform that abstracts away the complexity of infrastructure so each product team could focus on shipping and improving their products as fast as possible, without having to worry about how to operate them reliably, securely, and efficiently.

This is completely at odds with DevOps. In a sense, DevOps is trying to achieve the opposite of what container orchestration platforms were designed to do. Companies that try to take a maximalist approach with DevOps and encourage every team to build and run their own infrastructure will struggle and won't get the full value these platforms have to offer. They're doing it wrong.

A Hotbed of Inefficiency

The mismatch between traditional DevOps and the new cloud-native container orchestration model breeds a host of problems:

Overwhelming complexity. DevOps teams can't keep up with the amount of work required to manage a secure cloud-native platform. Kubernetes itself is already complex, but to create a production-grade Kubernetes-based platform requires many add-ons to cover functionality such as security, observability, service mesh, applications, and more, with each bringing their own complexity. By having to tend to a complex infrastructure, developers are left with little time to develop applications. Projects are delayed or never make it to production.

Duplicate and disjointed effort. Every DevOps team comes up with their own way to deploy and manage infrastructure, essentially reinventing the wheel. This lack of consistency and standardization wastes resources and undermines the goals of achieving efficiency and reducing costs. This siloed approach also prevents teams from learning from one another. For example, if one team fixes an important issue, other teams don't benefit from it.

Security and resiliency issues. Security and reliability engineering require specialized skills that many DevOps teams don't possess. This leads to insecure and unstable infrastructure.

Manual coding errors. DevOps team spend a good deal of time building and maintaining brittle custom scripts for infrastructure management that leave lots of room for human error and are expensive to maintain.

In this environment, cloud-native projects stall or fail, exacerbated by the complexity introduced by multi-cloud, hybrid, and edge environments, and compounded by complex workloads like artificial intelligence (AI) and machine learning.

These problems exist even when you're using a managed cloud-provider Kubernetes service. The services provide mainly bare bones Kubernetes and offer limited amounts of automation and fleet management capabilities. DevOps teams still need to add a multitude of Day-2 add-ons to create a production environment, and they must build their own automation for operational workflows.

Platform Engineering: A Better Path

Platform engineering is a new name for an old concept that has gained new relevance in the cloud-native era as a cure for cloud and cluster sprawl, wasted resources, and runaway costs.

Containers and WebAssembly (WASM) provide a clean interface, enabling developers to select any language and framework of their choice (unlike Java EE), while enabling a central team to set platform standards and governance.

A clean Kubernetes container-management interface separates the concerns of Dev and Ops, yielding more efficiency and productivity. I know what many developers are going to say: "Platform teams are just going to take our toys away again, take forever to give us something, and will just slow us down and make our lives miserable." But I think it's different this time around. Why? Because there's a simple contract, and the contract is that as long as it fits in a container, it can be deployed.

Cloud-provider Kubernetes services were designed for the DevOps approach. Teams that want to provide an internal developer platform (IDP) for their entire organization need a Kubernetes management platform that provides fleet management capabilities for their entire Kubernetes fleet, whether those clusters are provided by a cloud service like Amazon EKS, Microsoft AKS, or are running somewhere outside the cloud.

Platform Engineering Best Practices

To reap the full benefits of platform engineering, certain processes need to be centralized, while others should be decentralized.

The processes that should be centralized and standardized include:

Cluster lifecycle management. There is no value in having a dozen different ways to bring up a cluster. There is value in having a single way because it makes adding new infrastructure providers (like another cloud service) that much easier.

Security. Maintaining a secure Kubernetes environment requires a specialized skill set that is in short supply. Putting security experts on every team is not cost effective. Centralization is more efficient, enabling shared services like databases (database as a service, or DBaaS) to be secured and properly managed.

Governance/policy management. The point of policy is to be consistent across environments.

Observability. Ops teams need a "god view" of all their environments. This is critical for debugging and optimization. You want all your clusters to run like your best cluster.

Continuous delivery infrastructure. Best achieved through declarative APIs and GitOps.

Cost management. Best achieved through FinOps and integrated monitoring and management tools.

Processes that are better off decentralized and left for developers to decide include:

■ Choice of programming language

■ Choice of development framework

■ Basically anything that goes into a container

A Golden Path to Innovation

Platform engineering provides the best of both worlds in giving DevOps teams a centralized platform approach and decentralized DevOps. The Kubernetes API and containers provide a robust interface that enables division of labor and enables both sides to focus on what they do best.

So is DevOps really dead? Not at all! DevOps concepts such as automation through continuous integration and continuous delivery (CI/CD), site reliability engineering (SRE), and DevSecOps are still considered best practices for product teams. But when it comes to providing a secure, resilient, and cost-effective platform on which multiple teams can deploy their apps, a platform engineering approach makes more sense than the shared ownership model for which DevOps advocates.

Tobi Knaup is Co-Founder and CEO of D2iQ
Share this

Industry News

March 25, 2025

Chainguard announced Chainguard Libraries, a catalog of guarded language libraries for Java built securely from source on SLSA L2 infrastructure.

March 25, 2025

Cloudelligent attained Amazon Web Services (AWS) DevOps Competency status.

March 25, 2025

Platform9 formally launched the Platform9 Partner Program.

March 24, 2025

Cosmonic announced the launch of Cosmonic Control, a control plane for managing distributed applications across any cloud, any Kubernetes, any edge, or on premise and self-hosted deployment.

March 20, 2025

Oracle announced the general availability of Oracle Exadata Database Service on Exascale Infrastructure on Oracle Database@Azure(link sends e-mail).

March 20, 2025

Perforce Software announced its acquisition of Snowtrack.

March 19, 2025

Mirantis and Gcore announced an agreement to facilitate the deployment of artificial intelligence (AI) workloads.

March 19, 2025

Amplitude announced the rollout of Session Replay Everywhere.

March 18, 2025

Oracle announced the availability of Java 24, the latest version of the programming language and development platform. Java 24 (Oracle JDK 24) delivers thousands of improvements to help developers maximize productivity and drive innovation. In addition, enhancements to the platform's performance, stability, and security help organizations accelerate their business growth ...

March 18, 2025

Tigera announced an integration with Mirantis, creators of k0rdent, a new multi-cluster Kubernetes management solution.

March 18, 2025

SAP announced “Joule for Developer” – new Joule AI co-pilot capabilities embedded directly within SAP Build.

March 17, 2025

SUSE® announced several new enhancements to its core suite of Linux solutions.

March 13, 2025

Progress is offering over 50 enterprise-grade UI components from Progress® KendoReact™, a React UI library for business application development, for free.

March 13, 2025

Opsera announced a new Leadership Dashboard capability within Opsera Unified Insights.

March 13, 2025

Cycloid announced the introduction of Components, a new management layer enabling a modular, structured approach to managing cloud resources within the Cycloid engineering platform.