A Guide to Stateful Kubernetes: Federation and Multi-Cluster Explained

February 09, 2021

Cyril Plisko
Replix

So, you've finally decided to use Kubernetes for stateful applications? Congrats! (And good luck.)

But first, let's put the Champagne back on the ice and talk about data — the chain that binds your stateful architecture to a single location. If you're only using a single region, you're in luck, but what happens when the same application needs to run on multiple regions? Or, even worse, multiple clouds?

Stateless applications use service meshes so that the application layer can communicate across clusters. But stateful applications are a different animal. They need their data to be available and in sync wherever they run.

Now you are faced with some tough questions. How can you ensure that your application is running consistently if the distance between the application and its data varies? How would you solve that issue? Perhaps it’s better to not venture into that mess at all?

Stateful Challenges Come at Scale. Remember the CAP Theorem?

If I'm running a database on a Kubernetes cluster, all the pods require access to a local volume to store and read data. In other words, any write made through one pod should be visible to the rest of the pods.

Consistency, the requirement that every read returns the latest write, sounds simple. But if your goal is to enjoy the true benefit of distributed network availability, it is not enough to limit yourself to applications that run close to their data with as little room for error as possible.

Not a problem, you might say. I'll set up a centralized database to take care of all my pods' and clusters' stateful requests.

Congrats, you've just introduced a single point of failure to unite them all. If it goes down, or a network partition cuts it off, none of your pods will have access to data. It is a double-edged sword: you gain consistency, but you give up availability whenever a partition occurs.

Balance is key, and the tradeoff between consistency, availability, and partition tolerance is of paramount importance. Could we solve this by simply adding another cluster?
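To make that tradeoff concrete, here is a minimal sketch in Python. It is a toy model, not a real database: the class names (`Replica`, `ReplicatedStore`, `Unavailable`) and the `prefer_consistency` flag are invented for illustration. It shows only the choice a replicated store faces once its replicas cannot reach each other: refuse the write (stay consistent, lose availability) or accept it locally (stay available, serve stale reads elsewhere).

```python
class Unavailable(Exception):
    """Raised when a consistency-first store refuses a request during a partition."""


class Replica:
    """One copy of the data, living in one cluster (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class ReplicatedStore:
    """Toy store that replicates every write to all replicas when it can."""

    def __init__(self, replicas, prefer_consistency):
        self.replicas = replicas
        self.prefer_consistency = prefer_consistency
        self.partitioned = False  # True when replicas cannot reach each other

    def write(self, via, key, value):
        if self.partitioned:
            if self.prefer_consistency:
                # Consistency first: refuse rather than let replicas diverge.
                raise Unavailable("cannot replicate the write during a partition")
            # Availability first: accept locally, the other replicas go stale.
            via.data[key] = value
            return
        for replica in self.replicas:
            replica.data[key] = value

    def read(self, via, key):
        return via.data.get(key)


# Availability-first store: writes succeed during a partition, reads may be stale.
us, eu = Replica("us"), Replica("eu")
ap_store = ReplicatedStore([us, eu], prefer_consistency=False)
ap_store.write(us, "cart", "v1")
ap_store.partitioned = True
ap_store.write(us, "cart", "v2")
stale_read = ap_store.read(eu, "cart")  # the "eu" replica still holds "v1"

# Consistency-first store: the same write is refused during a partition.
us2, eu2 = Replica("us"), Replica("eu")
cp_store = ReplicatedStore([us2, eu2], prefer_consistency=True)
cp_store.write(us2, "cart", "v1")
cp_store.partitioned = True
try:
    cp_store.write(us2, "cart", "v2")
    refused = False
except Unavailable:
    refused = True
```

Adding another cluster does not remove this choice. It only adds another replica that has to pick a side when the network splits.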

What is Multi-Cluster, and What to Do with It?

Once you've designed and coded your application and you've built containers, in theory, all that is left is the simple task of running them. But getting from code to up and running is not nearly as simple, as anyone who has ever built a containerized application will attest.

Before deploying to the production environment, you need to run various dev/test/stage cycles. You also need to think of scale — your production application may need to run in many different places for reasons like horizontal scalability, resiliency, or close proximity to end-users.

Multi-cluster is a strategy for deploying containerized applications across multiple Kubernetes clusters. Running multiple clusters is common, but the issues start when you need pods in different clusters to communicate with one another.

Multi-cluster use cases:

■ Improved application availability: A single cluster with no counterpart is a single point of failure. Having multiple cloned clusters that can take over if the main cluster fails keeps the application available, region by region.

■ Support for large organizations: Large organizations run multiple clusters in different environments. Multi-cluster management consolidates all clusters into a single management portal, making it possible to deploy applications across multiple availability zones and clusters. Standardizing cluster creation across environments reduces overhead and shortens time to market for features and updates. In addition, multi-cluster deployments are easy to scale.

■ Isolation: Each cluster can be treated as its own fault domain, so a failure in one does not spread to the others. Cluster updates can be phased to reduce the impact of faulty versions or malicious code.

■ Performance: The closer the application is to the end user, the lower the latency and the lower the risk to data in transit.

■ Compliance: Many countries have laws that govern where you can store users' data. Depending on the regulations, you might have to store the data from users in China within the country. Having a system that spans multiple regions enables you to do just that. If you only have a data center in the US, then you're going to have a tough time working with a global user base.
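The performance and compliance use cases above can pull in different directions, so here is a rough sketch of how the two might be combined when choosing where to serve a user. Everything in it is hypothetical: the `Cluster` type, the `pick_cluster` function, the cluster names, and the latency figures are made up for illustration and do not come from any Kubernetes API.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Cluster:
    name: str
    country: str       # country where the cluster stores its data
    latency_ms: float  # measured latency from the user to this cluster


def pick_cluster(clusters: Iterable[Cluster],
                 required_country: Optional[str] = None) -> Cluster:
    """Return the lowest-latency cluster that satisfies the residency rule."""
    allowed = [c for c in clusters
               if required_country is None or c.country == required_country]
    if not allowed:
        raise LookupError(f"no cluster available in {required_country}")
    return min(allowed, key=lambda c: c.latency_ms)


# Hypothetical fleet, as seen by one user.
fleet = [
    Cluster("us-east", "US", 180.0),
    Cluster("cn-north", "CN", 35.0),
    Cluster("sg-1", "SG", 20.0),
]

fastest = pick_cluster(fleet)                           # no residency rule
compliant = pick_cluster(fleet, required_country="CN")  # data must stay in CN
```

With no residency rule the user is sent to the closest cluster. With one, the closest cluster that is allowed to hold the data wins, even if a faster one exists elsewhere.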

Federating Stateful Applications

The idea behind federation is to provide a single configuration to manage the application across multiple clusters or regions.

Federation use cases:

■ Reduced Configuration Management Complexity: A single location to consolidate cluster management. In this use case, no data is shared across clusters, which is why it works well for stateless applications.

■ High Availability (HA): Add cluster redundancy for business continuity planning (BCP), which is also a good fit for stateless applications.
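The "single configuration" idea can be sketched in a few lines of Python. This is not the API of any federation tool. It is only an illustration, with invented names (`render_for_cluster`, `base`, `per_cluster`) and an invented application, of keeping one base definition and deriving each cluster's copy from it with small overrides.

```python
import copy


def render_for_cluster(base, overrides):
    """Return a copy of `base` with per-cluster `overrides` merged in recursively."""
    result = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Merge nested sections instead of replacing them wholesale.
            result[key] = render_for_cluster(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# One base definition, maintained in a single place (hypothetical values).
base = {
    "app": "orders",
    "image": "example/orders:1.0",
    "replicas": 3,
    "labels": {"tier": "backend"},
}

# Only the differences are written down per cluster.
per_cluster = {
    "us-east": {"labels": {"region": "us-east"}},
    "eu-west": {"replicas": 5, "labels": {"region": "eu-west"}},
}

rendered = {name: render_for_cluster(base, overrides)
            for name, overrides in per_cluster.items()}
```

Note that nothing here describes where the application's data lives, which is why this approach is comfortable for stateless workloads and leaves the hard part of stateful ones untouched.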

Stateless applications enjoy the true benefits of multi-cluster and federated Kubernetes; stateful is a different story

The portability of stateless applications lets them run anywhere, but not all applications are stateless. Most applications depend on data, and data does not play by the same rule book.

Data binds the application to its storage location. A physical location becomes an app dependency, and every request for data incurs latency that grows with the distance between the data and the application, resulting in inconsistent service.
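A back-of-the-envelope calculation shows how quickly distance adds up. The sketch below assumes light travels through optical fiber at roughly 200,000 km/s (about two thirds of its speed in a vacuum) and ignores routing, queuing, and processing, so it gives a lower bound only. The function names and the example distances are illustrative.

```python
FIBER_KM_PER_MS = 200.0  # roughly 200,000 km/s, expressed in km per millisecond


def min_round_trip_ms(distance_km):
    """Lower bound on one round trip over fiber, ignoring routing and processing."""
    return 2 * distance_km / FIBER_KM_PER_MS


def request_time_ms(distance_km, round_trips, processing_ms=0.0):
    """Time for a request that needs several sequential round trips to its data."""
    return round_trips * min_round_trip_ms(distance_km) + processing_ms


# Same application, same 10 sequential queries, two hypothetical data locations.
same_region_ms = request_time_ms(distance_km=50, round_trips=10)
cross_ocean_ms = request_time_ms(distance_km=6000, round_trips=10)
```

Under these assumptions the same request takes at least 5 ms when the data is 50 km away and at least 600 ms when it is 6,000 km away, before any real-world overhead is counted.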

When it comes to stateful applications, you can solve those problems by treating your state just as you do your containers. Instead of forcing the application to run where the data happened to be originally provisioned, the data needs to follow the application.

Cyril Plisko is Founder and CTO of Replix


DEVOPSdigest
