StackPulse Debuts Automated Kubernetes Troubleshooting and Remediation Tools
April 28, 2021

StackPulse announced a Kubernetes-centric “operations center” initiative as a part of its Reliability platform.

With these additions, StackPulse gives organizations running Kubernetes a powerful set of capabilities to augment their existing incident response practices. The tools help Site Reliability Engineers (SREs) understand and investigate issues faster and deploy well-tested outage mitigation strategies, which helps prevent customer-facing downtime.

Since Kubernetes is the de facto standard for running containerized applications, StackPulse wanted to create a set of code-based tools engineers could use to operationalize incident response for production Kubernetes-based applications. When an error is detected in a Kubernetes environment, StackPulse automatically executes diagnostic steps to gather information from the clusters and assists engineers in performing root-cause analysis. This automation helps them quickly identify how to mitigate and resolve an issue.
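As a rough illustration of what automated diagnostic steps can look like, the sketch below maps an alert to a short list of standard kubectl commands to run first. It is not StackPulse's implementation or API: the `Alert` class, its fields, and the `diagnostic_commands` function are hypothetical names chosen for this example, and only the kubectl subcommands and flags are real.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    """Hypothetical alert shape for illustration; not StackPulse's data model."""
    namespace: str
    pod: str
    reason: str  # e.g. "CrashLoopBackOff" or "ImagePullBackOff"


def diagnostic_commands(alert: Alert) -> list[list[str]]:
    """Return kubectl invocations (as argv lists) to gather context for an alert."""
    ns = ["-n", alert.namespace]
    # Baseline context gathered for every alert: pod placement, pod details,
    # and recent namespace events in chronological order.
    commands = [
        ["kubectl", "get", "pod", alert.pod, *ns, "-o", "wide"],
        ["kubectl", "describe", "pod", alert.pod, *ns],
        ["kubectl", "get", "events", *ns, "--sort-by=.lastTimestamp"],
    ]
    if alert.reason == "CrashLoopBackOff":
        # For crash loops, also fetch logs from the previous container instance.
        commands.append(
            ["kubectl", "logs", alert.pod, *ns, "--previous", "--tail=100"]
        )
    return commands


if __name__ == "__main__":
    for argv in diagnostic_commands(Alert("prod", "api-0", "CrashLoopBackOff")):
        print(" ".join(argv))
```

The function only builds the commands and does not execute them, so a runner could pass each argv list to `subprocess.run` and attach the output to the alert before it reaches the on-call engineer.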

Additionally, StackPulse has released more than a dozen playbooks built by SRE experts that remediate common Kubernetes problems. Using the StackPulse platform to automate these playbooks significantly reduces the time to resolution, helping teams restore services faster and meet SLOs.

“If you're serious about cloud-native, you're using Kubernetes, but it requires learning new concepts, and tuning applications alongside infrastructure for best performance,” said Leonid Belkind, CTO and Co-Founder of StackPulse. “While developer teams push to adopt K8s due to the benefits in velocity it brings, it can be hard for Ops teams or on-call developers to know how to respond to alerts, or fix issues in production. This leads to costly incidents and outages. What we’re releasing today is a set of automated tools for diagnostics, mitigation, and remediation that help any Kubernetes environment operate with the best practices of planet-scale Kubernetes shops.”

All the Kubernetes tools and automated diagnostics are available in the same platform as StackPulse's incident response functionality, so teams can communicate during outages, centralize event data, and take action to remediate. From detecting issues by correlating signals from multiple sources to enriching alerts sent to on-call teams with root cause and remediation information, StackPulse drastically decreases the customer impact of production issues, helping stop outages in their tracks.

Industry News

April 16, 2025

CodeSecure and FOSSA announced a strategic partnership and native product integration that enables organizations to eliminate security blindspots associated with both third party and open source code.

April 16, 2025

Bauplan, a Python-first serverless data platform that transforms complex infrastructure processes into a few lines of code over data lakes, announced its launch with $7.5 million in seed funding.

April 15, 2025

Perforce Software announced the launch of the Kafka Service Bundle, a new offering that provides enterprises with managed open source Apache Kafka at a fraction of the cost of traditional managed providers.

April 14, 2025

LambdaTest announced the launch of the HyperExecute MCP Server, an enhancement to its AI-native test orchestration platform, HyperExecute.

April 14, 2025

Cloudflare announced Workers VPC and Workers VPC Private Link, new solutions that enable developers to build secure, global cross-cloud applications on Cloudflare Workers.

April 14, 2025

Nutrient announced a significant expansion of its cloud-based services, as well as a series of updates to its SDK products, aimed at enhancing the developer experience by allowing developers to build, scale, and innovate with less friction.

April 10, 2025

Check Point® Software Technologies Ltd. announced that its Infinity Platform has been named the top-ranked AI-powered cyber security platform in the 2025 Miercom Assessment.

April 10, 2025

Orca Security announced the Orca Bitbucket App, a cloud-native seamless integration for scanning Bitbucket Repositories.

April 10, 2025

The Live API for Gemini models is now in Preview, enabling developers to start building and testing more robust, scalable applications with significantly higher rate limits.

April 09, 2025

Backslash Security announced significant adoption of the Backslash App Graph, the industry’s first dynamic digital twin for application code.

April 09, 2025

SmartBear launched API Hub for Test, a new capability within the company’s API Hub, powered by Swagger.

April 09, 2025

Akamai Technologies introduced App & API Protector Hybrid.

April 09, 2025

Veracode has been granted a United States patent for its generative artificial intelligence security tool, Veracode Fix.

April 09, 2025

Zesty announced that its automated Kubernetes optimization platform, Kompass, now includes full pod scaling capabilities, with the addition of Vertical Pod Autoscaler (VPA) alongside the existing Horizontal Pod Autoscaler (HPA).

April 08, 2025

Check Point® Software Technologies Ltd. has emerged as a leading player in Attack Surface Management (ASM) with its acquisition of Cyberint, as highlighted in the recent GigaOm Radar report.