3 DevOps Lessons Learned While Scaling
July 15, 2016

Adam Serediuk
xMatters

Throughout my 15 years in operations I've noticed the same dilemma pop up within most organizations that practice traditional software development – many companies have gotten into the habit of "triage" development. They react to problems defensively, and because the strategy is reactive, triage development occupies most of the ops team's time.

When the ops team is constantly putting out fires, they have no time, people, or tools left for building the actual product. Make no mistake, the operational component is as important as the product itself. Recognizing the cost of constant fire-fighting, many teams take a more proactive approach and try to predict breakdowns.

Failures are inevitable. However, by implementing a few best practices, operations professionals can take back control of their output and build more resilient systems. Here are three lessons I've learned that have allowed my teams to spend more time building up, rather than hunkering down:

1. Automate the B.S.

We've already established that spending your time putting out fires is a sure-fire way to never be productive. So, should you automate your system completely and eliminate the need for human intervention in incidents? Maybe, but that's an awfully lofty goal to start with. You may not need total automation.

The key is to find the right areas to automate. This should be determined strategically, but there will also be an element of trial and error. If you put automation in the wrong place or discover a newer, better way of doing things, don't be afraid to throw code out.

Some people think that constant change causes lines and logic to become muddled. I happen to believe the opposite. Constant change can eliminate problems. Ops teams should not be intimidated by change. Embracing change on a regular basis will make it less scary, and you'll see fewer fires as a result.

This must be balanced with automated testing and QA of Ops code and infrastructure. The same software development approach to unit testing and test plans can and should be applied to Ops, to eliminate regression and enable confidence in change.
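As a minimal sketch of what unit testing Ops code can look like, the Python example below tests a small piece of infrastructure logic. The function `render_upstream`, the host names, and the default port are hypothetical and chosen only for illustration; they do not come from any particular tool. The point is that a helper which generates configuration can be covered by ordinary tests, so a later change that breaks its output is caught before it reaches production.

```python
def render_upstream(name, hosts, port=8080):
    """Render a load-balancer upstream block from a list of backend hosts.

    Hypothetical helper: the output format is illustrative only.
    """
    if not hosts:
        # Refuse to emit a config with no backends rather than fail at reload time.
        raise ValueError("at least one backend host is required")
    lines = [f"upstream {name} {{"]
    # Deduplicate and sort so the rendered config is stable between runs.
    for host in sorted(set(hosts)):
        lines.append(f"    server {host}:{port};")
    lines.append("}")
    return "\n".join(lines)


def test_hosts_are_deduplicated_and_sorted():
    rendered = render_upstream("app", ["b.internal", "a.internal", "b.internal"])
    expected = (
        "upstream app {\n"
        "    server a.internal:8080;\n"
        "    server b.internal:8080;\n"
        "}"
    )
    assert rendered == expected


def test_empty_host_list_is_rejected():
    try:
        render_upstream("app", [])
    except ValueError:
        return
    raise AssertionError("expected ValueError for an empty host list")
```

The test functions use plain assertions, so they can be called directly or collected by a test runner as part of the same pipeline that builds and deploys the code.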

2. Go Beyond IT Automation

Repeat after me: Deploying an enterprise IT automation platform is not the same as adopting DevOps. Developers, systems administrators and operations professionals use these platforms to manage the continuous integration/delivery pipeline that defines agile software development and manage system environments. While IT automation platforms are important for DevOps practitioners, they are in no way the foundation of the model.

Give equal focus to the process – the build, test, release, deploy and monitoring lifecycle – so you can iterate quickly on changes. I've seen far too many DevOps teams focus only on their automation code without giving adequate attention to the software development process and how this code fits into the larger picture. This means having ongoing conversations with your teams, QA, Development and Ops alike.

By ensuring conversations are ongoing, DevOps teams can deploy IT automation without making the situation more complicated for themselves. The team should be able to understand each deployment framework or tool selected to run automation code and where it fits into the big picture. This may mean team meetings and regular messaging, but, hey, communication is what DevOps is all about.

3. Reset Your Definition of Done

As we've seen, constant change is essential for avoiding problems. That's why startups are passing over traditional development for soft releases and continuous, everyday delivery. It's also more stimulating for the team when every day is different, and releasing small, incremental changes is safer than large monolithic releases. Recognizing the changing tide of software development, the industry has developed deployment tools that have unit and integration testing baked into them.

Thanks to these new tools, IT professionals are able to complete tasks to a fuller extent. Not only do they build, they also test and launch. With this power comes responsibility; there's no excuse for leaving anything short of "done." This is where product owners can enable and support the process, by giving equal importance to uptime, continuous delivery and testing as part of story planning.

One of my past roles was Lead Operations Engineer at a mobile and online gaming company, where my team and I built the software and cloud infrastructure. Recently, I spoke to my former boss and he noted how reliably the code I built had performed in the two years since I left. The secret was developing a cohesive system, a complete package, through continuous iteration. Investing in this process allowed us to build a product that adhered to the new definition of "done," which made it good enough to last.

Which brings us to what might be the most important lesson of all: software code should not exist separately from infrastructure code. The reason is simple – infrastructure without software is pointless and software can't exist without infrastructure.

Conway's Law states that "organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations." In short, if teams don't talk to each other, parts won't talk to each other. At the same time, if a product is built to work as a whole, it will work. So, why not put the pieces together?

Adam Serediuk is Director of Operations at xMatters.