“If only the rest of IT moved as fast as we developers do!”

As applications and runtime platforms become more cloud native, the pace of development necessarily becomes faster. IT departments have sold the transition to the cloud as a self-service haven where development teams no longer suffer from the multi-day-ticket hell of days past and are freed to move at their own pace. Developers have long opined, “If only the rest of IT moved as fast as we do!” Well, be careful what you wish for. Now that developers have self-service infrastructure provisioning and runtime platforms with real APIs, they are realizing that their new reality is not quite the utopia it was made out to be. As Stan Lee taught us via Spider-Man, “With great power there must also come great responsibility,” but that is only half the story. The other half is, “With great responsibility there must also come a great deal of work.” Such are the realities of utopias.

So how do development teams who have long promised faster delivery of new features and business value increase their productivity when they are now burdened with the additional responsibility of managing infrastructure, deployments, monitoring, and incident response? The answer always comes back the same: automation, the mantra of DevOps. Successful practitioners of DevOps aggressively automate their development and operations. When done properly, automation reduces a team’s workload, reduces errors (or rapidly creates them), and provides living documentation of a team’s workflow. In this way, automation facilitates rapid dissemination of expertise across an organization, enabling teams to keep pace as demands on their time and skills increase.

Most development teams have automated test suites that execute on every push via a continuous integration (CI) server like Jenkins, TeamCity, Travis CI, or CircleCI. Cloud-native teams have often extended their CI server to perform rudimentary deployment operations, typically using basic workflow engines, domain-specific languages (DSLs), and shell scripts. Sometimes these deployments involve provisioning infrastructure with tools like Terraform, SaltStack, or CloudFormation. Teams practicing DevOps often have automated alerting, and sometimes automated responses, typically built on the native capabilities of their monitoring platform, e.g., Datadog, Honeycomb, PagerDuty, AppDynamics, or Dynatrace, reducing the time they spend watching for and fixing issues that arise.

Automation that is hidden away is just magic, so making the tasks performed by automations visible to the team is essential. ChatOps provides that visible interface, allowing teams to issue commands and see the work done by automations right in the chat client they already use to communicate with their team. Most CI and monitoring solutions provide integrations with popular chat platforms like Slack, Mattermost, and Microsoft Teams.
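To make the ChatOps idea concrete, here is a minimal sketch in TypeScript of the kind of notification such an integration sends: it posts a deployment message to a Slack channel through an incoming webhook. The `SLACK_WEBHOOK_URL` environment variable and the `notifyDeploy` function are illustrative, not any particular product’s API.

```typescript
// Post a deployment notification to a Slack channel via an incoming webhook.
// SLACK_WEBHOOK_URL is a placeholder; see https://api.slack.com/messaging/webhooks
const webhookUrl = process.env.SLACK_WEBHOOK_URL!;

export async function notifyDeploy(service: string, version: string, env: string): Promise<void> {
  const response = await fetch(webhookUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      text: `:rocket: Deployed ${service}@${version} to ${env}`,
    }),
  });
  if (!response.ok) {
    throw new Error(`Slack webhook failed: ${response.status} ${response.statusText}`);
  }
}

// Example: call this from the end of a deployment automation
// await notifyDeploy("checkout-service", "1.4.2", "production");
```

A call like this at the end of each automated step is often all it takes to turn invisible magic into a shared, auditable record of what your automations are doing.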

devOPS

As you can tell from the classes of automation listed above, automation in DevOps has largely focused on the operations side of the methodology: deployment, monitoring, remediation, and reporting. This is not surprising: much of the drive toward automation comes from the migration to the cloud, a platform designed by and for operators. As a result, how we automate has largely been rooted in operations traditions as well. Just as we configured server dæmons with configuration files and coordinated actions with shell scripts, so too our automations are built on tools that ingest YAML or a DSL to define the pieces needed to build, deploy, and operate a modern application: containers, datastores, load balancers, firewall rules, secrets, metrics, logging, etc. Invariably, the limitations of these operations-rooted solutions surface, and we resort to, for example, driving a complicated CI build with a shell script, wrapping Terraform in a service that integrates with our directory server, or scripting the copying of an updated logging configuration to multiple microservice repositories.

In this way, automating operations in operational ways has come to define DevOps to date, and the limitations of this approach become manifest as teams migrate from a handful of on-prem deployable artifacts to dozens or even hundreds of microservices in the cloud. Rather than becoming more productive, teams find that, as development, build, deployment, and operations best practices evolve, they are either drowning in menial maintenance tasks to keep build and deployment configurations up to date or suffering from the rot of letting them diverge.

Fortunately, we have a solution to the current crisis brought on by applying 2008 operations approaches to 2018 DevOps problems: development, the other half of our favorite portmanteau. The advances in software development practices and frameworks over the last decade, combined with API-enabled cloud-native services, provide fertile ground for improving the current practice of DevOps. As the orchestration of services, persistence stores, load balancers, and circuit breakers becomes more complex, rather than working around the limitations of YAML and DSLs, we can use real code. As an example, Pulumi allows developers to define cloud infrastructure in their favorite programming language. This approach is far better: no shelling out, no wrapping tools, and all your favorite frameworks, libraries, test harnesses, and tooling are already at your fingertips.
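For instance, a minimal Pulumi program in TypeScript might define an S3 bucket for build artifacts; the bucket name and tags here are illustrative choices, not requirements.

```typescript
import * as aws from "@pulumi/aws";

// Define an S3 bucket as ordinary TypeScript: loops, functions, and
// unit tests all work, because this is real code, not YAML.
const artifactBucket = new aws.s3.Bucket("build-artifacts", {
    acl: "private",
    tags: { team: "delivery", managedBy: "pulumi" },
});

// Export the generated bucket name so other stacks and scripts can use it.
export const bucketName = artifactBucket.id;
```

Because this is ordinary TypeScript, the definition can be factored into functions, published as a shared library, and tested like any other code.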

DevOps = Events

Events have always been at the core of DevOps; the push that triggers a CI build is the oldest and most basic DevOps event. The event-centric nature of DevOps meshes well with recent trends in software development toward streaming and queue-based data systems like Kafka and SQS. GitHub and other source code management systems have long supported posting webhook payloads for a variety of event types, e.g., pushes, pull request and issue activity, and deployments, allowing you to take action when your webhook endpoint receives a payload. With the recent announcement of GitHub Actions, GitHub makes the taking-action part a lot easier: just provide a Docker container, and GitHub handles ingesting the webhook event payload and triggering and executing the appropriate workflow.
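To make the event-handling half concrete, here is a minimal sketch of a webhook endpoint in TypeScript using Express that reacts to GitHub push events. The port, route, and branch check are arbitrary choices, and a production endpoint would also verify the X-Hub-Signature header before trusting the payload.

```typescript
import express from "express";

const app = express();
app.use(express.json());

// GitHub POSTs a JSON payload here for each subscribed event;
// the X-GitHub-Event header identifies the event type.
app.post("/webhook", (req, res) => {
    const event = req.header("X-GitHub-Event");
    if (event === "push") {
        // ref, after, and repository.full_name are standard fields
        // of the GitHub push event payload.
        const { ref, after, repository } = req.body;
        console.log(`Push of ${after.slice(0, 7)} to ${ref} in ${repository.full_name}`);
        if (ref === "refs/heads/master") {
            // Kick off a build, deployment, or any other automation here.
        }
    }
    res.sendStatus(200);
});

app.listen(3000, () => console.log("Listening for GitHub webhooks on :3000"));
```

Everything inside that handler is plain code: it can branch on any aspect of the event, call any library, and be unit tested without pushing a commit to find out what happens.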

Atomist combines the trends toward development and events, providing an event-based API for software delivery. In addition, Atomist takes an organization-wide view of events and code, greatly easing the management of builds and deployments across many repositories. And lest the monorepo crusaders—you know who you are—look down upon the microservices practitioners and scoff at the troubles their dozens of repositories bring, let’s not forget that monorepos, whose builds and deployments are typically more complicated than those of microservice repositories, are no better served by untestable YAML and shell scripts. The only difference is that when you make a mistake in an untestable monorepo build script, you break everyone’s build.
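To illustrate why delivery logic as real code beats untestable build scripts, consider a sketch of that logic as a plain TypeScript function. The `Push` shape and goal names here are hypothetical illustrations, not Atomist’s actual API.

```typescript
// A hypothetical, simplified model of a push event and delivery goals;
// not Atomist's actual API, just an illustration of delivery logic as code.
interface Push {
    branch: string;
    repo: string;
    changedFiles: string[];
}

type Goal = "build" | "test" | "deploy-staging";

// Because this is an ordinary function, it is trivially unit testable,
// unlike a YAML pipeline definition or a shell script.
export function goalsForPush(push: Push): Goal[] {
    // Docs-only changes need no build at all.
    if (push.changedFiles.length > 0 && push.changedFiles.every(f => f.endsWith(".md"))) {
        return [];
    }
    const goals: Goal[] = ["build", "test"];
    if (push.branch === "master") {
        goals.push("deploy-staging");
    }
    return goals;
}
```

Logic like this can live in one place, be covered by tests, and be applied across every repository in an organization, rather than being copied into each one and left to drift.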

Cloud-native delivery requires real code triggered by events

The progress made thus far from development vs. operations to DevOps is both desirable and laudable, but we cannot rest on our laurels. Applying software development best practices to the practice of DevOps is the future. Or, as Gene Kim put it, we should be “elevating the notion of what is possible through software APIs, liberating us from YAML files and Bash scripts.” Run through the Atomist Developer Quick Start today and be the friendly neighborhood DevOps superhero on your team!

Author's note: This blog post, including its references to Spider-Man, was written prior to Stan Lee's passing. He will be missed. Excelsior!