In my last post in our “Beyond DevOps” series, I explained why DevOps has become a constraint on our ability to deliver software to our clients.
Now I’d like to cover why we believe operability to be the most valuable part of DevOps.
What is operability, anyway?
At Equal Experts, we share a set of Technical Values. One of them is “we value software developed with a focus on operation and maintenance in production”, and it’s worth considering it in full:
“Software delivers little to no value unless it is deployed and running in production, with real users using it. We therefore need to understand how the production environments work, what monitoring is in place and how our software is deployed to production. By understanding the existing processes for monitoring and maintaining software in production, we can ensure that we deliver value to our customers and avoid problems at what is traditionally the most stressful time of a project.”
To me, this is all about operability.
Operability is defined in Wikipedia as “the ability to keep a system in a safe and reliable functioning condition, according to predefined operational requirements”. I describe it to our clients as “the operational requirements we deliver to ensure our software runs in production as desired”.
Operability has been around since Jesse Robbins wrote “Operations is a Competitive Advantage” in 2007, but the biggest influence on my thinking has been Andrew Shafer’s talk “What even is Operable?” at Operability 2015.
Andrew spoke about the need to accept that working at scale means always working in a state of partial failure, and argued that an architecture that is aware of its own health is key to easing your operational burden. I couldn’t agree more with this view.
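To make the idea of a health-aware system a little more concrete, here is a minimal Python sketch. It is not taken from Andrew's talk: the names `HealthReport` and `check_health`, and the split between "critical" and "optional" dependencies, are assumptions made purely for illustration. The point is that a service can report "degraded" rather than pretending partial failure never happens.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Check = Callable[[], bool]  # hypothetical: a probe that returns True when a dependency is reachable


@dataclass
class HealthReport:
    status: str              # "healthy", "degraded" or "unhealthy"
    checks: Dict[str, bool]  # result of each individual probe


def run_check(check: Check) -> bool:
    """Run one probe, treating any exception as a failed check."""
    try:
        return bool(check())
    except Exception:
        return False


def check_health(critical: Dict[str, Check], optional: Dict[str, Check]) -> HealthReport:
    """Summarise the service's own health from its dependency probes.

    Assumption for this sketch: losing a critical dependency makes the
    service unhealthy, while losing an optional one only degrades it.
    """
    results = {name: run_check(check) for name, check in {**critical, **optional}.items()}
    if not all(results[name] for name in critical):
        status = "unhealthy"
    elif not all(results[name] for name in optional):
        status = "degraded"
    else:
        status = "healthy"
    return HealthReport(status=status, checks=results)


if __name__ == "__main__":
    # The database is up but the cache is down: the service keeps running in a degraded state.
    report = check_health(critical={"db": lambda: True}, optional={"cache": lambda: False})
    print(report.status)  # prints "degraded"
```

A report like this could be exposed on a health endpoint and consumed by monitoring or a load balancer, so that partial failure is visible rather than discovered during an outage.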
When I speak with our clients, I explain that operability is about more than just the infrastructure automation they’ve been led to expect from “DevOps Engineers”. Operability is about a number of different techniques that help us to run software in production, in both normal and abnormal conditions:
- Requirements – ensuring operational requirements for configuration, infrastructure, security, etc. are planned and prioritised in the same way as functional requirements
- Infrastructure – automating pre-production and production environments, whether for a single application or for multiple applications running on a Platform as a Service
- Telemetry – creating software that can automatically emit operational/business logs and metrics, plus a platform for logging, monitoring, anomaly detection, and alerting
- Deployment Health – adding automated smoke tests of system health into pre-production and production deployments of applications
- Shared On-Call – asking all team members to go on rotation for production incidents, and ensuring they are able to resolve those incidents
- Architecture – restricting failure to a single service, and applying backpressure to regulate demand when a service is under load
- Post-Mortems – holding blameless post-incident reviews to understand the causes of an incident, and implement preventative measures for next time
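As one illustration of the Architecture and Telemetry items above, the following Python sketch applies backpressure with a bounded queue and counts rejected requests as a simple metric. The class name `BoundedWorkQueue` and its methods are hypothetical, and a real service would more likely rely on its framework's or platform's own load-shedding features. Treat it as a sketch of the idea rather than a recommended implementation.

```python
import queue
from typing import Any


class BoundedWorkQueue:
    """Accepts work up to a fixed capacity and rejects the rest.

    Rejecting early tells callers to slow down or retry later, instead of
    letting an overloaded service accumulate an unbounded backlog.
    """

    def __init__(self, capacity: int) -> None:
        self._queue: "queue.Queue[Any]" = queue.Queue(maxsize=capacity)
        self.rejected = 0  # simple counter that could be emitted as a metric

    def submit(self, item: Any) -> bool:
        """Try to enqueue an item; return False if the queue is full."""
        try:
            self._queue.put_nowait(item)
            return True
        except queue.Full:
            self.rejected += 1
            return False

    def take(self) -> Any:
        """Remove and return the oldest item; raises queue.Empty if there is none."""
        return self._queue.get_nowait()


if __name__ == "__main__":
    work = BoundedWorkQueue(capacity=2)
    for request in ["a", "b", "c"]:
        accepted = work.submit(request)
        print(request, "accepted" if accepted else "rejected")
    print("rejected so far:", work.rejected)
```

In an HTTP service, a rejected submission would typically be turned into a "try again later" response, and the `rejected` counter would feed the alerting described under Telemetry.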
All Equal Experts DevOps consultants have a lot of expertise in the above. However, the terms ‘DevOps’ and ‘DevOps Engineer’ are usually associated with automated infrastructure only. So when our clients ask us for help with DevOps, they’re always happy to see us go beyond automated infrastructure and create reliable production systems.
The value of operability
At Equal Experts, we see operability as a key enabler of Continuous Delivery.
Continuous Delivery is all about improving your time to market. As Andrew Shafer said in his talk above, operability is about designing production systems to handle a constant state of partial failure. It’s not much use creating a super-fast deployment pipeline if your production systems cannot be operated safely and reliably – you’ll spend most of your time handling outages, instead of delivering new features for your customers.
This means operational excellence is just as important as a deployment pipeline, automated acceptance tests, cross-functional delivery teams, and so on.
In that spirit, I’m changing my role at Equal Experts. I’m going to move from DevOps Lead to Operability Lead, to emphasise our desire to look beyond DevOps to the delivery of operational requirements to our clients.
As part of this initiative, I’ll be talking with our DevOps community of practice to shift its focus to an operability community of practice, and I’ll be working with our clients to help them understand how operability can help them respond to changing business needs.
This is the second article in our multi-author series “Beyond DevOps”, which aims to explore DevOps and Continuous Delivery – and how they affect our culture and work. In the next part, we’ll be looking at our vision for Continuous Delivery and operability. Keep an eye on the blog or follow us on Twitter for the latest updates.