Why developers doing maintenance mode is best for your business
I speak with customers and consultants across the Equal Experts network to help our customers solve their scaling problems and achieve business agility. One customer recently asked me ‘how do we do maintenance mode in a DevSecOps world?’, and our answer of ‘create multi-product teams’ deserves an explanation.
Maintenance mode is when demand for change declines to zero for live digital services and data pipelines, and unplanned maintenance work supersedes planned value-adding work. Our customers also call this ‘keeping the lights on’, ‘BAU support’, or ‘evergreening’.
Zero demand doesn’t mean zero user needs, zero planned features, or zero faults. Software can’t entirely satisfy users, can’t be finished, and, as Dr. Richard Cook explained, can’t be fault-free. Zero demand means zero funding for more feature delivery, at least for now. Zero demand services go into maintenance mode to preserve availability targets while minimising operational costs. Maintenance work includes upgrading libraries, applying security patches, and fixing faults.
Our customers often feel dedicated delivery teams are too expensive for maintenance mode. They want to resize teams to reduce costs and/or reassign people to unlock capacity for new propositions. But that’s not enough. They also need to protect technical quality and reliability, staff retention, and the option of future feature delivery. At Equal Experts, we refer to this as the Maintenance Mode Problem.
Here’s a comparison of maintenance mode solutions. They all improve team costs and capacity, but their other impacts are very different.
Solution #1 – your operations team
The traditional maintenance mode solution is your operations team. Zero demand services are transitioned from delivery teams into an application support team, which is staffed with operations analysts. That team is accountable for all maintenance work and reliability outcomes. Here’s an example from a composite European retailer.
Using an operations team for maintenance mode does improve run rates and capacity. However, there’s a significant risk of a maintenance dumping ground, with live services lacking any business owners. There’s a negative impact on quality and reliability, staff retention, and future feature delivery. This is caused by:
- High cognitive load. The number of services that can be in maintenance mode is unconstrained, and tasks can become very demanding. Operations analysts have to keep tens or hundreds of different live services in their working memory
- Missing domain knowledge and/or technical skills. Operations analysts might not have the ability to rapidly complete BAU maintenance tasks e.g. updating automated functional tests after a breaking change in a library upgrade
- Lack of intrinsic team motivation. The emphasis on outputs over outcomes creates a perception of a dumping ground. Operations analysts will feel their team doesn’t have a purpose, and become unhappy
- Difficult reverse service transition. It’s painful to transition a service from a delivery team into an operations team at the start of maintenance mode. If funding is later allocated for more feature delivery, the reverse transition into a delivery team is even harder
I know a car repair company with an operations team running ePOS software in maintenance mode. Team members sometimes lack the technical skills for library upgrades, and it impacts ePOS performance. When new regulations are announced, there’s a reverse service transition into a delivery team. When their functional changes are complete, there’s another transition back into the operations team. It’s a time-consuming, costly process.
Solution #2 – developer support teams
Another solution is to form developer support teams. All zero demand services that share an affinity are transferred from their delivery teams into a support team staffed by developers. We usually see this done according to company locations. Here’s the same retailer example, with developer support teams for Berlin and Madrid offices.
Developer support teams have some advantages over an operations team. Developers have the technical skills to rapidly complete maintenance tasks. Transferring a live service between delivery teams is easier than a service transition with an operations team, as their shared culture allows for a handover focussed on domain knowledge. But there’s still a negative impact on quality and reliability, plus staff retention:
- High cognitive load. The number of services that can be in maintenance mode is weakly constrained. Developers may have to know tens or hundreds of live services
- Lack of intrinsic team motivation. Developers may feel they are working in an unrewarding dumping ground of live services, and seek to leave
One public sector organisation has four developer support teams spread across its four delivery centres. One of those teams has to complete maintenance tasks across ~150 microservices, from ~50 live services built by ~20 delivery teams. This has had an inevitable impact on lead time and failure recovery time, despite individual heroics. All four developer support teams have a higher churn rate than their delivery team peers.
Solution #3 – multi-product teams
Our recommended maintenance mode solution is multi-product teams. It’s a logical extension of our preferred You Build It You Run It operating model, and it follows the same principle of outcome-oriented, empowered product teams. All zero demand services in a product vertical are transferred from their product teams into a multi-product team, staffed by developers. Here’s the retailer example again, with multi-product teams in two verticals of multiple domains.
Multi-product teams have advantages over developer support teams and an operations team. The You Build It You Run It operating model ensures a multi-product team has the necessary technical skills, operational incentives, and intrinsic motivation to succeed. It’s easy to transfer live services between a product team and a multi-product team, as there’s a shared ethos of build and run. In addition, cognitive load is limited to the number of services in a vertical, and if that becomes too much, a multi-product team can divide itself by domains.
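To make the grouping rule concrete, here is a minimal Python sketch of how zero demand services might be assigned to multi-product teams: one team per vertical, split by domain only when a vertical holds too many services. The `plan_teams` function, the `(name, vertical, domain)` tuple shape, and the `MAX_SERVICES_PER_TEAM` threshold are all hypothetical. The right limit depends on your teams and services, and this is an illustration rather than a prescribed tool.

```python
from collections import defaultdict

# Hypothetical cognitive load limit; pick a number that suits your own teams.
MAX_SERVICES_PER_TEAM = 10


def plan_teams(services, max_per_team=MAX_SERVICES_PER_TEAM):
    """Group zero demand services into multi-product teams.

    `services` is an iterable of (name, vertical, domain) tuples.
    Returns a dict mapping a team name to a sorted list of service names.
    """
    by_vertical = defaultdict(list)
    for name, vertical, domain in services:
        by_vertical[vertical].append((name, domain))

    teams = {}
    for vertical, members in by_vertical.items():
        if len(members) <= max_per_team:
            # One team per vertical, named after the vertical.
            teams[vertical] = sorted(name for name, _ in members)
        else:
            # Too many services for one team: divide the vertical by domain.
            by_domain = defaultdict(list)
            for name, domain in members:
                by_domain[domain].append(name)
            for domain, names in by_domain.items():
                teams[f"{vertical}/{domain}"] = sorted(names)
    return teams
```

For example, with a limit of two services per team, a made-up Stores vertical with three services across its Checkout and Staffing domains would be split into two teams, while an Online vertical with one service would stay as a single team.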
Multi-product teams need guardrails, to counter any dumping ground preconceptions lurking in your organisation. We suggest:
- Define zero demand. Describe it as a non-differentiating service with 3+ months of live user traffic, where the product manager has declared no more funding exists
- Create identity and purpose. Give a multi-product team the same name as its product vertical, to emphasise the team mission and focus on outcomes over outputs
- Document transfer criteria. Ensure the same criteria are used for transferring a live service between two product teams, or a product team and a multi-product team
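As a rough illustration, the zero demand definition above could be written down as an explicit check, so every team applies the same rule. This is only a sketch. The `Service` fields, the `is_zero_demand` function, and the 90-day approximation of three months are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: "3+ months of live user traffic" is approximated as 90 days.
MIN_LIVE_PERIOD = timedelta(days=90)


@dataclass(frozen=True)
class Service:
    name: str
    differentiating: bool   # does the service differentiate the business?
    live_since: date        # first day of live user traffic
    funding_ended: bool     # product manager has declared no more funding


def is_zero_demand(service: Service, today: date) -> bool:
    """Return True when a service meets all three zero demand conditions."""
    live_long_enough = today - service.live_since >= MIN_LIVE_PERIOD
    return (
        not service.differentiating
        and live_long_enough
        and service.funding_ended
    )
```

A check like this is deliberately strict: a service that is still differentiating, or that has only just gone live, stays with its product team even if funding has paused.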
Conclusion
In a DevSecOps world, you still need a maintenance mode solution. The Maintenance Mode Problem is inevitable. Your non-differentiating digital services and data pipelines can reach zero demand, and that’s OK. Just avoid the traditional maintenance mode solution of using your operations team. It’ll harm your failure recovery time, vulnerability resolution time, staff happiness, and future feature delivery. Instead, create multi-product teams tied to product verticals, and ensure your developers are empowered to protect customer outcomes.