Thorben Louw

Data/Machine Learning Engineer
AI

October 27, 2025

Accelerating Architectural Decision Records (ADRs) with Generative AI

In any complex, long-lived software project, such as those involving evolving client platforms, two things are constant: change and discussion. Decisions get made in meetings, over chat, or during pairing sessions. But if those decisions aren’t documented, they become implicit knowledge, which can be lost when people move on. This is a recipe for wasted cycles, rework and inefficiency when new team members join, or when future initiatives need to extend or migrate implementations.

Architectural Decision Records (ADRs) are a lightweight, effective and popular way to capture these choices as they happen. They fill the critical gap between high-level architectural documents (which don’t contain this level of detail) and low-level designs (which are often out-of-date with the actual implementation, and at the wrong level of abstraction).

On recent large-scale platform projects where we’ve advocated for the use of ADRs, we’ve been asked to retroactively document dozens of critical architectural choices that had never been formally recorded. Generating these ADRs quickly, while keeping them valuable and accurate, would once have been a daunting task.

We chose to treat this as a challenge: how could we leverage Generative AI to bring discipline and speed to a manual, time-consuming process, producing outputs that are as high-quality as if we had written every word ourselves? And how well does this approach scale?

Models and tools

Our approach deliberately avoided expensive, specialised tools or complicated integrations. We used widely accessible Large Language Models (LLMs) – specifically Google’s Gemini (2.5 Pro and Flash) and OpenAI’s ChatGPT – via their standard web interfaces. Both are commonly approved for client use (often alongside Copilot).

This ‘low tool barrier to entry’ allowed us to focus on the core technique, rather than any specific model or tool. We engineered the prompts to enforce structure, tone, and technical rigour, while ensuring a wide variety of stakeholders could participate in the process.

Structuring decisions with GenAI

ADRs are meant to be atomic (each records a single, granular decision) and immutable (never edited once published; new ADRs supersede old ones instead). Together they form a clear trail of decision-making (“when did we decide to use XYZ, and why?”) that outlives the original project team.
The beauty of the ADR format lies in its simplicity: ADRs are often just lightweight text documents in a repository or wiki.

What do ADRs look like?

ADRs don’t have a fixed structure and conventions vary, but they typically include some or all of the following sections:

  • A unique identifier and a short, human-readable title
  • The status of the decision (e.g. proposed, accepted, superseded)
  • The background or context in which the decision was made
  • The alternatives that were considered
  • The decision itself, and its consequences

Generating ADRs from a short decision statement or “kernel of truth”

Although ADRs are lightweight, much of their content is still structure, convention, and background for readers who lack the implicit context behind the decision. The kernel of the decision itself, however, is often just a short sentence or two, like:

“On 22 April we decided to use DBT (Data Build Tool) as our transformation framework because it’s SQL-based, open-source, and already works with Databricks”.

GenAI excels at taking this “kernel of truth” along with some project context, and generating the surrounding narrative, structure, and suggested decision context for human review.

We’ve found it useful to start by focusing on these “kernel of truth” decision statements without getting bogged down in a complete ADR’s structure and details right away. We can rapidly build up a list of “one-liners” that will become potential ADRs by using AI tools to inspect code and existing project documentation, and through conversations with stakeholders. For each of these one-liners, we can try to reconstruct some of the decision context, like when it was made, by looking at code or documentation changes (e.g. the date at which a library was introduced, and by whom).
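
To illustrate, such a backlog might start out as nothing more than a dated list of candidate decisions, as in the sketch below. The first entry reuses the dbt example above; the remaining entries, dates and owners are hypothetical placeholders for the kind of detail we try to reconstruct.

```markdown
# Candidate ADRs – one-line "kernels of truth" to expand later (illustrative entries only)

- 22 April – Use dbt as our transformation framework (SQL-based, open-source, already
  works with Databricks) – date taken from the commit that introduced dbt
- <date unknown> – <decision inferred from the high-level design, e.g. choice of
  orchestration tool> – confirm date and owner with stakeholders
- <date unknown> – <decision inferred from code, e.g. the introduction of a particular
  library> – reconstruct the date and author from version-control history
```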

The Metaprompt

With our collection of one-line “kernel of truth” statements ready, we can then feed them into a process using AI tooling to quickly generate strong first drafts of ADRs.

We used a metaprompting approach – designing an LLM prompt that acts as a template for creating a bespoke, decision-specific prompt for each decision – which can then be used in a fresh LLM chat context to generate the detailed ADR document.

The prompt instructs an LLM to produce an ADR based on a preferred structure, style, and tone (e.g., professional, succinct, UK English spellings).

We also embed rules directly into the prompt to ensure consistency and quality:

  • ADRs must have a unique identifier in a form like “ADR-ABC-0028”.
  • ADRs should have a short, human-readable title that summarises the decision.
  • The ADR should follow this structure:

```markdown
# ADR-XYZ-<number>: <The title here>

## Background
Describe the context for the decision. Why did it need to be made? What was the state before the decision?

## Alternatives
Think HARD to consider good, applicable alternatives. Search the web for the latest options.

## Decision
Call out the actual decision here.
```

These templates are easily customised, and they shouldn’t be treated as one-size-fits-all or allowed to ignore a client’s standards and ways of working. Instead, we collaborate with each client to refine the prompt, ensuring the client is left with a reusable tool that works effectively within their organisation.
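
To make this concrete, a trimmed-down metaprompt might look something like the sketch below. The wording is ours and purely illustrative; a real metaprompt would also carry project context, the client’s ADR conventions, and the full template shown above.

```markdown
You are helping us document architectural decisions as ADRs.

Given the one-line decision statement and project context below, write a *prompt*
that we can paste into a fresh chat to generate the full ADR. That generated prompt
must instruct the model to:

- Use a unique identifier in a form like ADR-ABC-0028, plus a short, human-readable title
- Follow our ADR structure (Background, Alternatives, Decision)
- Write in a professional, succinct tone using UK English spellings
- Think HARD about good, applicable alternatives and search the web for the latest options
- Only cite references that actually exist, and never invent product features
- Suggest appropriate reviewers (e.g. security, networking, data privacy) based on the content

Decision statement:
"<one-line kernel of truth goes here>"

Project context:
"<short description of the platform, constraints and ways of working goes here>"
```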

Higher-level uses

GenAI tools can do far more than just “fill in the form”. We found several valuable higher-level applications:

  • Inferring possible ADRs: By pasting in existing design artefacts, such as high level designs, solution documents, slide packs or code, we can prompt the AI to suggest ADRs that represent implicit decisions that should be made explicit. This helps us build an initial backlog of missed decisions. The result was a “to-do” list of one-line ideas, which we reviewed and used as kernels for the main ADR generator prompt.
  • Reviewing and quality-checking ADRs: With review-geared prompts, we applied quality checks to both AI-generated and human-written ADRs (a sketch of such a review prompt follows this list).
    – For example: does the ADR follow the expected structure, are the arguments cogent, is the tone appropriate?
  • Detecting inconsistencies and contradictions between ADRs: We also experimented with using a coding agent to review all project ADRs collectively to find inconsistencies and contradictions, though success was variable.
  • Suggesting reviewers: Ideally, design acceptance should be a formality because relevant stakeholders have been engaged since the start. To reinforce this, our prompt also suggests appropriate reviewers based on the ADR content. For example, highlighting when security, networking or data privacy roles might need to be involved.
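
As an illustration, a review-geared prompt along these lines might look like the sketch below; the checklist wording is ours and should be adapted to each client’s standards.

```markdown
You are reviewing an Architectural Decision Record (ADR) for quality.

Check the ADR pasted below against these criteria and report any problems you find:

- Structure: does it follow our template (identifier, title, Background, Alternatives, Decision)?
- Rigour: are the arguments cogent, and do the alternatives genuinely apply to this decision?
- Accuracy: do all references and links exist? Flag anything that looks invented.
- Tone: is it professional, succinct and written in UK English?
- Reviewers: which roles (e.g. security, networking, data privacy) should see this before acceptance?

ADR under review:
"<paste the ADR here>"
```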

What went well

The biggest win was speed: the sheer velocity of retrospective documentation. Once the metaprompt was refined and the decision statements collected, we could generate dozens of ADRs in a single morning. Manually writing each one and inventing lists of alternatives would have been far slower and far duller.

This freed us to focus on the main ideas and arguments, while the LLM handled the surrounding narrative and conventions.

The low tool barrier to entry (just a web chat interface) made it easy to share the technique across teams.

What didn’t go so well

ADRs are important design artefacts. If they’re wrong, they quickly become noise and can mislead future team members.

Our generated ADRs can have plenty of flaws, reinforcing that AI output is only useful as a first draft:

  • Hallucinated facts and references – The AI tools frequently hallucinated reference material, including non-existent APIs, web pages, or entire product features.
  • Mismatched justifications – At times, the suggested justifications didn’t fit the real decision context, often due to vague prompts or the model ignoring instructions.

We can mitigate this by telling modern models to “think hard” and “search the web for the latest”. We also add guardrails like:

  • “References MUST exist – check each reference to ensure that the link is valid.”
  • “DO NOT make up product features that don’t exist; think hard to ensure that any facts you refer to are in the actual product.”

These guardrails improve accuracy, but they aren’t foolproof. The key lesson is to check facts, check every reference link, and confirm that each ADR truly reflects what you meant to say – human review remains essential.

Under time pressure, it can be difficult to review many documents without falling into the trap where clear, logical, and convincing LLM output sounds so credible that it discourages proper scrutiny. A useful mitigation is to use a separate LLM prompt as a “judge” to critique generated ADRs for logical flaws before final human review.
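
One way to set this up, sketched below in our own words, is a judging prompt that deliberately plays devil’s advocate before the final human review.

```markdown
You are acting as a sceptical "judge" of an Architectural Decision Record.

Do NOT assume the ADR below is sound just because it is clearly written. Instead:

- Identify claims or justifications asserted without evidence
- Point out reasoning that sounds plausible but does not actually support the decision
- List any references, product features or facts you cannot verify
- Say whether the Alternatives section genuinely engages with realistic options

Finish with a short list of questions a human reviewer should ask before accepting this ADR.

ADR to judge:
"<paste the generated ADR here>"
```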

For us at Equal Experts, it’s also vital that clients know when AI is being used for tasks such as these. We’re transparent about what we’re doing, and aren’t secretly using AI tools to mimic human work, or generating “AI slop”. Building and maintaining this trust is vital. It’s not just about doing this work quickly. It’s about the value of building an accurate picture of project decisions and architecture that can be interrogated by humans and AI coding tools. Without AI tools, this can be infeasible at scale.

Our experiences also reinforced that writing the ADRs is only a small part of the overall process. Easing the friction of producing documentation means the bottleneck shifts to the less efficient parts of existing governance processes (such as finding time with key stakeholders, or getting through heavyweight review forums). This is a positive outcome – it opens the way to useful discussions about optimising those governance processes.

Conclusion and call-to-action

Using AI tools to quickly generate retrospective ADRs has proved incredibly valuable on some recent projects, and made the documentation side of delivery less tedious. It allowed our practitioners to focus on technical arguments and human review, rather than repetitive form-filling.

We believe that combining engineering discipline with the practical use of AI tools is essential for modern software delivery.

If you’re wrestling with implicit knowledge or a backlog of undocumented decisions, we encourage you to try this approach. Start with a one-line decision, craft a rigorous metaprompt, and see how fast you can turn a “kernel of truth” into a high-quality draft.

We’d love to hear about your own experiments. Share your learnings or get in touch with Equal Experts to compare notes on how you’re balancing speed and discipline in your own delivery practices.

Disclaimer

This blog post reflects the personal practices and opinions of the individual authors and is based on their experiences in various client engagements. It does not represent an Equal Experts-wide methodology or endorsement. Equal Experts is not affiliated with any specific tool or model mentioned (Gemini, ChatGPT, DBT, etc.).

About the authors

Thorben Louw is a Lead Consultant with Equal Experts. He is a data and software engineering specialist with over 20 years’ experience delivering scalable, pragmatic data and machine learning products across cloud platforms, and designing cloud data platforms in a variety of domains. He helps teams adopt modern data practices to rapidly build, test, and deliver value from their data.

Reda Hmeid is a Principal Consultant at Equal Experts and a trusted technology strategist and advisor to C-suite executives, architects and engineers alike. With over 26 years of experience, he helps organisations improve how technology decisions are made – shifting what “good” looks like from an architecture perspective further left in the delivery lifecycle. Reda now focuses on how AI aligned with technical knowledge and good practice can accelerate this shift, making life better for architects and engineers by enabling smarter, faster, and more informed decisions.

Paul Brabban is a Lead Consultant with Equal Experts, bringing over 24 years of experience in demanding engineering and leadership roles. He specialises in solving complex data-intensive problems at scale with lean, cost-effective methods and has a relentless focus on value. Paul’s experience covers six-person startups to multinationals in multiple industries including retail and financial services. He provides technical leadership on data strategy and execution, engaging with stakeholders up to director and C-suite level. Alongside Equal Experts, he shares his experience at tempered.works.
