Alex Yak

Software Engineer
AI

May 11, 2026

AI-driven engineering: A complete prompt workflow to deliver production-ready software

Can AI be used to build production-grade applications, reducing the engineering effort of a full team over several months down to a single engineer over a few weeks?

This is the question I’ve been exploring while working with a client in Australia to help them transform a “vibe-coded” prototype into a production-ready product as quickly and effectively as possible. With tech leaders everywhere under pressure to “do more with less”, the promise of increased efficiency with AI is tempting, but it can also carry risks of increased tech debt and unmaintainable, insecure code.

A complete audit of the app’s initial AI-generated code quickly revealed a long list of things that needed fixing before launch, and I wanted to see if we could lean into AI to solve some of these problems. The goal was to create a repeatable approach for taking an AI prototype through to a production app, while reducing the time and cost of the end-to-end process.

Auditing the vibe-coded prototype

With a complete AI prototype already built in Replit, we first had to decide whether to harden the prototype iteratively or rewrite it completely from scratch. Our investigations revealed that hardening might have been possible, but foundational issues across security, reliability and data governance left too many blockers on a quick path to production.

Instead, we determined that a fresh foundation would provide better results. By starting over with a structured process, we could define a clearer MVP, intentionally leaving out features to prioritise architectural integrity and a faster, safer path to production.

The solution: A prescriptive workflow approach

The idea was to take a spec-driven development approach to AI engineering: a complete workflow across discovery, design, planning, and build phases. Instead of jumping straight into development or relying on ad-hoc prompting, which can quickly devolve into messy and inconsistent code, each phase included multiple steps with specific prompt instructions for the AI to generate an artefact that is passed on to the next step.

Each step is also immediately reviewed using another prompt or series of prompts to assess the generated artefacts from different perspectives. As each step generates clear artefacts, human review can be inserted at any point, with an expert able to ensure the architecture, design, plan, or even implementation is going in the right direction.
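
To make that shape concrete, here is a minimal sketch of a generate-then-review step in Python. Everything in it is an assumption for illustration: `run_prompt` stands in for whatever agent call your tooling provides (a Claude Code invocation, an API client), and the field names are mine, not the project’s.

```python
from dataclasses import dataclass

def run_prompt(prompt: str) -> str:
    """Placeholder: wire this up to your agent or model of choice."""
    raise NotImplementedError

@dataclass
class Step:
    name: str
    generate_prompt: str        # template that produces this step's artefact
    review_prompts: list[str]   # each reviews the artefact from one perspective

def run_step(step: Step, upstream_artefact: str) -> str:
    """Generate an artefact, review it from each perspective, and return
    the revised artefact for the next step."""
    artefact = run_prompt(step.generate_prompt.format(input=upstream_artefact))
    for review in step.review_prompts:
        findings = run_prompt(review.format(artefact=artefact))
        # A human expert can inspect the artefact and findings here.
        artefact = run_prompt(
            f"Revise the artefact to address these findings:\n{findings}\n\n{artefact}"
        )
    return artefact
```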

The workflow follows a complete development process: requirements first, then architecture, then planning and build, followed by maintenance, backlog and issue organisation. All work is packaged into increments, which consist of epics, which in turn consist of tasks, while a similar prompting approach allows for ad-hoc, out-of-increment development of small tasks or issues.
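
As a rough sketch of that packaging (the names are illustrative, not the project’s), an increment is simply a container of epics, which are containers of tasks, with a separate backlog for ad-hoc work:

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    title: str
    done: bool = False

@dataclass
class Epic:
    title: str
    tasks: list[Task] = field(default_factory=list)

@dataclass
class Increment:
    name: str
    epics: list[Epic] = field(default_factory=list)

# Small out-of-increment fixes use the same prompting approach but are
# tracked as standalone backlog tasks rather than inside an increment.
backlog: list[Task] = []
```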

The complete process is illustrated in the diagram below, but it is continuously evolving.

[Diagram: AI-driven architecture and implementation workflow, with discover, design, plan and build stages.]

How the system works together

Simply adding Claude Code to your development process might not help a great deal or may even create new problems. For example, increasing code output by 10x without increasing code reviews or improving your testing posture will not result in success.

The approach is designed so that each part complements the others. If one prompt creates a specific problem, another prompt is engineered to solve it. Prompts create artefacts that are then reviewed by other prompts, with missing items collected and saved to the backlog by yet another set of prompts. Outputs can be verified at any time, with opportunities for manual checks throughout. It might seem like a lot, but every step has a carefully thought-out purpose.
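
For instance, the backlog-collection piece could look something like the sketch below, reusing the illustrative `run_prompt` stub from earlier; the prompt wording is my assumption, not the project’s actual text.

```python
def collect_gaps(artefact: str, backlog: list[str]) -> None:
    """Ask a dedicated prompt what the artefact deferred or missed, and
    save those items to the backlog so they are not silently lost."""
    gaps = run_prompt(
        "List any requirements, edge cases or follow-up work that the "
        "following artefact mentions but does not address. One item per "
        f"line.\n\n{artefact}"
    )
    backlog.extend(line.strip() for line in gaps.splitlines() if line.strip())
```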

While some AI frameworks give you a toolkit for better prompts or add scaffolding to existing processes, this workflow is more prescriptive. Each step exists to solve a specific problem in AI-driven development, and together the steps form a repeatable process that can be used as-is or modified to suit your context.

Key insights

Several key insights emerged during the development of this workflow. These are my own observations based on the work done, and are open to scrutiny.

Prompt engineering and tackling agent tendencies

The workflow doesn’t try to include security, usability, and performance in a single prompt. Instead, each prompt is carefully crafted to do exactly one thing, from one perspective, which produces better results.
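
As an illustration (the wording is mine, not the project’s actual prompts), two single-perspective review prompts might look like this:

```python
# One perspective per prompt, rather than one prompt that tries to cover
# security, usability and performance at once.
SECURITY_REVIEW = (
    "Review the following artefact strictly for security issues: injection, "
    "authentication and authorisation gaps, and unsafe data handling. "
    "Do not comment on style, performance or usability.\n\n{artefact}"
)

PERFORMANCE_REVIEW = (
    "Review the following artefact strictly for performance issues: unbounded "
    "growth, redundant queries, and work done in memory that belongs in the "
    "database. Do not comment on anything else.\n\n{artefact}"
)
```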

Prompts must also be engineered to correct for agent tendencies. For example, Claude Opus often treated everything as an MVP from the outset, favouring short-term implementation ease over long-term stability. In one instance, the AI decided that in-memory aggregation of data was good enough for now, even though that solution would fail once a few months of data had accumulated. Prompts have to be specifically engineered to disallow this MVP mentality when it is not desired.
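
As a hedged example of such a correction (the clause is illustrative, not the exact prompt used on the project), a constraint appended to build prompts might read:

```python
# Appended to build prompts to counter the "good enough for now" tendency.
NO_MVP_CLAUSE = (
    "Do not choose an implementation because it is easier for now. Assume "
    "the system must handle years of accumulated data. In-memory aggregation "
    "of growing datasets, unbounded lists and 'temporary' shortcuts are not "
    "acceptable. If a trade-off is unavoidable, state it explicitly rather "
    "than deciding silently."
)
```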

Good engineering is now more important, not less

The things that made a good engineering process pre-AI are still applicable and may even be more important now. Specialist code review, CI/CD and automated testing are now essential if you’re using AI to write, refactor or review your code.

AI agents are good at receiving feedback from tools and fixing the code until it passes, but they can also remove necessary code or tests just to make a process work. Techniques like starting each step with a fresh context window can help, but a human expert is still necessary in addition to the tooling and practices.
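
Continuing the earlier sketch, a fresh context per step can be approximated by passing forward only the artefact, never the accumulated conversation, assuming each `run_prompt` call starts a new session:

```python
def run_workflow(steps: list[Step], initial_input: str) -> str:
    """Run each step in a clean context: the agent sees only the artefact
    handed to it, not the history of how it was produced."""
    artefact = initial_input
    for step in steps:
        artefact = run_step(step, artefact)
    return artefact
```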

The necessity of a human expert

While it is possible that an AI system today could follow a workflow autonomously and come out with a clean result, without critical security, performance, or usability issues, this is not guaranteed. In the development of this workflow, the AI completed tasks well roughly 90% of the time, but when it failed in the remaining 10%, the results were severe, leaving behind stubbed tests or incomplete features. With review prompts built into the system, many of these issues were caught and automatically fixed by the AI, but it still repeatedly made poor decisions after review points.

The human expert is critical to the speed of implementation. When working with familiar technologies, an expert can drive the AI at maximum speed, with minimal intervention, as they can clearly review and verify the results. But when working with unfamiliar stacks, it becomes harder to trust the results and confirm some of the subtle decisions made by the AI. Just as an expert may have caught an error made by a junior developer in the past, experts now need to check the AI for any issues that could otherwise compound over time.

Trusting the AI output

There is a common lack of trust in AI-generated output. However, traditional engineering teams can also produce poorly structured code and bugs that slip through manual reviews. Generative AI is excellent at replicating industry-standard patterns, and while it may not be best-in-class compared to an expert human engineer, with the right prompts in the right system, it can enable teams to go faster with comparable results. The workflow approach takes trust into account, with multiple review prompts and opportunities for human review as well.

Maintaining codebase cohesion

As a codebase grows, keeping it cohesive becomes particularly important. A messy codebase is not an easy one for AI to work in: the AI will duplicate whatever patterns it finds, and if those patterns are poor, it will introduce more of the same. In this workflow, specific prompts ensure every new feature conforms to the rest of the codebase and avoids duplicating code.
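
A conformance prompt of this kind might look like the following sketch; as before, the wording is illustrative rather than the project’s actual prompt.

```python
# Run before new feature work so the agent extends existing patterns
# instead of duplicating them.
COHESION_REVIEW = (
    "Before writing new code, search the codebase for existing modules, "
    "patterns and helpers that already solve part of this task, and reuse "
    "or extend them. If a new pattern is genuinely needed, explain why the "
    "existing one does not fit. Flag any code you are about to duplicate."
    "\n\nTask:\n{task}"
)
```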

Adapting to your risk profile

Much of the AI workflow also depends on the risk profile of the individual and the organisation, with human reviews scaled up or skipped entirely depending on specific circumstances. High-sensitivity applications may require more human review at every stage, while low-risk applications may allow for more autonomous AI work. Regardless of risk profile, following a specific workflow is better than ad-hoc or one-shot prompting to maximise success.

Conclusion

My overall perspective is to approach AI-driven development practices as problems to be solved and then solve them appropriately for your use case. If the problem is that vibe-coding and one-shot prompting don’t get you consistent results at scale, then bake more prompt scaffolding and workflow into your process.

By shifting the focus from writing code to managing a high-fidelity generation process, with specific prompts, prescriptive workflows and regular reviews by human experts, engineering leaders can start to reduce time-to-production while maintaining the standards required for enterprise software.

About the author

Alex Yak is a Tech Lead based in Australia with over 15 years of software engineering experience across large enterprises, scaleups, and startups. Comfortable across both frontend and backend, Alex builds systems that solve real problems, improving product quality, user experience, and the way engineering teams work. AI is increasingly part of that toolkit – applied with the same commitment to quality that underpins the rest of the craft.
