For a client, we were recently looking at a tangible way to introduce AI into the team to provide benefits such as increased speed of delivery. We were building out a large data platform, enabling self-serve capabilities for data product teams.
The idea
Inspired by a colleague’s blog, we liked the idea of working collaboratively with our AI agent to help us deploy the infrastructure for our solution. We started with three key questions:
- How can we ensure that our AI agent understands the problem statement we’re trying to solve?
- Could we rationalise our thinking with the agent to ensure we’re aligned on our plan of implementation? Were there alternatives we should consider?
- How could we do all of this whilst gaining the benefits of pairing?
Who better to ask than our AI agent?
We worked with Copilot and asked what information it felt it needed, provided as much context as we could, and broke the problem statement down into small steps. Copilot then created a folder structure for us that looked roughly like this:
```
root_folder
├── ai-collaboration
│   ├── context-loader.MD
│   └── ai-session-log.MD
├── architecture
│   ├── architecture.MD
│   └── decision-records
├── guidelines
│   └── guidelines.MD
├── implementation-plans
│   └── implementation-plan-1.MD
├── problem-statements
│   └── problem-statement.MD
├── rules
│   └── rules.MD
└── ai-collaboration-protocol.MD
```
We switched drivers and prompted Copilot to read the context-loader.MD file. This markdown file contained instructions to read all of the files in the other directories to ensure Copilot was in the same headspace we were.
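To give a flavour of this, a context loader along these lines captures the intent. This is an illustrative sketch rather than our exact file, and the paths simply follow the folder structure shown above:

```markdown
# Context Loader

Before doing anything else, read the following files to load the full
project context. Do not propose any changes until you have read them all.

1. `problem-statements/problem-statement.MD` – what we are trying to solve
2. `architecture/architecture.MD` – the agreed target architecture
3. `architecture/decision-records/` – decisions already made (do not revisit)
4. `guidelines/guidelines.MD` and `rules/rules.MD` – how we work together
5. `implementation-plans/implementation-plan-1.MD` – the current plan
6. `ai-collaboration/ai-session-log.MD` – what previous sessions have done

Confirm which step of the implementation plan we are on before proceeding.
```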
Copilot listed the major steps we’d broken down and asked if we wanted to proceed to step 2.
- Step 1: Define problem statement and break it down into smaller steps, define AI collaboration protocol and enable AI pairing between team members
- Step 2: Deploy the infrastructure required to serve the needs of the problem statement
- Step 3: Verify our infrastructure was operating as expected
- Step 4: Have an end user verify they could use the interface provided by our infrastructure
Note: Our rules included breaking work down into smaller steps and deploying a single resource at a time. This ensured us mere mortals could understand and easily review what was going on and manage our cognitive load.
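For illustration, a rules file in this spirit might look like the sketch below. The first two rules are the ones described above; the others are plausible additions rather than our exact wording:

```markdown
# Rules

- Break all work down into small, easily reviewable steps.
- Deploy a single resource at a time; wait for human review before moving on.
- Explain what each change does and why before applying it.
- If unsure about anything, ask rather than assume.
```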
Iterations with AI
In step 2 we agreed an architecture with Copilot, which then proposed the order in which to implement the resources.
At this point, we went through a few iterations of deployments (running Terraform locally) and pairing sessions, and hit a few teething issues. These included old Terraform provider versions, overly verbose markdown files, manually loading our rules into each new chat, and hallucinations that took us off our desired path.
We resolved all of these with new rules and a Copilot instructions file.
A non-negotiable we developed was that after every 4 resource deployments we created a new chat with Copilot so that it could start with a clear context window.
This discipline became essential to our success and rapid development; without it, Copilot inevitably pushed us towards what felt like vibe coding and a lack of proper engineering practices.
These improvements resulted in the following configuration:
```
root_folder
├── .github
│   └── copilot-instructions.MD
└── ai-collaboration
    ├── ai-collaboration-protocol.MD
    ├── ai-session-log.MD
    └── ai-context.MD
```
These files were drastically condensed compared to their initial counterparts, whilst still providing all the key information, from problem statement to implementation plan through to decision records.
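As a sketch of what the condensed setup might contain, a Copilot instructions file along these lines would encode the rules we had learned from those teething issues. The wording below is illustrative, not our exact file:

```markdown
# Copilot Instructions

- Read `ai-collaboration/ai-context.MD` before suggesting any changes.
- Deploy one resource at a time; never batch changes.
- Use the Terraform provider versions pinned in this repository, not
  versions remembered from training data.
- Keep any generated markdown concise: key facts only, no filler.
- Append a short summary of each completed step to
  `ai-collaboration/ai-session-log.MD`.
```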
In a flow
We got into a flow with Copilot. Define resource -> Deploy resource -> Resolve any issues & redeploy if necessary -> Define next resource, etc.
We did this for a maximum of 4 resources in a single chat session, at which point we’d update the AI session log and create a new chat session.
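An entry in the session log might look roughly like this. The resource names and issue here are hypothetical, for illustration only:

```markdown
## Session N
- Deployed: storage account, private endpoint, key vault, role assignment
- Issues: provider version mismatch on the private endpoint; pinned the
  provider version and redeployed successfully
- Next: verify the infrastructure is operating as expected (step 3)
```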
This was a super quick feedback loop. Copilot was suggesting solutions before the error message had finished printing in our console. However, we believed it important to keep the human in the loop, so we ensured we always understood the issue and the solution suggested by Copilot before proceeding.
With this rapid feedback loop, the MVP we’d given ourselves 2 weeks to deploy (based on a rough estimate from previous similar undertakings) was deployed and tested in 4 days.
Our key learnings
- Pairing whilst treating AI as a 3rd person in the group is not just possible, but effective.
- Creating a fast feedback loop when implementing and executing small units of code along with Copilot is super powerful and can provide a rapid development mechanism, enabling the human to stay in the loop the whole time.
- Claude Sonnet 3.5 has a small context window. Keeping changes small really helps with not filling this up and benefits the understanding of the human in the loop. Too much documentation also fills this context window up; make sure only the necessary information is loaded into the agent’s context.
- Discipline in starting a new interaction with Copilot when the context window fills up significantly reduces the likelihood of the AI hallucinating or going off the desired path. Many tools now indicate how full the context window is, or can be prompted for this information, which helps with this.
- Keeping a human in the loop was valuable to us as we gained confidence in the code and our process and were able to demo our code and configuration to the wider team.
Ultimately, what we’ve done here isn’t groundbreaking. We used common software engineering principles and equipped AI to accelerate our development. There’s a whole host more that could be improved here, such as reducing the information that’s loaded into the context window, and reducing the tokens used by the AI to help reduce spend and increase speed.
What we hope this provides is a tangible example that others can learn from, and an accessible way for people to dip their toes into the AI world.
Dom is a Technical Consultant at Equal Experts with over 10 years of experience in platform engineering and cloud architecture. He specialises in designing and building scalable, resilient systems and teams that prioritise and deliver key business value. Dom’s current focus is utilising generative AI in data platforms, with a focus on solving real business problems with measurable outcomes for accelerated delivery whilst keeping high-performing teams empowered.