My AI-Driven Development Workflow

2026/03/24

(No AI was used in the process of writing this article.)

I’ve grown to like my current development workflow with AI, enough that I thought I’d share it. This may all become ancient history by next month, but at least it will be a faithful snapshot of what worked well in the era of Opus 4.6.

The Workflow

Agent Loop

I have a (vibe-coded) Python loop running on a cloud VM that continuously polls a list of “resources” (Slack, GitLab, local git repo status, etc.) to detect content that is new since the last poll: messages, notes, dirty local state, and so on. When changes are detected, related resources are “linked” together via BFS to form a “work item”. For example, a work item might contain a GitLab issue, the Slack thread that created it, and the PRs associated with it. The loop then invokes an LLM agent called the “coordinator” to process the work item. The coordinator’s primary job is triage: deciding which subagent should handle the work item. Its decision is fed back to the outer Python loop, which invokes that subagent to do the work.
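The BFS linking step can be sketched roughly as follows. This is a minimal illustration, not the actual loop: the resource ids and the `edges` mapping (which resource references which) are hypothetical.

```python
from collections import deque

def link_work_items(resources, edges):
    """Group resources into work items via BFS over their links.

    `resources` is a list of resource ids (e.g. a Slack thread, a GitLab
    issue, a PR); `edges` maps each id to the ids it references. Every
    connected group becomes one work item.
    """
    seen = set()
    work_items = []
    for rid in resources:
        if rid in seen:
            continue
        item, queue = [], deque([rid])
        seen.add(rid)
        while queue:
            cur = queue.popleft()
            item.append(cur)
            for nxt in edges.get(cur, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        work_items.append(sorted(item))
    return work_items
```

So a Slack thread that spawned an issue, which in turn has a PR, collapses into a single work item, while an unrelated note stays its own item.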

For work items that require code changes (as opposed to e.g. simple Q&A), there is a predefined development lifecycle described by agent prompts. The coordinator first creates a GitLab issue for the requested change; a “planner” agent then plans, addresses feedback, and re-plans until the plan is approved; an “implementer” agent implements the plan and creates a PR; and finally a “shepherd” agent addresses CI failures and review comments (both human and AI reviews) until the PR is approved, then merges it.
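The lifecycle above is essentially a small state machine. A hypothetical sketch, with the looping stages noted in comments (stage and field names are mine, not from the real prompts):

```python
# Each stage names the agent that runs it and the stage to advance to once
# that agent's exit condition is met. "plan" and "shepherd" internally loop
# (re-plan on feedback, re-push on CI failures) before advancing.
LIFECYCLE = {
    "triage":    {"agent": "coordinator", "next": "plan"},
    "plan":      {"agent": "planner",     "next": "implement"},
    "implement": {"agent": "implementer", "next": "shepherd"},
    "shepherd":  {"agent": "shepherd",    "next": "done"},
}

def advance(stage: str) -> str:
    """Return the next lifecycle stage, or 'done' when finished."""
    return LIFECYCLE[stage]["next"] if stage in LIFECYCLE else "done"
```

The outer Python loop only needs to know the current stage of each work item; everything stage-specific lives in the agent prompts.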

The human interacts with agents through Slack and GitLab. Usually I start the workflow from a Slack message, review plans in Slack (plans are pleasant enough to read there), and review code changes in GitLab.

The loop runs during working hours only, just to help me keep my sanity after work.

Backend

We have a mono-repo setup built on top of regular git repos, so each work item gets its own “super-worktree”, internally consisting of worktrees of the individual git repos. This also serves as a standalone Bazel workspace. Once the agent invocation for a work item completes, we update local state to record all resources associated with the work item as done. Each work item is processed single-threaded, but different work items can be dispatched in parallel, up to a configurable concurrency level. I have it set to 2 for now, which already seems like plenty since my day job isn’t about building vast greenfield projects.
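The parallel-dispatch part maps naturally onto a thread pool. A minimal sketch (the `handle` callable standing in for the whole per-item agent invocation is hypothetical):

```python
from concurrent.futures import ThreadPoolExecutor

def dispatch(work_items, handle, max_concurrency=2):
    """Process work items in parallel, up to max_concurrency at a time.

    Each item runs single-threaded inside `handle` (which would set up the
    super-worktree, invoke the agent, and mark resources done); only the
    dispatch across items is parallel.
    """
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        return list(pool.map(handle, work_items))
```

`pool.map` preserves input order in its results, which keeps bookkeeping simple even when items finish out of order.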

Different agent types can be configured to use different LLM backends: e.g. Claude Code with Opus for the planner, Cursor CLI with GPT 5.4 for the shepherd, Pi with Kimi 2.5 for the implementer, and so on. This is a crucial cost-saving measure and lets you pick the exact intelligence level a task needs. I might stretch this further and let the coordinator choose the subagent’s model and backend type dynamically, based on the user request. Backends also do usage/cost accounting and report usage stats periodically.
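A per-agent backend table might look like the following. This is a hypothetical sketch; the key/field names are mine, and only the backend/model pairings come from the setup described above.

```python
# Hypothetical agent-type -> backend configuration, as might be loaded from
# a plain JSON file. Pairings mirror the examples in the text.
AGENT_BACKENDS = {
    "planner":     {"backend": "claude-code", "model": "opus"},
    "shepherd":    {"backend": "cursor-cli",  "model": "gpt-5.4"},
    "implementer": {"backend": "pi",          "model": "kimi-2.5"},
}

def backend_for(agent, default=("claude-code", "opus")):
    """Look up the (backend, model) pair for an agent type."""
    cfg = AGENT_BACKENDS.get(agent)
    return (cfg["backend"], cfg["model"]) if cfg else default
```

Letting the coordinator pick dynamically would just mean computing the key at triage time instead of reading it from this static table.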

Summary

That’s it! The entire thing is maybe 5000 lines of Python plus command line tools like glab. No uv or pip, just stdlib. No databases either, just plain JSON files. The aforementioned local state is mostly just “cursors” recording the latest completed timestamp of each resource.
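The cursor state described above fits in a few lines of stdlib-only Python. A sketch under the assumption that cursors are a flat resource-to-timestamp JSON object (file layout and function names are hypothetical):

```python
import json
from pathlib import Path

def load_cursors(path: Path) -> dict:
    """Read per-resource cursors (latest completed timestamp) from JSON."""
    return json.loads(path.read_text()) if path.exists() else {}

def mark_done(path: Path, resource: str, timestamp: float) -> None:
    """Advance a resource's cursor monotonically and persist it."""
    cursors = load_cursors(path)
    cursors[resource] = max(timestamp, cursors.get(resource, 0.0))
    path.write_text(json.dumps(cursors, indent=2))
```

On the next poll, anything newer than a resource's cursor counts as new content; taking the `max` keeps the cursor from moving backwards if items complete out of order.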

Discussion

This is effectively the traditional agile-like SDLC integrated with agents. You start with a ticket, pick it up, create a small plan, implement it, then get it reviewed and shipped. It works well for me.

As a result of this workflow, I have now abandoned almost all JIRA-board equivalents for my small team. Every requirement and idea that comes my way goes straight to the agent DM, which helps with triage. Issues assigned to the agent are the new team JIRA board. Simple changes are handled directly by the agent; more intricate ones sit there until a human has time to pick them up.

Of course this hasn’t replaced the need for humans. Humans plan milestones and roadmaps, and take on larger-scope projects that require more context, direction, and cross-team collaboration. They also guide agents on changes and review their (frequently still subpar) output. Agents handle the grunt work.

Why not use OpenClaw? Several reasons:

The agent loop is currently single-user-only, acting like a personal developer team. The next step beyond this is probably more collaboration among agents and other developers: have the agent seek approval from humans in public Slack channels, allow humans to assign tasks to other people’s agents, let agents propose new work autonomously overnight, and so on. I’m sure other companies are already experimenting further, so this is only one part of the solution.