Designing with AI agents
Specs, review, and the human in the loop — directing agents at the scale of whole systems.
- Scope work into specs an agent can execute and you can verify
- Run a review loop that keeps architectural judgement human
- Decide what should never be delegated
The beginner lesson framed an agent as a fast, overconfident junior and taught the "verify, don't trust" habit. This one scales that up: how do you direct agents across a real system, where the work is bigger than a single function and the decisions carry architectural weight? The fundamentals you've built (decomposition, interfaces, architecture, testing) are exactly the tools you need.
Write a spec an agent can follow
Agents succeed in proportion to how well the work is scoped. A good spec, for an agent or a human, states:
- The goal — what should be true when this is done.
- The constraints — what to use, what to avoid, what conventions to match.
- The contract — the inputs and outputs of the piece (interfaces again).
- The definition of done — how success will be checked, ideally as tests.
This is decomposition wearing a different hat: you carve the system into pieces with clear contracts, and each piece becomes a spec an agent can execute.
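One way to make the four parts of a spec concrete is to write them down as data, with the definition of done expressed as checks that run against whatever implementation comes back. The sketch below is illustrative only: the `Spec` class, the `failed_checks` helper, and the `slugify` task are hypothetical names invented for this example, not part of any agent tool.

```python
import re
from dataclasses import dataclass
from typing import Callable

# The piece being specified takes a string and returns a string.
Impl = Callable[[str], str]


@dataclass(frozen=True)
class Spec:
    """Hypothetical container for the four parts of a spec."""
    goal: str                                # what should be true when done
    constraints: tuple[str, ...]             # what to use, avoid, or match
    contract: str                            # inputs and outputs of the piece
    done: dict[str, Callable[[Impl], bool]]  # definition of done, as checks


def failed_checks(spec: Spec, impl: Impl) -> list[str]:
    """Return the names of the done-checks that the implementation fails."""
    return [name for name, check in spec.done.items() if not check(impl)]


# Example spec for a small, well-scoped piece (an assumed task).
slugify_spec = Spec(
    goal="Article titles become URL-safe slugs.",
    constraints=("standard library only", "no global state"),
    contract="slugify(title: str) -> str",
    done={
        "lowercases": lambda f: f("Hello") == "hello",
        "joins words with hyphens": lambda f: f("Hello World") == "hello-world",
        "drops punctuation": lambda f: f("Hi, there!") == "hi-there",
        "empty stays empty": lambda f: f("") == "",
    },
)


def candidate_slugify(title: str) -> str:
    """Stand-in for what an agent might hand back for this spec."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


if __name__ == "__main__":
    # An empty list means every done-check passed.
    print(failed_checks(slugify_spec, candidate_slugify))
```

The point of the design is that the checks are written by you before the agent starts, so "done" is decided by the spec rather than by the agent's own report.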
Small, verifiable steps
Resist handing over a giant, vague goal. Break it into steps small enough that each produces a change you can actually read and check. A reviewable diff is a safer diff; a thousand-line diff hides bugs no matter who wrote it. Smaller steps also let you catch a wrong direction early, before it compounds.
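If you want a mechanical reminder to keep steps small, a rough size check on the diff is enough. The sketch below counts added and removed lines in unified-diff text and compares the total to a budget. The function names and the default budget of 200 lines are assumptions made for illustration; pick a number that matches what you can genuinely read in one sitting.

```python
def changed_lines(diff_text: str) -> int:
    """Count added and removed lines in a unified diff.

    Simplification: any line starting with '+++' or '---' is treated as a
    file header, so a changed line whose own content starts with '++' or
    '--' would be skipped.
    """
    count = 0
    for line in diff_text.splitlines():
        if line.startswith(("+++", "---")):
            continue  # file header, not a change
        if line.startswith(("+", "-")):
            count += 1
    return count


def is_reviewable(diff_text: str, budget: int = 200) -> bool:
    """True when the diff is within the (assumed) review budget."""
    return changed_lines(diff_text) <= budget


SAMPLE_DIFF = """\
--- a/app.py
+++ b/app.py
@@ -1,3 +1,4 @@
 import os
-def run():
+def run(debug=False):
+    print("starting")
     pass
"""

if __name__ == "__main__":
    print(changed_lines(SAMPLE_DIFF), is_reviewable(SAMPLE_DIFF))
```

A line count is only a proxy. A short diff can still be wrong, so this check flags work that needs splitting; it does not replace reading the change.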
The review loop
The rhythm that works:
- Specify a small, well-scoped piece.
- Let the agent implement it.
- Review — read the code, check the interfaces, run the tests, try an edge case (everything from the testing and reading-code lessons).
- Correct or accept, then move to the next piece.
Speed comes from the agent; correctness comes from the loop. The moment you stop reviewing, you stop understanding your own system — and you'll discover that at the worst possible time.
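The rhythm above can be sketched as a small loop. Everything here is a stand-in: `implement` represents whatever agent you use, `review` represents your own reading plus the tests, and `run_review_loop`, `Review`, and the cap of three rounds are hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Review:
    accepted: bool
    feedback: str = ""


def run_review_loop(
    pieces: list[str],
    implement: Callable[[str, str], str],  # (piece, feedback) -> change
    review: Callable[[str, str], Review],  # (piece, change) -> verdict
    max_rounds: int = 3,
) -> dict[str, str]:
    """Specify, implement, review, then correct or accept, one piece at a time."""
    accepted: dict[str, str] = {}
    for piece in pieces:
        feedback = ""
        for _ in range(max_rounds):
            change = implement(piece, feedback)  # the agent does the typing
            verdict = review(piece, change)      # the human keeps the judgement
            if verdict.accepted:
                accepted[piece] = change
                break
            feedback = verdict.feedback          # correct, then try again
        else:
            # Repeated rejection usually means the piece was scoped badly.
            raise RuntimeError(
                f"{piece!r} not accepted after {max_rounds} rounds; rescope it"
            )
    return accepted


# Fake agent and reviewer, used only to exercise the loop.
def fake_agent(piece: str, feedback: str) -> str:
    return f"{piece}: fixed" if feedback else f"{piece}: draft"


def fake_review(piece: str, change: str) -> Review:
    if change.endswith("fixed"):
        return Review(accepted=True)
    return Review(accepted=False, feedback="handle the empty-input case")


if __name__ == "__main__":
    print(run_review_loop(["parse", "render"], fake_agent, fake_review))
```

The loop stops with an error when a piece keeps failing review. That is deliberate: it pushes the decision back to you instead of letting the agent retry indefinitely.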
Keep judgement human
Some things you don't delegate:
- Architecture — the boundaries and trade-offs that are expensive to reverse.
- What "correct" means — the spec and the tests that define success.
- The final accountability — an agent has no stake in the outcome; you do.
Delegate the typing and much of the implementation; keep the judgement.
Agents raise the ceiling for people with strong fundamentals and lower the floor for those without — because directing and reviewing them is the fundamentals, applied. That's the whole bet of this camp.
Where to go next
When something an agent (or you) wrote misbehaves, you need a method. Next: debugging as a discipline.