Designing with AI agents

Specs, review, and the human in the loop — directing agents at the scale of whole systems.

The beginner lesson framed an agent as a fast, overconfident junior and taught the "verify, don't trust" habit. This one scales that up: how do you direct agents across a real system, where the work is bigger than a single function and the decisions carry architectural weight? The fundamentals you've built (decomposition, interfaces, architecture, testing) are exactly the tools.

Write a spec an agent can follow

Agents succeed in proportion to how well the work is scoped. A good spec, for an agent or a human, states:

The goal — what should be true when this is done.
The constraints — what to use, what to avoid, what conventions to match.
The contract — the inputs and outputs of the piece (interfaces again).
The definition of done — how success will be checked, ideally as tests.

This is decomposition wearing a different hat: you carve the system into pieces with clear contracts, and each piece becomes a spec an agent can execute.
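
As a rough illustration, the four parts of a spec can be written down as a small data structure, which makes it easy to notice when one is missing before handing the work over. This is only a sketch: the Spec class, its field names, and the slugify example are hypothetical and not part of any particular agent tool.

```python
from dataclasses import dataclass


@dataclass
class Spec:
    """One agent-sized piece of work. Field names are illustrative only."""

    goal: str                      # what should be true when this is done
    constraints: list[str]         # what to use, what to avoid, conventions to match
    contract: str                  # inputs and outputs of the piece
    definition_of_done: list[str]  # how success will be checked, ideally tests

    def missing_parts(self) -> list[str]:
        """Return the names of any parts that were left empty."""
        return [name for name, value in vars(self).items() if not value]


# Hypothetical example: the function and test names below are made up.
spec = Spec(
    goal="slugify(title) returns a URL-safe slug for any article title",
    constraints=["standard library only", "match the existing utils module style"],
    contract="slugify(title: str) -> str",
    definition_of_done=["test_slugify_lowercases", "test_slugify_strips_punctuation"],
)

# A vague goal with nothing else filled in is not yet a spec.
draft = Spec(goal="make search faster", constraints=[], contract="", definition_of_done=[])

print(spec.missing_parts())
print(draft.missing_parts())
```

The first spec has every part filled in; the draft names only a goal, so the check lists the parts still to be written before an agent could reasonably execute it.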

Small, verifiable steps

Resist handing over a giant, vague goal. Break it into steps small enough that each produces a change you can actually read and check. A reviewable diff is a safer diff; a thousand-line diff can hide bugs no matter who wrote it. Smaller steps also let you catch a wrong direction early, before it compounds.
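
One way to make "small enough to review" concrete is to count the changed lines in a unified diff and flag anything over a limit. The sketch below assumes the diff is available as text; the function names and the 200-line threshold are arbitrary choices for illustration, not a recommended standard.

```python
def changed_lines(diff_text: str) -> int:
    """Count added and removed lines in a unified diff.

    Simplification: any line starting with '+++' or '---' is treated as a
    file header, so a changed line whose content begins with '--' or '++'
    would be skipped.
    """
    count = 0
    for line in diff_text.splitlines():
        if line.startswith(("+++", "---")):
            continue  # file header, not a content change
        if line.startswith(("+", "-")):
            count += 1
    return count


def is_reviewable(diff_text: str, max_changed: int = 200) -> bool:
    """True if the diff is within the (assumed) size limit for one review."""
    return changed_lines(diff_text) <= max_changed


SAMPLE_DIFF = """\
--- a/app.py
+++ b/app.py
@@ -1,3 +1,4 @@
 import os
-print("hi")
+print("hello")
+print("world")
"""

print(changed_lines(SAMPLE_DIFF), is_reviewable(SAMPLE_DIFF))
```

Line count is a crude proxy: a ten-line change to a core interface may deserve more scrutiny than a hundred lines of mechanical renaming.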

The review loop

The rhythm that works:

Specify a small, well-scoped piece.
Let the agent implement it.
Review — read the code, check the interfaces, run the tests, try an edge case (everything from the testing and reading-code lessons).
Correct or accept, then move to the next piece.

Speed comes from the agent; correctness comes from the loop. The moment you stop reviewing, you stop understanding your own system — and you'll discover that at the worst possible time.
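
The rhythm above can be sketched as a loop in which nothing is accepted until review passes. Everything here is a stand-in: implement and review are hypothetical callables (in practice the first is an agent call and the second is you reading the code and running the tests), and the attempt limit is an assumption.

```python
from typing import Callable


def run_review_loop(
    pieces: list[str],
    implement: Callable[[str, str], str],
    review: Callable[[str, str], str],
    max_attempts: int = 3,
) -> dict[str, str]:
    """Work through pieces in order, accepting each only after review passes.

    review returns "" to accept, or feedback text to request a correction.
    """
    accepted: dict[str, str] = {}
    for piece in pieces:
        feedback = ""
        for _ in range(max_attempts):
            change = implement(piece, feedback)   # let the agent implement it
            feedback = review(piece, change)      # read, run tests, try an edge case
            if not feedback:
                accepted[piece] = change          # accept, move to the next piece
                break
        else:
            # Repeated failures suggest the spec, not the agent, needs work.
            raise RuntimeError(f"{piece!r} still failing review; revisit the spec")
    return accepted


def fake_agent(piece: str, feedback: str) -> str:
    # Stand-in for an agent: its first attempt forgets the empty-input case.
    return f"{piece} (handles empty input)" if feedback else piece


def fake_review(piece: str, change: str) -> str:
    # Stand-in for human review: checks one edge case.
    return "" if "handles empty input" in change else "add the empty-input case"


result = run_review_loop(["parse_config", "load_users"], fake_agent, fake_review)
print(result)
```

The loop stops with an error rather than accepting unreviewed work, which mirrors the point of the section: the agent supplies speed, and the review step is what makes the result trustworthy.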

Keep judgement human

Some things you don't delegate:

Architecture — the boundaries and trade-offs that are expensive to reverse.
What "correct" means — the spec and the tests that define success.
The final accountability — an agent has no stake in the outcome; you do.

Delegate the typing and much of the implementation; keep the judgement.

Agents raise the ceiling for people with strong fundamentals and lower the floor for those without — because directing and reviewing them is the fundamentals, applied. That's the whole bet of this camp.

Where to go next

When something an agent (or you) wrote misbehaves, you need a method. Next: debugging as a discipline.
