Code of the Day
IntermediateApplied Systems

Designing with AI agents

Specs, review, and the human in the loop — directing agents at the scale of whole systems.

FundamentalsIntermediate10 min read
By the end of this lesson you will be able to:
  • Scope work into specs an agent can execute and you can verify
  • Run a review loop that keeps architectural judgement human
  • Decide what should never be delegated

The beginner lesson framed an agent as a fast, overconfident junior and taught the verify, don't trust habit. This one scales that up: how do you direct agents across a real system, where the work is bigger than a single function and the decisions carry architectural weight? The fundamentals you've built — decomposition, , architecture, testing — are exactly the tools.

Write a spec an agent can follow

Agents succeed in proportion to how well the work is scoped. A good spec, for an agent or a human, states:

  • The goal — what should be true when this is done.
  • The constraints — what to use, what to avoid, what conventions to match.
  • The contract — the inputs and outputs of the piece (interfaces again).
  • The definition of done — how success will be checked, ideally as tests.

This is decomposition wearing a different hat: you carve the system into pieces with clear contracts, and each piece becomes a spec an agent can execute.

Small, verifiable steps

Resist handing over a giant, vague goal. Break it into steps small enough that each produces a change you can actually read and check. A reviewable diff is a safe diff; a thousand-line diff hides bugs no matter who wrote it. Smaller steps also let you catch a wrong direction early, before it compounds.

The review loop

The rhythm that works:

  1. Specify a small, well-scoped piece.
  2. Let the agent implement it.
  3. Review — read the code, check the interfaces, run the tests, try an edge case (everything from the testing and reading-code lessons).
  4. Correct or accept, then move to the next piece.

Speed comes from the agent; correctness comes from the loop. The moment you stop reviewing, you stop understanding your own system — and you'll discover that at the worst possible time.

Keep judgement human

Some things you don't delegate:

  • Architecture — the boundaries and trade-offs that are expensive to reverse.
  • What "correct" means — the spec and the tests that define success.
  • The final accountability — an agent has no stake in the outcome; you do.

Delegate the typing and much of the implementation; keep the judgement.

Agents raise the ceiling for people with strong fundamentals and lower the floor for those without — because directing and reviewing them is the fundamentals, applied. That's the whole bet of this camp.

Where to go next

When something an agent (or you) wrote misbehaves, you need a method. Next: .

Finished reading? Mark it complete to track your progress.

On this page