Gorewood Logs

Council of Claudes

I've started spawning councils.

When I hit an architecture decision with genuine trade-offs—not "which library" but "what shape should this system take"—I don't ask Claude what it thinks. I make Claude argue with itself.

The pattern is inspired by Karpathy's llm-council—multi-perspective deliberation through role-assigned LLMs. I adapted it into a council skill for Agent Teams, Claude Code's experimental multi-agent feature that lets teammates communicate directly with each other.

I ran a council this week on how to capture richer context in Timbers, a new tool I've been building. The proposal seemed straightforward: add a notes field for agents to record their reasoning process. I spawned five councilors—an Advocate to champion the feature, a Skeptic to find holes, a Pragmatist to ground the discussion in what's actually shippable, and a Lead to moderate.

Then something unexpected happened. The Skeptic initially wanted structured sub-fields: "## Alternatives considered. ## Surprise." The Advocate argued that structured input produces listicle source material that templates have to un-listicle. The reframe: coaching should use questions ("What did you try that didn't work?") not headers. The Skeptic's objection narrowed to a single concrete ask—include BAD/GOOD examples in the coaching text. That landed. Vote: 3-0, ship it.

I didn't walk in expecting a debate about the shape of coaching. I expected a debate about whether the feature was worth building. The council found a different question to answer—and answered it better than I would have alone.

The key: diversity comes from different analytical frames, not different models. I'm not running GPT against Claude against Gemini hoping their training differences produce disagreement. I'm giving each Claude a distinct role with specific constraints. The Skeptic's job is to find problems. The Advocate's job is to defend possibilities. The tension is structural, not accidental.

When one Claude just agrees with another Claude, something's wrong—either the decision is trivial (why are you running a council?) or the roles need sharpening. Immediate consensus means reframe the question.

Is it expensive? Yes. Each councilor gets their own context window. A full deliberation burns tokens the way design reviews burn calendar time—meaningful cost for meaningful decisions. Use it rarely, for decisions that warrant multiple angles.

But when I'm about to make an architectural choice I'll live with for months—or even days, which to agents might as well be months—the cost of a council is noise. The cost of a wrong decision is real.

The absurdity isn't lost on me. I'm paying for AI to argue with itself because I've learned that one AI agreeing with me is less valuable than multiple AIs disagreeing with each other.

When one Claude agrees too easily, spawn more Claudes.

#ai-development #claude #vibe-coding #workflow