agentic-coding · 2026-05-17

Conductor v8.1.0 — How 13 plugins become an orchestra

conductor · agentic-coding · claude-code · plugins · indie-builder

TL;DR

Conductor v8.1.0 is my production agentic coding stack: 13 specialized Claude Code plugins, 55+ commands, and one orchestrator plugin acting as the baton. I run 8 parallel projects across 424 repositories with it. After 8 major versions in 2 years it is no longer an experiment; it is the backbone. This post explains the architecture, what has changed since v1.0, and which three plugins move the needle most: conductor, auto-test, and finalizer. The stack itself is private (repo moinsen-dev/claude_code_conductor); the plugin patterns are portable.

What is Conductor?

Conductor is a plugin ecosystem for Claude Code, Anthropic’s official CLI for coding agents. Anthropic documents plugins as “scoped sets of commands, agents, hooks, and MCP servers”, which are the extension points the October 2025 Claude Code plugin launch made public. Conductor uses all of them.

The idea: one plugin per phase of the software lifecycle. The PRD phase has its own commands, design has its own, tests have their own, release has its own. Instead of maintaining a monolithic mega-prompt, each step is a plugin with its own context and its own sub-agents.

After 8 major versions — v1.0 ran in May 2024, v8.1.0 runs in May 2026 — I have learned: granularity is the architecture. A plugin that is too coarse loses context; a plugin that is too fine costs more orchestration overhead than it saves.

The 13 plugins, sorted by leverage

Plugin | What it does | When I use it
--- | --- | ---
conductor | 55+ commands for PRD, UX, UI, API and DB design | Phase 1-3 of every project
customer | Lifecycle workflows: proposals, UAT, onboarding | As soon as a customer is involved
auto-test | Test generation + maintenance in parallel with implementation | After every feature-flag-ready commit
finalizer | Release preparation: changelog, version bump, GitHub release | Before every push to main
upgrade-pilot | Dependency and version maintenance | Weekly, automated via cron
dart-lsp | Flutter/Dart-specific LSP integration | Permanent in Flutter repos
git-pilot | Git workflow automation (branches, merges, conflicts) | On every non-trivial commit
knowhow | Project-internal knowledge management | When returning to a project after 2+ weeks
brand-pilot | Brand-consistent copy and assets | Before every public release
marketing-pilot | Release-to-marketing pipeline (Show HN, LinkedIn, newsletter) | Right after a release cut
3 more | Experimental, in conductor-studio and claude-code-conductor separately | Project-dependent

The table is sorted by value per hour: conductor and auto-test save the most time, while marketing-pilot only pays off at release time.

Why 13 instead of 1?

Three reasons, all learned the hard way.

1. The context window is a bottleneck

Claude Code with Claude Opus 4.7 currently has a 1-million-token context (Anthropic, Claude models overview, as of February 2026). That sounds like a lot — until you open a 100k-line Flutter project and want to keep all relevant files in scope. Plugins enable scoped contexts: the auto-test plugin loads only test files and the sources under test, not the whole repo. This is the only way to keep working in a project with 800+ files without responses getting slower and fuzzier over time.

Anthropic explicitly recommends this approach: “Plugins offer a clean way to bundle and share extensions in formats you already use” (Mike Krieger, Chief Product Officer, Anthropic, October 2025).
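To make the scoping idea concrete, here is a minimal sketch of how a test-focused plugin could narrow a repo down to test files plus the sources they cover. This is not Conductor's actual code. The naming convention (tests ending in `_test.dart`, sources under `lib/` with the same stem) and the function name `scoped_files` are assumptions chosen for illustration.

```python
from pathlib import PurePosixPath

TEST_SUFFIX = "_test.dart"  # assumed naming convention, not taken from Conductor


def scoped_files(repo_files: list[str]) -> list[str]:
    """Return only the test files plus the lib/ sources that share their stem."""
    tests = [f for f in repo_files if f.endswith(TEST_SUFFIX)]
    # "test/cart_test.dart" -> "cart"
    stems = {PurePosixPath(t).name.removesuffix(TEST_SUFFIX) for t in tests}
    sources = [
        f for f in repo_files
        if f.startswith("lib/") and PurePosixPath(f).stem in stems
    ]
    return sorted(tests + sources)


files = ["lib/cart.dart", "lib/checkout.dart", "test/cart_test.dart", "README.md"]
print(scoped_files(files))
```

Everything outside the returned list stays out of the plugin's context, which is the point: the budget goes to the files the task actually touches.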

2. Sub-agents beat the mega-prompt

Anthropic’s official documentation (Engineering Blog, “Effective Context Engineering for AI Agents”, September 2025) is direct: “Sub-agents inherit fresh contexts and tools.” A Conductor plugin is exactly that — a sub-agent with a defined purpose, prompt, and toolbox. When auto-test writes a test it does not know what conductor did with the PRD two steps earlier. That is a feature, not a bug: less cross-talk, clearer responsibility.

3. Hooks force discipline

Claude Code supports lifecycle hooks such as PreToolUse, PostToolUse, and Stop, and plugins can bundle them. Conductor uses finalizer hooks before every push to main: if the build is red or tests are missing, the hook blocks the push. This is the only way I have found to run 8 parallel projects without constantly shipping production bugs.
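A quality gate of this kind can be sketched in a few lines. The following is an illustration, not the finalizer plugin itself. It assumes the hook receives the pending tool call as a JSON object with `tool_name` and `tool_input` fields and that exit code 2 tells Claude Code to block the call; both assumptions should be checked against the current hooks reference. The function names and the two boolean flags are hypothetical.

```python
BLOCK_EXIT_CODE = 2  # assumption: exit code 2 means "block this tool call"


def is_push_to_main(event: dict) -> bool:
    """True if the pending tool call looks like `git push ... main`."""
    if event.get("tool_name") != "Bash":
        return False
    tokens = event.get("tool_input", {}).get("command", "").split()
    return tokens[:2] == ["git", "push"] and "main" in tokens[2:]


def gate(event: dict, build_green: bool, tests_green: bool) -> tuple[int, str]:
    """Return (exit_code, message) for a PreToolUse-style event."""
    if not is_push_to_main(event):
        return 0, ""
    problems = []
    if not build_green:
        problems.append("build is red")
    if not tests_green:
        problems.append("tests are failing or missing")
    if problems:
        return BLOCK_EXIT_CODE, "push blocked: " + ", ".join(problems)
    return 0, ""


push = {"tool_name": "Bash", "tool_input": {"command": "git push origin main"}}
print(gate(push, build_green=False, tests_green=True))
```

A real hook script would parse the event from stdin, compute the two flags by running the project's build and test commands, print the message to stderr, and exit with the returned code.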

What changed between v1.0 and v8.1.0

The honest version: every major version broke one old design principle.

Version | Date | Break
--- | --- | ---
v1.0 | May 2024 | Monolithic mega plugin. Worked for 3 weeks.
v2.0 | July 2024 | First split into PRD and code plugins. Duplicate sub-agent calls.
v3.0 | October 2024 | Dedicated conductor plugin as orchestrator. First time: plugin calls plugin.
v4.0 | January 2025 | auto-test as a separate parallel plugin. Largest productivity jump.
v5.0 | April 2025 | Attempt to unify conductor itself. Failed, rollback after 2 weeks.
v6.0 | August 2025 | MCP server for shared state between plugins.
v7.0 | December 2025 | Hook system for quality gates before push.
v8.0 | March 2026 | New orchestrator: deterministic instead of LLM-driven.
v8.1.0 | May 2026 | Bug fixes on v8.0, plus marketing-pilot moved from experimental to production.

The two lessons from those 8 iterations:

  1. Granularity beats cleverness. v5.0 was the attempt to build a clever mega-plugin that decided everything. v6.0 was the admission that dumb separate plugins are better.
  2. Hooks beat conventions. v7.0 codified what until then was “I remember to run the tests before pushing” into actual code. That is the point where I stopped needing to personally babysit a production pipeline.

How I actually use Conductor

A typical day in an active project:

  1. 06:30: claude --plugin knowhow project-status, for fresh context on what changed since yesterday.
  2. 07:00: claude --plugin conductor next-task. Conductor picks the next task from the PRD, checks dependencies, and proposes it.
  3. 07:00 to 10:00: Implementation. auto-test runs in the background with --watch and writes tests in parallel.
  4. 10:00: claude --plugin finalizer pre-push. It checks whether the build is green, lint is clean, and tests are green. If not, it blocks.
  5. 10:01: Push, then claude --plugin marketing-pilot release-notes if it is a major release.

That is the choreography for one project. With 8 in parallel, the same pattern runs in 8 repos. Conductor knows via its knowhow plugin where each project currently stands — otherwise context switching would be the killer.
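The choreography above can be written down as a fixed step table, which is one way to read "deterministic instead of LLM-driven". The sketch below is hypothetical and not Conductor's orchestrator: the `Step` type, the `plan` function, and the `only_if_major` flag are names introduced for illustration. The plugin and command names are the ones from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    plugin: str
    command: str
    only_if_major: bool = False  # hypothetical flag for release-only steps


# Fixed order: the same input always yields the same plan.
DAILY_STEPS = (
    Step("knowhow", "project-status"),
    Step("conductor", "next-task"),
    Step("finalizer", "pre-push"),
    Step("marketing-pilot", "release-notes", only_if_major=True),
)


def plan(repos: list[str], major_release: set[str]) -> list[tuple[str, str, str]]:
    """Expand the step table into (repo, plugin, command) triples."""
    return [
        (repo, step.plugin, step.command)
        for repo in repos
        for step in DAILY_STEPS
        if not step.only_if_major or repo in major_release
    ]


for triple in plan(["app-a", "app-b"], major_release={"app-b"}):
    print(triple)
```

Because the plan is plain data, it can be diffed, reviewed, and replayed across all repos without asking a model what to do next.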

What it cannot do (limitations)

So that this post does not read as marketing, here are the honest limits:

  • Conductor is private. Repo moinsen-dev/claude_code_conductor is not public. The patterns described here are transferable; the actual code is not (yet).
  • Conductor is tailored to Claude Code. OpenCode, Codex CLI, Cursor — each has its own plugin system. Porting Conductor logic means re-implementation, not copy.
  • Conductor does not solve a product problem. It solves a build-process problem. If the PRD input is bad, 13 plugins will build a bad outcome faster.
  • MCP performance is an open question. Anthropic’s Model Context Protocol (modelcontextprotocol.io) is the basis for plugin-to-plugin state, but 200-500 ms per call adds up in workflows with 15+ steps.
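The MCP latency point is easy to quantify. A back-of-the-envelope sketch, assuming calls run sequentially and treating the number of MCP calls per workflow step as a free parameter (the post only gives the 200-500 ms per call and the 15+ steps):

```python
def mcp_overhead_seconds(steps: int, calls_per_step: int, latency_ms: float) -> float:
    """Total wait time if every MCP call runs sequentially."""
    return steps * calls_per_step * latency_ms / 1000


# One call per step (assumption) over a 15-step workflow.
print(mcp_overhead_seconds(15, 1, 200))  # 3.0
print(mcp_overhead_seconds(15, 1, 500))  # 7.5

# Four calls per step (assumption) pushes the same workflow much higher.
print(mcp_overhead_seconds(15, 4, 500))  # 30.0
```

A few seconds per workflow is tolerable; the cost becomes noticeable once each step reads and writes shared state several times.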

When does something like this make sense for someone else?

Three conditions, all three must hold:

  1. You build 5+ active projects in parallel (otherwise plugin overhead exceeds time saved).
  2. You use Claude Code as your primary agent (otherwise the plugin slots are in the wrong system).
  3. You are ready to refactor a major version every 3-4 months (otherwise an architecture freezes that no longer matches your changing workflow).

If those three are not all true: smaller stacks are often better. A single good custom command does a lot.

Where it goes next

Conductor v9 is already in planning. Three topics I explicitly did not solve in v8.1.0:

  • Cross-repo refactoring: when the same change must happen in 12 repos simultaneously, the current stack breaks down.
  • Long-running background jobs: tests that run 30+ minutes currently block too early.
  • Multi-LLM orchestration: Claude Code as primary, with Codex and Kimi called for specific sub-tasks. v8 does this ad hoc; v9 should do it declaratively.
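To show what "declaratively" could mean for the multi-LLM point, here is a minimal sketch of a routing table. It is not a v9 design. The task kinds and their assignment to agents are invented for illustration; only the three agent names come from the post.

```python
KNOWN_AGENTS = {"claude-code", "codex", "kimi"}
DEFAULT_AGENT = "claude-code"  # primary agent, as described in the post

# Hypothetical task kinds and assignments, purely illustrative.
ROUTES = {
    "bulk-rename": "codex",
    "long-document-summary": "kimi",
}


def validate(routes: dict[str, str]) -> None:
    """Fail early if the table names an agent that is not configured."""
    unknown = set(routes.values()) - KNOWN_AGENTS
    if unknown:
        raise ValueError(f"unknown agents in routing table: {sorted(unknown)}")


def route(task_kind: str) -> str:
    """Look up the agent for a task kind, falling back to the primary agent."""
    return ROUTES.get(task_kind, DEFAULT_AGENT)


validate(ROUTES)
print(route("bulk-rename"), route("write-prd"))
```

The difference from ad hoc delegation is that the routing decision lives in one reviewable table instead of being scattered across prompts.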

If you also work with Claude Code plugins or build a similar stack, reach me on LinkedIn or GitHub. I am happy to trade notes.

FAQ

What is Conductor in one sentence?

A plugin ecosystem for Claude Code that maps the full software lifecycle of an indie builder — from PRD through implementation and tests to release — across 13 specialized plugins.

Which Claude Code version do I need?

Plugins have been supported since Claude Code 1.0.107 (Anthropic, “Claude Code plugins”, October 2025). Conductor v8.1.0 requires at least that version and currently runs against Claude Opus 4.7 with the 1-million-token context window.

Why not put everything in one plugin?

Because Claude Code then loads the entire plugin context with every command. With 13 plugins it only loads the relevant sub-context and reserves the token budget for actual work. Anthropic documents this in the plugins overview as a deliberate design choice.

How long did v8.0 to v8.1.0 take?

2 weeks of pure iteration. v8.0 to v8.1.0 was not an architecture change but stabilization and moving marketing-pilot to production.

Will Conductor ever be open source?

Probably not in its current form: too much is tuned to my specific indie builder workflow. But individual plugins that generalize well (auto-test, finalizer) may ship as separate open-source releases in 2026 or 2027. Concrete plans are on the Now page.


Written on May 17, 2026 in Hamburg. If you find this post useful, link to it — that helps others find it too.