A production-ready pattern for splitting AI coding agent responsibilities across different models — fast models handle execution, powerful models handle reasoning.
┌─────────────────────────────────────────────────────────────────┐
│                      ORCHESTRATOR (Router)                      │
│                    (lightweight, stateless)                     │
└─────────────────────────────────────────────────────────────────┘
            │                    │                    │
            ▼                    ▼                    ▼
   ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐
   │     PHASE 1     │  │     PHASE 2     │  │     PHASE 3     │
   │  Context Agent  │  │  Planner Agent  │  │   Impl Agents   │
   │  (fast model)   │  │  (large model)  │  │  (fast model)   │
   │  Claude Haiku   │  │   Claude Opus   │  │  Claude Sonnet  │
   └────────┬────────┘  └────────┬────────┘  └────────┬────────┘
            │                    │                    │
            ▼                    ▼                    ▼
┌─────────────────────────────────────────────────────────────────┐
│                       SHARED FILE MEMORY                        │
│    context.json │ plan.json │ impl_report.json │ review.json    │
└─────────────────────────────────────────────────────────────────┘
            │                    │                    │
            └────────────────────┼────────────────────┘
                                 ▼
                        ┌─────────────────┐
                        │     PHASE 4     │
                        │  Review Agent   │
                        │  (large model)  │
                        │   Claude Opus   │
                        └─────────────────┘
Most AI coding workflows use a single model for everything. That's wasteful:
| Task | Needs | Best Model |
|---|---|---|
| Scanning files, gathering context | Speed | Haiku (fast, cheap) |
| Strategic planning, architecture | Deep reasoning | Opus (powerful) |
| Writing code from a clear spec | Speed + competence | Sonnet (balanced) |
| Reviewing for bugs & drift | Deep reasoning | Opus (powerful) |
This orchestrator routes each phase to the right model tier, keeping costs low and quality high.
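The routing idea can be sketched as a small lookup table. This is a minimal illustration, not the repo's code: the PHASE_TIERS mapping and the model_for_phase helper are hypothetical names, the model IDs are the defaults listed in this README's settings table, and sending the implement phase to the balanced tier is an assumption based on the task table above.

```python
from enum import Enum


class ModelTier(Enum):
    # Model IDs as listed in this README's settings table.
    FAST = "claude-3-5-haiku-20241022"
    BALANCED = "claude-sonnet-4-20250514"
    POWERFUL = "claude-opus-4-20250514"


# Hypothetical phase-to-tier routing table (key names are illustrative).
PHASE_TIERS = {
    "context": ModelTier.FAST,
    "plan": ModelTier.POWERFUL,
    "implement": ModelTier.BALANCED,  # assumption: follows the task table
    "review": ModelTier.POWERFUL,
}


def model_for_phase(phase: str) -> str:
    """Return the model ID for a phase, failing loudly on unknown phases."""
    try:
        return PHASE_TIERS[phase].value
    except KeyError:
        raise ValueError(f"unknown phase: {phase!r}") from None
```

Keeping the mapping in one table means a tier change is a one-line edit rather than a change to the dispatch logic.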
.
├── orchestrator.py # Main router — thin dispatch table, plan locking
├── orchestrator_event_driven.py # Option 1: Event-driven dispatch (async)
├── orchestrator_hitl.py # Option 2: Human-in-the-loop plan approval
├── schemas.py # Pydantic contracts between agents
├── prompts/
│   ├── context.txt # System prompt for the context agent
│   ├── plan.txt # System prompt for the planner agent
│   ├── implement.txt # System prompt for implementation agents
│   └── review.txt # System prompt for the review agent
├── .github/
│   └── copilot-instructions.md # Teaches VS Code Copilot the full workflow
├── .gitignore
└── README.md
A fast model scans the codebase and produces a structured JSON summary: relevant files, dependencies, conventions, constraints, and scope boundaries. This context is saved to .agent_memory/context.json so no other agent needs to re-explore.
A powerful model reads the context and creates a detailed implementation plan with discrete tasks, file ownership boundaries, acceptance criteria, and parallelization groups. The plan is saved to .agent_memory/plan.json and a git checkpoint is created.
Fast models execute the plan. Tasks are grouped for parallel execution — each agent only touches its assigned files to prevent conflicts. Results are aggregated into .agent_memory/impl_report.json.
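One way to picture grouped execution: groups run one after another, tasks inside a group run concurrently, and a group is rejected up front if two of its tasks claim the same file. The sketch below assumes the plan layout shown in this README (tasks with id, files_to_modify, files_to_create, plus parallel_groups); check_group_ownership, run_groups, and the implement callback are hypothetical names, and a thread pool stands in for whatever the repo actually uses.

```python
from concurrent.futures import ThreadPoolExecutor


def check_group_ownership(group_tasks: list) -> None:
    """Raise if two tasks in the same parallel group claim the same file."""
    owner = {}
    for task in group_tasks:
        for path in task["files_to_modify"] + task["files_to_create"]:
            if path in owner:
                raise ValueError(
                    f"{path} is claimed by both {owner[path]} and {task['id']}"
                )
            owner[path] = task["id"]


def run_groups(plan: dict, implement) -> list:
    """Run groups in order; tasks inside a group run concurrently."""
    by_id = {task["id"]: task for task in plan["tasks"]}
    reports = []
    with ThreadPoolExecutor() as pool:
        for group in plan["parallel_groups"]:
            group_tasks = [by_id[task_id] for task_id in group]
            check_group_ownership(group_tasks)
            # map() keeps input order; extend() waits for the whole group.
            reports.extend(pool.map(implement, group_tasks))
    return reports
```

Here implement is any callable that takes one task dict and returns a per-task report; the aggregated list is what would be written to impl_report.json.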
Plan locking: The plan file is set to read-only (chmod 0444) for the entire duration of Phase 3. Implementers can read the plan but cannot mutate it, which prevents plan drift during parallel execution. Write access is restored automatically when implementation finishes (or on error).
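The lock-then-restore behaviour maps naturally onto a context manager. The following is a minimal sketch under that assumption; locked_plan is a hypothetical name and the repo's own implementation may differ.

```python
import os
import stat
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def locked_plan(plan_path: Path):
    """Make the plan read-only inside the block, then restore its old mode."""
    original_mode = stat.S_IMODE(plan_path.stat().st_mode)
    os.chmod(plan_path, 0o444)  # read-only for owner, group, and others
    try:
        yield plan_path
    finally:
        # Runs on success and on error, so write access always comes back.
        os.chmod(plan_path, original_mode)
```

Phase 3 would then run inside a with locked_plan(...) block. Note that a process running as root can still write to a 0444 file, so the lock guards against accidental mutation rather than a privileged one.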
A powerful model reviews all changes against the plan and context. It checks for correctness, plan drift, code quality, and security issues. Based on the verdict, the orchestrator either merges, re-implements, or re-plans.
The review phase outputs a recommended_action that drives the orchestrator's next move:
| Action | Effect |
|---|---|
| merge | Workflow complete: all phases passed |
| fix_and_re_review | Clears impl + review, re-runs Phase 3 → 4 |
| re_plan | Clears plan + impl + review, re-runs Phase 2 → 3 → 4 |
| escalate_to_human | Halts for manual intervention |
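The action table can be read as "which memory files are invalidated, and where does the loop resume". A rough sketch of that reading follows; ACTIONS and apply_review_action are hypothetical names, and the file names are the ones used for shared memory in this README.

```python
from pathlib import Path

MEMORY_DIR = Path(".agent_memory")

# For each recommended_action: files to clear and the phase to resume from.
# None means the loop stops (finished, or waiting on a human).
ACTIONS = {
    "merge": ([], None),
    "fix_and_re_review": (["impl_report.json", "review.json"], "implement"),
    "re_plan": (["plan.json", "impl_report.json", "review.json"], "plan"),
    "escalate_to_human": ([], None),
}


def apply_review_action(action: str, memory_dir: Path = MEMORY_DIR):
    """Delete the outputs the review invalidated and return the next phase."""
    to_clear, next_phase = ACTIONS[action]
    for name in to_clear:
        (memory_dir / name).unlink(missing_ok=True)
    return next_phase
```

Because context.json is never cleared here, a re-plan reuses the existing context instead of re-exploring the codebase.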
This is the fastest way to use the orchestrator. No API keys.
| Step | What to do |
|---|---|
| 1 | Open this repo in VS Code |
| 2 | Confirm you have GitHub Copilot installed and active |
| 3 | Done. The .github/copilot-instructions.md file is picked up automatically — no configuration needed |
- Open Copilot Chat: press Cmd+Shift+I (macOS) or Ctrl+Shift+I (Windows/Linux)
- Switch to Agent mode (click the mode dropdown at the top of the chat panel)
- Give it a task:
Refactor the authentication module
Copilot will execute the full 4-phase workflow automatically:
Phase 1 → Explore codebase → saves .agent_memory/context.json
Phase 2 → Create implementation plan → saves .agent_memory/plan.json
Phase 3 → Implement each task → saves .agent_memory/impl_report.json
Phase 4 → Review all changes → saves .agent_memory/review.json
- Copilot reads .github/copilot-instructions.md on every conversation
- The instructions teach it to follow Phases 1–4 in strict order
- Each phase produces structured JSON saved to .agent_memory/
- The plan is treated as read-only during implementation (no drift)
- You can inspect any phase by opening the JSON files directly
| Tip | Example prompt |
|---|---|
| Approve the plan before impl starts | "Show me the plan before implementing" |
| Re-run a single phase | "Re-run the review phase on the current impl_report.json" |
| Narrow scope mid-flight | "Only implement tasks T1 and T3 from the plan" |
| Get a status check | "What phase are we on? Summarize progress so far." |
The prompt files in prompts/ are the fine-grained agent system prompts. The Copilot instructions in .github/copilot-instructions.md are derived from them: same workflow, no code required.
Option B — Python Orchestrator (requires Anthropic API key)
Use this if you want programmatic control, CI integration, or to run the orchestrator outside of VS Code.
- Python 3.10+
- Pydantic (pip install pydantic)
- Anthropic Python SDK (pip install anthropic)
# Clone the repo
git clone https://github.com/izzygld/Multi-Agent-Orchestrator-Workflow.git
cd Multi-Agent-Orchestrator-Workflow
# Install dependencies
pip install pydantic anthropic
# Set your API key
export ANTHROPIC_API_KEY="sk-ant-..."
Important: The call_agent() function in orchestrator.py is currently a stub that returns "{}". To go live, uncomment the Anthropic API client code inside it.
# Run the full workflow (auto-detects which phase to start from)
python orchestrator.py --task "Refactor authentication module"
# Run a specific phase
python orchestrator.py --phase context --task "Add pagination to the API"
python orchestrator.py --phase plan
python orchestrator.py --phase implement
python orchestrator.py --phase review
# Reset all memory and start fresh
python orchestrator.py --reset --task "New task description"
Key settings live at the top of orchestrator.py:
| Setting | Default | Description |
|---|---|---|
| MEMORY_DIR | .agent_memory/ | Where phase outputs are stored |
| CONTEXT_STALENESS_HOURS | 4 | Hours before context is considered stale and re-gathered |
| ModelTier.FAST | claude-3-5-haiku-20241022 | Model for context + implementation |
| ModelTier.BALANCED | claude-sonnet-4-20250514 | Model for complex implementation |
| ModelTier.POWERFUL | claude-opus-4-20250514 | Model for planning + review |
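To make the staleness setting concrete, here is one plausible check. It assumes the timestamp field in context.json is an ISO 8601 string (the README does not specify the format); context_is_stale is a hypothetical helper, not the repo's function.

```python
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Optional

CONTEXT_STALENESS_HOURS = 4


def context_is_stale(context_path: Path, now: Optional[datetime] = None) -> bool:
    """True when context.json is missing, unreadable, or older than the limit."""
    now = now or datetime.now(timezone.utc)
    try:
        timestamp = json.loads(context_path.read_text())["timestamp"]
        created = datetime.fromisoformat(timestamp)
    except (OSError, ValueError, KeyError, TypeError):
        return True  # treat anything unparseable as stale and re-gather
    if created.tzinfo is None:
        created = created.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return now - created > timedelta(hours=CONTEXT_STALENESS_HOURS)
```

Treating unreadable context as stale errs on the side of re-running the cheap Phase 1 rather than planning against bad input.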
Every phase produces structured JSON validated by Pydantic. This eliminates ambiguity between agents.
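As an illustration of what such a contract enforces, the sketch below validates the review output using only the standard library. It is a stand-in for the Pydantic models in schemas.py, not a copy of them: ReviewResult and parse_review are hypothetical names, the field list mirrors the review JSON in this section, and the allowed recommended_action values come from the action table earlier in this README.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional

RECOMMENDED_ACTIONS = {"merge", "fix_and_re_review", "re_plan", "escalate_to_human"}


@dataclass(frozen=True)
class ReviewResult:
    timestamp: str
    verdict: str
    summary: str
    issues: list
    plan_drift_detected: bool
    drift_details: Optional[str]
    recommended_action: str

    def __post_init__(self):
        if self.recommended_action not in RECOMMENDED_ACTIONS:
            raise ValueError(
                f"unknown recommended_action: {self.recommended_action!r}"
            )
        if not isinstance(self.plan_drift_detected, bool):
            raise TypeError("plan_drift_detected must be a boolean")


def parse_review(raw: str) -> ReviewResult:
    """Parse agent output, rejecting missing fields and unknown actions."""
    data = json.loads(raw)
    names = [f.name for f in fields(ReviewResult)]
    missing = [name for name in names if name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return ReviewResult(**{name: data[name] for name in names})
```

Rejecting malformed output at the phase boundary means the orchestrator never has to guess what a downstream agent meant.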
{
"timestamp": "...",
"task_description": "...",
"relevant_files": [{"path": "...", "summary": "...", "relevance": "high", "line_count": 42}],
"dependencies": [{"name": "...", "version": "...", "usage": "..."}],
"existing_patterns": ["Uses dependency injection", "..."],
"constraints": ["Must support Python 3.9+"],
"files_in_scope": ["src/auth.py"],
"files_out_of_scope": ["tests/unrelated.py"]
}
{
"timestamp": "...",
"plan_version": 1,
"approach_summary": "...",
"risks": ["Risk 1: ..."],
"tasks": [
{
"id": "T1",
"title": "...",
"description": "...",
"files_to_modify": ["src/auth.py"],
"files_to_create": [],
"depends_on": [],
"acceptance_criteria": ["Criterion 1"],
"estimated_complexity": "moderate"
}
],
"parallel_groups": [["T1", "T2"], ["T3"]]
}
{
"task_id": "T1",
"status": "completed",
"changes": [{"path": "...", "change_type": "modified", "summary": "...", "lines_added": 25, "lines_removed": 10}],
"notes": "...",
"blockers": []
}
{
"timestamp": "...",
"verdict": "approved",
"summary": "...",
"issues": [{"severity": "minor", "file": "...", "line": 42, "description": "...", "suggested_fix": "..."}],
"plan_drift_detected": false,
"drift_details": null,
"recommended_action": "merge"
}
- Stateless orchestrator: The router reads shared memory (JSON files), decides the next phase, and dispatches via a lookup table. No domain logic leaks into the main loop.
- Structured handoffs: Every phase outputs well-defined JSON. Agents never pass free-text between each other.
- Immutable plan during execution: The plan file is locked read-only while implementers run. This is enforced at the filesystem level (chmod 0444), not just by convention.
- Parallel-safe implementation: The plan defines non-overlapping file scopes so multiple agents can implement simultaneously without conflicts.
- Rollback checkpoints: Git commits are created after the plan and implementation phases, making it easy to revert if something goes wrong.
- Right model for the job: Fast models (Haiku) for mechanical tasks, powerful models (Opus) for tasks requiring deep reasoning.
All inter-agent state lives in the .agent_memory/ directory:
| File | Written By | Read By |
|---|---|---|
| context.json | Context Agent | Planner, Implementers, Reviewer |
| plan.json | Planner Agent | Implementers, Reviewer |
| impl_report.json | Implementation Agents | Reviewer |
| review.json | Review Agent | Orchestrator (for next-action decision) |
The orchestrator never passes agent outputs directly — each phase reads from and writes to disk. This keeps context windows lean and makes debugging easy (just inspect the JSON files).
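Because all state is on disk, a stateless router can work out where to resume just by looking at which files exist. The sketch below shows that idea in its simplest form; it reuses the name determine_next_phase for readability but is an assumption about the approach, not the repo's implementation, and it ignores staleness and review verdicts.

```python
from pathlib import Path
from typing import Optional

# Phase order and the file each phase writes (names from the memory table).
PHASE_OUTPUTS = [
    ("context", "context.json"),
    ("plan", "plan.json"),
    ("implement", "impl_report.json"),
    ("review", "review.json"),
]


def determine_next_phase(memory_dir: Path) -> Optional[str]:
    """Return the first phase whose output is missing, or None if all exist."""
    for phase, filename in PHASE_OUTPUTS:
        if not (memory_dir / filename).exists():
            return phase
    return None
```

Deleting a phase's output file is then enough to make the router re-run that phase on the next invocation, provided the outputs of any later phases are removed as well.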
The call_agent() stub in orchestrator.py shows the structure. To go live:
from anthropic import Anthropic

client = Anthropic()  # Reads ANTHROPIC_API_KEY from the environment

def call_agent(model, system_prompt, user_message, tools=None):
    # Pass `tools` only when the caller supplied some.
    extra = {"tools": tools} if tools else {}
    response = client.messages.create(
        model=model.value,
        max_tokens=8192,
        system=system_prompt,
        messages=[{"role": "user", "content": user_message}],
        **extra,
    )
    # With tools enabled the first content block may not be text, so join the text blocks.
    return "".join(block.text for block in response.content if block.type == "text")

Set your API key:
export ANTHROPIC_API_KEY="sk-ant-..."
The repo ships three orchestrator variants. Pick the one that fits your workflow.
Sequential phase loop with a dispatch table. Simplest to debug.
python orchestrator.py --task "Refactor authentication module"
Each agent publishes a phase.done event; the orchestrator subscribes and triggers the next phase. Tasks within a parallel group run concurrently via asyncio.gather.
When to use: You want resilience, async execution, or plan to swap in a real event bus (Redis Pub/Sub, SQS, NATS).
python orchestrator_event_driven.py --task "Add caching layer"
python orchestrator_event_driven.py --task "..." --reset
Key differences from the default:
| Feature | Default | Event-Driven |
|---|---|---|
| Execution model | Synchronous while True loop | Async event subscriptions |
| Parallel impl tasks | Sequential (placeholder) | asyncio.gather per group |
| Failure handling | Exception propagates up | PhaseEvent.status = "failed" with error context |
| Extensibility | Add elif branches | Subscribe new handlers to phase.done |
The event bus is an in-process EventBus class. To go distributed, replace it with your broker of choice — the PhaseEvent dataclass stays the same.
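For readers who have not seen the pattern, an in-process bus of this kind can be very small. The sketch below is illustrative only: the PhaseEvent fields other than status are assumptions, and the subscribe/publish method names are not taken from orchestrator_event_driven.py.

```python
import asyncio
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class PhaseEvent:
    phase: str
    status: str = "completed"  # "failed" carries error context
    error: Optional[str] = None


class EventBus:
    """Minimal in-process pub/sub: topic name -> list of async handlers."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    async def publish(self, topic, event):
        # Run every subscriber for this topic concurrently.
        await asyncio.gather(*(h(event) for h in self._handlers[topic]))


async def demo():
    bus = EventBus()
    seen = []

    async def on_phase_done(event):
        seen.append((event.phase, event.status))

    bus.subscribe("phase.done", on_phase_done)
    await bus.publish("phase.done", PhaseEvent(phase="plan"))
    await bus.publish(
        "phase.done", PhaseEvent(phase="implement", status="failed", error="timeout")
    )
    return seen
```

Swapping in a real broker would mean replacing the body of publish and subscribe while handlers keep receiving the same PhaseEvent shape.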
After the plan phase, the workflow pauses and surfaces a formatted plan summary for human approval. The reviewer can approve, request edits, or reject.
When to use: You want a safety net before burning tokens on implementation — especially useful for large or risky changes.
# Interactive — will prompt for approval
python orchestrator_hitl.py --task "Migrate database schema"
# Auto-approve (for CI or scripted runs)
python orchestrator_hitl.py --task "..." --auto-approve
# Reset and start fresh
python orchestrator_hitl.py --task "..." --reset
The approval gate shows:
- Approach summary and risks
- Every task with complexity badge, file scope, and acceptance criteria
- Parallel execution order
Options at the gate:
| Key | Action |
|---|---|
| a | Approve: lock the plan and proceed to implementation |
| e | Edit: exit so you can modify plan.json by hand, then re-run |
| r | Reject: abort the workflow entirely |
| v | View: dump the full raw JSON for inspection |
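The gate's key handling can be sketched as a small loop. This is a simplified illustration that borrows the approve_plan name; the return values and the injectable ask/show parameters are assumptions made so the loop can be exercised without a terminal.

```python
import json


def approve_plan(plan: dict, ask=input, show=print) -> str:
    """Prompt until the human approves, edits, or rejects the plan."""
    while True:
        choice = ask("[a]pprove  [e]dit  [r]eject  [v]iew > ").strip().lower()
        if choice == "a":
            return "approved"
        if choice == "e":
            return "edit"  # caller exits so plan.json can be edited by hand
        if choice == "r":
            return "rejected"
        if choice == "v":
            show(json.dumps(plan, indent=2))  # dump raw JSON, then ask again
        else:
            show(f"Unrecognized option: {choice!r}")
```

An auto-approve mode for CI could pass ask=lambda _: "a" instead of reading from stdin.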
- Add the phase to the Phase enum in orchestrator.py
- Create a system prompt in prompts/
- Define the output schema in schemas.py
- Write a run_<phase>_phase() handler
- Add it to the PHASE_HANDLERS dispatch table
- Update determine_next_phase() with the new state transition
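The code-side items in the list above might look roughly like this for a hypothetical docs phase. Everything specific here is an assumption for illustration: the DOCS member, the in-memory state dict, and the transition rule are not from the repo, and existing handlers are omitted.

```python
from enum import Enum
from typing import Callable, Dict, Optional


class Phase(Enum):
    CONTEXT = "context"
    PLAN = "plan"
    IMPLEMENT = "implement"
    REVIEW = "review"
    DOCS = "docs"  # hypothetical new phase added to the enum


def run_docs_phase(state: dict) -> dict:
    """Handler for the new phase (placeholder body)."""
    return {**state, "docs": "done"}


# Register the handler in the dispatch table (existing entries omitted).
PHASE_HANDLERS: Dict[Phase, Callable[[dict], dict]] = {
    Phase.DOCS: run_docs_phase,
}


def determine_next_phase(state: dict) -> Optional[Phase]:
    """New transition: run docs once after an approved review, then stop."""
    if state.get("review") == "approved" and "docs" not in state:
        return Phase.DOCS
    return None
```

The main loop stays unchanged: it asks determine_next_phase for a phase and calls PHASE_HANDLERS[phase], which is what keeps the router thin.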
Replace the EventBus class in orchestrator_event_driven.py with your broker client. The PhaseEvent dataclass is the only contract — publish it as JSON to any topic/queue.
Edit _format_plan_summary() in orchestrator_hitl.py to change what the human sees. Add new options to approve_plan() if you want finer-grained control (e.g., approve individual tasks).
MIT