Multi-Agent Orchestrator Workflow

A pattern for splitting AI coding agent responsibilities across model tiers: fast models handle execution, and powerful models handle reasoning.

┌─────────────────────────────────────────────────────────────────┐
│                      ORCHESTRATOR (Router)                      │
│                    (lightweight, stateless)                     │
└─────────────────────────────────────────────────────────────────┘
         │                    │                    │
         ▼                    ▼                    ▼
┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐
│  PHASE 1        │  │  PHASE 2        │  │  PHASE 3        │
│  Context Agent  │  │  Planner Agent  │  │  Impl Agents    │
│  (fast model)   │  │  (large model)  │  │  (fast model)   │
│  Claude Haiku   │  │  Claude Opus    │  │  Claude Sonnet  │
└────────┬────────┘  └────────┬────────┘  └────────┬────────┘
         │                    │                    │
         ▼                    ▼                    ▼
┌─────────────────────────────────────────────────────────────────┐
│                    SHARED FILE MEMORY                           │
│  context.json │ plan.json │ impl_report.json │ review.json     │
└─────────────────────────────────────────────────────────────────┘
         │                    │                    │
         └────────────────────┼────────────────────┘
                              ▼
                    ┌─────────────────┐
                    │  PHASE 4        │
                    │  Review Agent   │
                    │  (large model)  │
                    │  Claude Opus    │
                    └─────────────────┘

Why This Pattern?

Most AI coding workflows use a single model for everything. That's wasteful:

| Task | Needs | Best model |
| --- | --- | --- |
| Scanning files, gathering context | Speed | Haiku (fast, cheap) |
| Strategic planning, architecture | Deep reasoning | Opus (powerful) |
| Writing code from a clear spec | Speed + competence | Sonnet (balanced) |
| Reviewing for bugs & drift | Deep reasoning | Opus (powerful) |

This orchestrator routes each phase to the right model tier, keeping costs low and quality high.
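
As a rough illustration of the routing idea, the mapping from phase to model tier can be a plain lookup. The model IDs are the defaults listed under Configuration; the `Phase` member names, `PHASE_TIER`, and `model_for` are assumptions made for this sketch and may not match what orchestrator.py defines.

```python
from enum import Enum


class ModelTier(Enum):
    # Default model IDs as listed in this README's Configuration section.
    FAST = "claude-3-5-haiku-20241022"
    BALANCED = "claude-sonnet-4-20250514"
    POWERFUL = "claude-opus-4-20250514"


class Phase(Enum):
    # Hypothetical member names; the real enum lives in orchestrator.py.
    CONTEXT = "context"
    PLAN = "plan"
    IMPLEMENT = "implement"
    REVIEW = "review"


# Hypothetical routing table mirroring the task/model table above.
PHASE_TIER = {
    Phase.CONTEXT: ModelTier.FAST,
    Phase.PLAN: ModelTier.POWERFUL,
    Phase.IMPLEMENT: ModelTier.BALANCED,
    Phase.REVIEW: ModelTier.POWERFUL,
}


def model_for(phase: Phase) -> str:
    """Return the model ID to use for a phase."""
    return PHASE_TIER[phase].value
```

Keeping the mapping in one table means a tier change is a one-line edit rather than a search through the phase handlers.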

Repository Structure

.
├── orchestrator.py              # Main router — thin dispatch table, plan locking
├── orchestrator_event_driven.py # Option 1: Event-driven dispatch (async)
├── orchestrator_hitl.py         # Option 2: Human-in-the-loop plan approval
├── schemas.py                   # Pydantic contracts between agents
├── prompts/
│   ├── context.txt              # System prompt for the context agent
│   ├── plan.txt                 # System prompt for the planner agent
│   ├── implement.txt            # System prompt for implementation agents
│   └── review.txt               # System prompt for the review agent
├── .github/
│   └── copilot-instructions.md  # Teaches VS Code Copilot the full workflow
├── .gitignore
└── README.md

How It Works

The Four Phases

Phase 1 — Context Gathering (Haiku)

A fast model scans the codebase and produces a structured JSON summary: relevant files, dependencies, conventions, constraints, and scope boundaries. This context is saved to .agent_memory/context.json so no other agent needs to re-explore.

Phase 2 — Planning (Opus)

A powerful model reads the context and creates a detailed implementation plan with discrete tasks, file ownership boundaries, acceptance criteria, and parallelization groups. The plan is saved to .agent_memory/plan.json and a git checkpoint is created.
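
A git checkpoint could look roughly like the sketch below. `create_checkpoint`, the commit message format, and the injectable `run` parameter are assumptions for illustration; the actual checkpoint code in orchestrator.py may differ.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], None]


def _run_git(cmd: Sequence[str]) -> None:
    # check=True raises CalledProcessError if git exits non-zero.
    subprocess.run(list(cmd), check=True)


def create_checkpoint(label: str, run: Runner = _run_git) -> None:
    """Stage everything and commit it as a rollback point (hypothetical helper)."""
    run(["git", "add", "--all"])
    # --allow-empty keeps the checkpoint even when the phase changed no tracked files.
    run(["git", "commit", "--allow-empty", "-m", f"checkpoint: {label}"])
```

Passing the runner in makes the helper testable without a real repository: a test can record the commands instead of executing them.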

Phase 3 — Implementation (Sonnet)

Fast models execute the plan. Tasks are grouped for parallel execution — each agent only touches its assigned files to prevent conflicts. Results are aggregated into .agent_memory/impl_report.json.

Plan locking: The plan file is set to read-only (chmod 0444) for the entire duration of Phase 3. Implementers can read the plan but cannot mutate it — preventing plan drift during parallel execution. Write access is restored automatically when implementation finishes (or on error).
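
The locking behavior described above can be sketched as a context manager. `locked_plan` is a hypothetical name, and the sketch assumes a POSIX filesystem; it is not necessarily how orchestrator.py implements the lock.

```python
import os
import stat
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def locked_plan(plan_path: Path):
    """Make the plan read-only for the duration of the with-block."""
    original_mode = stat.S_IMODE(plan_path.stat().st_mode)
    os.chmod(plan_path, 0o444)  # r--r--r--
    try:
        yield plan_path
    finally:
        # Restore the previous permissions even if implementation raised.
        os.chmod(plan_path, original_mode)
```

Note that a process running as root can still write to a 0444 file, so a lock like this guards against accidental mutation rather than a determined writer.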

Phase 4 — Review (Opus)

A powerful model reviews all changes against the plan and context. It checks for correctness, plan drift, code quality, and security issues. Based on the review's recommended action, the orchestrator merges, re-implements, re-plans, or escalates to a human.

Feedback Loops

The review phase outputs a recommended_action that drives the orchestrator's next move:

| Action | Effect |
| --- | --- |
| `merge` | Workflow complete; all phases passed |
| `fix_and_re_review` | Clears impl + review, re-runs Phase 3 → 4 |
| `re_plan` | Clears plan + impl + review, re-runs Phase 2 → 3 → 4 |
| `escalate_to_human` | Halts for manual intervention |
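
The table above translates directly into a lookup. In this sketch, `ACTION_PLAN` and `next_steps` are hypothetical names; the file names and phase names are the ones used elsewhere in this README.

```python
# Files cleared and phases re-run for each recommended_action, per the table above.
ACTION_PLAN = {
    "merge": {"clear": [], "rerun": []},
    "fix_and_re_review": {
        "clear": ["impl_report.json", "review.json"],
        "rerun": ["implement", "review"],
    },
    "re_plan": {
        "clear": ["plan.json", "impl_report.json", "review.json"],
        "rerun": ["plan", "implement", "review"],
    },
    "escalate_to_human": {"clear": [], "rerun": []},
}


def next_steps(recommended_action: str) -> dict:
    """Look up what to clear and re-run; unknown actions are an error."""
    try:
        return ACTION_PLAN[recommended_action]
    except KeyError:
        raise ValueError(f"unknown recommended_action: {recommended_action!r}") from None
```

Failing loudly on an unknown action keeps a malformed review from silently ending the workflow.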

Quick Start

⭐ Recommended: Use VS Code Copilot — no API key, no setup, just open and go.

🚀 Option A — VS Code Copilot (no API key needed)

This is the fastest way to use the orchestrator. No API keys. No pip install. No environment variables. Just open the repo in VS Code and start talking to Copilot.

3 Steps to Get Running

| Step | What to do |
| --- | --- |
| 1 | Open this repo in VS Code |
| 2 | Confirm you have GitHub Copilot installed and active |
| 3 | Done. The `.github/copilot-instructions.md` file is picked up automatically; no configuration needed |

Running the Workflow

1. Open Copilot Chat: press Cmd+Shift+I (macOS) or Ctrl+Shift+I (Windows/Linux)
2. Switch to Agent mode (click the mode dropdown at the top of the chat panel)
3. Give it a task:

Refactor the authentication module

Copilot will execute the full 4-phase workflow automatically:

Phase 1  →  Explore codebase            →  saves .agent_memory/context.json
Phase 2  →  Create implementation plan  →  saves .agent_memory/plan.json
Phase 3  →  Implement each task         →  saves .agent_memory/impl_report.json
Phase 4  →  Review all changes          →  saves .agent_memory/review.json

What Happens Under the Hood

Copilot reads .github/copilot-instructions.md on every conversation
The instructions teach it to follow Phases 1–4 in strict order
Each phase produces structured JSON saved to .agent_memory/
The plan is treated as read-only during implementation (no drift)
You can inspect any phase by opening the JSON files directly

Pro Tips

| Tip | Example prompt |
| --- | --- |
| Approve the plan before impl starts | "Show me the plan before implementing" |
| Re-run a single phase | "Re-run the review phase on the current impl_report.json" |
| Narrow scope mid-flight | "Only implement tasks T1 and T3 from the plan" |
| Get a status check | "What phase are we on? Summarize progress so far." |

The prompt files in prompts/ are the fine-grained agent system prompts. The Copilot instructions in .github/copilot-instructions.md are derived from them — same workflow, no code required.

Option B — Python Orchestrator (requires Anthropic API key)

Use this if you want programmatic control, CI integration, or to run the orchestrator outside of VS Code.

Prerequisites

Python 3.10+
Pydantic (pip install pydantic)
Anthropic Python SDK (pip install anthropic)

Installation

# Clone the repo
git clone https://github.com/izzygld/Multi-Agent-Orchestrator-Workflow.git
cd Multi-Agent-Orchestrator-Workflow

# Install dependencies
pip install pydantic anthropic

# Set your API key
export ANTHROPIC_API_KEY="sk-ant-..."

Important: The call_agent() function in orchestrator.py is currently a stub that returns "{}". To go live, uncomment the Anthropic API client code inside it.

Usage

# Run the full workflow (auto-detects which phase to start from)
python orchestrator.py --task "Refactor authentication module"

# Run a specific phase
python orchestrator.py --phase context --task "Add pagination to the API"
python orchestrator.py --phase plan
python orchestrator.py --phase implement
python orchestrator.py --phase review

# Reset all memory and start fresh
python orchestrator.py --reset --task "New task description"

Configuration

Key settings live at the top of orchestrator.py:

| Setting | Default | Description |
| --- | --- | --- |
| `MEMORY_DIR` | `.agent_memory/` | Where phase outputs are stored |
| `CONTEXT_STALENESS_HOURS` | `4` | Hours before context is considered stale and re-gathered |
| `ModelTier.FAST` | `claude-3-5-haiku-20241022` | Model for context + implementation |
| `ModelTier.BALANCED` | `claude-sonnet-4-20250514` | Model for complex implementation |
| `ModelTier.POWERFUL` | `claude-opus-4-20250514` | Model for planning + review |
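
A staleness check against `CONTEXT_STALENESS_HOURS` might look like the sketch below. It assumes staleness is judged by the file's modification time; the real orchestrator may instead compare the `timestamp` field inside context.json. `context_is_stale` is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Optional

CONTEXT_STALENESS_HOURS = 4  # default from the settings above


def context_is_stale(context_path: Path, now: Optional[datetime] = None) -> bool:
    """True if context.json is missing or older than the staleness window."""
    if not context_path.exists():
        return True
    now = now or datetime.now(timezone.utc)
    modified = datetime.fromtimestamp(context_path.stat().st_mtime, tz=timezone.utc)
    return now - modified > timedelta(hours=CONTEXT_STALENESS_HOURS)
```

The optional `now` argument exists only so the window can be tested without waiting four hours.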

Schemas (Inter-Agent Contracts)

Every phase produces structured JSON validated by Pydantic, so a malformed or incomplete handoff fails at the phase boundary instead of propagating to the next agent.

`ContextOutput` — Phase 1 output

{
    "timestamp": "...",
    "task_description": "...",
    "relevant_files": [{"path": "...", "summary": "...", "relevance": "high", "line_count": 42}],
    "dependencies": [{"name": "...", "version": "...", "usage": "..."}],
    "existing_patterns": ["Uses dependency injection", "..."],
    "constraints": ["Must support Python 3.9+"],
    "files_in_scope": ["src/auth.py"],
    "files_out_of_scope": ["tests/unrelated.py"]
}

`PlanOutput` — Phase 2 output

{
    "timestamp": "...",
    "plan_version": 1,
    "approach_summary": "...",
    "risks": ["Risk 1: ..."],
    "tasks": [
        {
            "id": "T1",
            "title": "...",
            "description": "...",
            "files_to_modify": ["src/auth.py"],
            "files_to_create": [],
            "depends_on": [],
            "acceptance_criteria": ["Criterion 1"],
            "estimated_complexity": "moderate"
        }
    ],
    "parallel_groups": [["T1", "T2"], ["T3"]]
}
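
Because parallel safety depends on the plan's grouping, it can help to check a plan before implementation starts. The sketch below works on the plain JSON shape shown above; `validate_parallel_groups` is a hypothetical helper, not part of schemas.py as far as this README states.

```python
def validate_parallel_groups(plan: dict) -> list[str]:
    """Return a list of problems; an empty list means the grouping looks safe."""
    problems: list[str] = []
    tasks = {t["id"]: t for t in plan["tasks"]}
    done: set[str] = set()  # tasks finished by earlier groups
    for index, group in enumerate(plan["parallel_groups"]):
        owner: dict[str, str] = {}  # file path -> task that claims it in this group
        for task_id in group:
            task = tasks.get(task_id)
            if task is None:
                problems.append(f"group {index}: unknown task {task_id}")
                continue
            for dep in task["depends_on"]:
                if dep not in done:
                    problems.append(f"{task_id}: depends on {dep}, which has not run yet")
            for path in task["files_to_modify"] + task["files_to_create"]:
                if path in owner:
                    problems.append(f"group {index}: {path} claimed by {owner[path]} and {task_id}")
                owner[path] = task_id
        done.update(group)
    return problems
```

It flags three things: a group that names a task the plan does not define, a dependency that has not finished in an earlier group, and two tasks in the same group claiming the same file.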

`TaskResult` — Phase 3 output (per task)

{
    "task_id": "T1",
    "status": "completed",
    "changes": [{"path": "...", "change_type": "modified", "summary": "...", "lines_added": 25, "lines_removed": 10}],
    "notes": "...",
    "blockers": []
}

`ReviewOutput` — Phase 4 output

{
    "timestamp": "...",
    "verdict": "approved",
    "summary": "...",
    "issues": [{"severity": "minor", "file": "...", "line": 42, "description": "...", "suggested_fix": "..."}],
    "plan_drift_detected": false,
    "drift_details": null,
    "recommended_action": "merge"
}
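
To show how the orchestrator might consume this output, here is a standard-library stand-in for the Pydantic model (the real contract lives in schemas.py). `ReviewDecision` and `parse_review` are hypothetical names, and only the fields the router needs are extracted.

```python
import json
from dataclasses import dataclass

# The four actions listed under Feedback Loops.
ALLOWED_ACTIONS = {"merge", "fix_and_re_review", "re_plan", "escalate_to_human"}


@dataclass(frozen=True)
class ReviewDecision:
    verdict: str
    recommended_action: str
    plan_drift_detected: bool


def parse_review(raw: str) -> ReviewDecision:
    """Parse review JSON and reject actions the orchestrator cannot route."""
    data = json.loads(raw)
    action = data["recommended_action"]
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unexpected recommended_action: {action!r}")
    return ReviewDecision(
        verdict=data["verdict"],
        recommended_action=action,
        plan_drift_detected=bool(data["plan_drift_detected"]),
    )
```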

Key Design Principles

Stateless orchestrator — The router reads shared memory (JSON files), decides the next phase, and dispatches via a lookup table. No domain logic leaks into the main loop.
Structured handoffs — Every phase outputs well-defined JSON. Agents never pass free-text between each other.
Immutable plan during execution — The plan file is locked read-only while implementers run. This is enforced at the filesystem level (chmod 0444), not just by convention.
Parallel-safe implementation — The plan defines non-overlapping file scopes so multiple agents can implement simultaneously without conflicts.
Rollback checkpoints — Git commits are created after the plan and implementation phases, making it easy to revert if something goes wrong.
Right model for the job — Fast models (Haiku) for mechanical tasks, powerful models (Opus) for tasks requiring deep reasoning.
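
The stateless-router principle can be illustrated with a deliberately simplified next-phase function that looks only at which files exist. The body below is an assumption for illustration: the real `determine_next_phase()` in orchestrator.py also has to account for context staleness and review feedback, and `PHASE_OUTPUTS` is a hypothetical name.

```python
from pathlib import Path
from typing import Optional

# Phase name -> the file that phase produces, in workflow order.
PHASE_OUTPUTS = [
    ("context", "context.json"),
    ("plan", "plan.json"),
    ("implement", "impl_report.json"),
    ("review", "review.json"),
]


def determine_next_phase(memory_dir: Path) -> Optional[str]:
    """Return the first phase whose output file is missing, or None if all exist."""
    for phase, filename in PHASE_OUTPUTS:
        if not (memory_dir / filename).exists():
            return phase
    return None
```

Because the decision is derived from disk on every call, the router can be restarted at any point and resume from the right phase.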

Shared Memory

All inter-agent state lives in the .agent_memory/ directory:

| File | Written by | Read by |
| --- | --- | --- |
| `context.json` | Context Agent | Planner, Implementers, Reviewer |
| `plan.json` | Planner Agent | Implementers, Reviewer |
| `impl_report.json` | Implementation Agents | Reviewer |
| `review.json` | Review Agent | Orchestrator (for next-action decision) |

The orchestrator never passes agent outputs directly — each phase reads from and writes to disk. This keeps context windows lean and makes debugging easy (just inspect the JSON files).
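
A minimal pair of read/write helpers for this directory might look like the following. `write_memory` and `read_memory` are hypothetical names; the atomic write via a temporary file is a design choice of this sketch, not something the README states the orchestrator does.

```python
import json
import os
import tempfile
from pathlib import Path


def write_memory(memory_dir: Path, name: str, payload: dict) -> Path:
    """Write JSON atomically so a reader never sees a half-written file."""
    memory_dir.mkdir(parents=True, exist_ok=True)
    target = memory_dir / name
    fd, tmp_name = tempfile.mkstemp(dir=memory_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(payload, handle, indent=2)
    os.replace(tmp_name, target)  # atomic rename on the same filesystem
    return target


def read_memory(memory_dir: Path, name: str) -> dict:
    """Load one phase output from disk."""
    return json.loads((memory_dir / name).read_text(encoding="utf-8"))
```

Writing to a temporary file and renaming it matters most when several implementation agents run at once and another process may read the directory mid-write.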

Connecting to the Anthropic API

The call_agent() stub in orchestrator.py shows the structure. To go live:

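from anthropic import Anthropic

client = Anthropic()  # Reads ANTHROPIC_API_KEY from the environment

def call_agent(model, system_prompt, user_message, tools=None):
    """Send one user message to the given model tier and return its text reply."""
    kwargs = {
        "model": model.value,  # model is a ModelTier member
        "max_tokens": 8192,
        "system": system_prompt,
        "messages": [{"role": "user", "content": user_message}],
    }
    if tools:
        # Only send the tools parameter when tools were actually supplied.
        kwargs["tools"] = tools
    response = client.messages.create(**kwargs)
    # With tools enabled the first block may be a tool_use block, which has
    # no .text attribute, so join only the text blocks.
    return "".join(block.text for block in response.content if block.type == "text")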

Set your API key:

export ANTHROPIC_API_KEY="sk-ant-..."

Dispatch Options

The repo ships three orchestrator variants. Pick the one that fits your workflow.

Default — Synchronous Polling (`orchestrator.py`)

Sequential phase loop with a dispatch table. Simplest to debug.

python orchestrator.py --task "Refactor authentication module"

Option 1 — Event-Driven Dispatch (`orchestrator_event_driven.py`)

Each agent publishes a phase.done event; the orchestrator subscribes and triggers the next phase. Tasks within a parallel group run concurrently via asyncio.gather.

When to use: You want resilience, async execution, or plan to swap in a real event bus (Redis Pub/Sub, SQS, NATS).

python orchestrator_event_driven.py --task "Add caching layer"
python orchestrator_event_driven.py --task "..." --reset

Key differences from the default:

| Feature | Default | Event-Driven |
| --- | --- | --- |
| Execution model | Synchronous `while True` loop | Async event subscriptions |
| Parallel impl tasks | Sequential (placeholder) | `asyncio.gather` per group |
| Failure handling | Exception propagates up | `PhaseEvent.status = "failed"` with error context |
| Extensibility | Add `elif` branches | Subscribe new handlers to `phase.done` |

The event bus is an in-process EventBus class. To go distributed, replace it with your broker of choice — the PhaseEvent dataclass stays the same.
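
For orientation, an in-process bus of this kind can be very small. The sketch below is not the repo's `EventBus`; the method names and the `PhaseEvent` fields other than `status` are assumptions made for illustration.

```python
import asyncio
from collections import defaultdict
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass
class PhaseEvent:
    # Field names other than status are assumptions for this sketch.
    phase: str
    status: str  # e.g. "failed"; other values are assumptions
    error: Optional[str] = None


Handler = Callable[[PhaseEvent], Awaitable[None]]


class EventBus:
    """In-process publish/subscribe keyed by topic name."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers[topic].append(handler)

    async def publish(self, topic: str, event: PhaseEvent) -> None:
        # Await every subscriber for this topic concurrently.
        await asyncio.gather(*(h(event) for h in self._handlers[topic]))
```

A subscriber is just an async function taking a `PhaseEvent`, which is what makes adding behavior a matter of subscribing another handler.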

Option 2 — Human-in-the-Loop Gates (`orchestrator_hitl.py`)

After the plan phase, the workflow pauses and surfaces a formatted plan summary for human approval. The human approver can approve the plan, exit to edit it, reject it, or view the raw JSON.

When to use: You want a safety net before burning tokens on implementation — especially useful for large or risky changes.

# Interactive — will prompt for approval
python orchestrator_hitl.py --task "Migrate database schema"

# Auto-approve (for CI or scripted runs)
python orchestrator_hitl.py --task "..." --auto-approve

# Reset and start fresh
python orchestrator_hitl.py --task "..." --reset

The approval gate shows:

Approach summary and risks
Every task with complexity badge, file scope, and acceptance criteria
Parallel execution order

Options at the gate:

| Key | Action |
| --- | --- |
| `a` | Approve: lock the plan and proceed to implementation |
| `e` | Edit: exit so you can modify `plan.json` by hand, then re-run |
| `r` | Reject: abort the workflow entirely |
| `v` | View: dump the full raw JSON for inspection |
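
The gate's key handling can be sketched as a small loop. `approval_gate` is a hypothetical function, not the repo's `approve_plan()`; the prompt text and return values are assumptions, and `ask`/`show` are injectable so the loop can be exercised without a terminal.

```python
from typing import Callable


def approval_gate(
    summary: str,
    raw_json: str,
    ask: Callable[[str], str] = input,
    show: Callable[[str], None] = print,
) -> str:
    """Loop until the human picks approve, edit, or reject; return that choice."""
    choices = {"a": "approve", "e": "edit", "r": "reject"}
    show(summary)
    while True:
        key = ask("[a]pprove / [e]dit / [r]eject / [v]iew: ").strip().lower()
        if key == "v":
            show(raw_json)  # view does not leave the gate
        elif key in choices:
            return choices[key]
        else:
            show(f"Unrecognized option: {key!r}")
```

Only `a`, `e`, and `r` end the loop; `v` prints the raw JSON and asks again.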

Extending the Pattern

Adding a new phase

1. Add the phase to the Phase enum in orchestrator.py
2. Create a system prompt in prompts/
3. Define the output schema in schemas.py
4. Write a run_<phase>_phase() handler
5. Add it to the PHASE_HANDLERS dispatch table
6. Update determine_next_phase() with the new state transition
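
The enum, handler, and dispatch-table steps above might look roughly like this for a hypothetical `test` phase. The handler signature and the `dispatch` helper are assumptions; check orchestrator.py for the real shapes before copying.

```python
from enum import Enum
from typing import Callable


class Phase(Enum):
    CONTEXT = "context"
    PLAN = "plan"
    IMPLEMENT = "implement"
    REVIEW = "review"
    TEST = "test"  # hypothetical new phase


def run_test_phase(task: str) -> dict:
    """Hypothetical handler; a real one would call an agent and validate its output."""
    return {"phase": Phase.TEST.value, "task": task, "status": "completed"}


# Only the new entry is shown; existing handlers would be registered the same way.
PHASE_HANDLERS: dict[Phase, Callable[[str], dict]] = {
    Phase.TEST: run_test_phase,
}


def dispatch(phase: Phase, task: str) -> dict:
    """Look the handler up instead of branching on the phase."""
    return PHASE_HANDLERS[phase](task)
```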

Swapping the event bus (Option 1)

Replace the EventBus class in orchestrator_event_driven.py with your broker client. The PhaseEvent dataclass is the only contract — publish it as JSON to any topic/queue.
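
Publishing the event as JSON could be sketched as follows. The `PhaseEvent` fields other than `status`, the `Publisher` protocol, and `InMemoryBroker` are all assumptions for illustration; a real broker client (Redis, SQS, NATS) has its own publish call that would be wrapped to fit.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional, Protocol


@dataclass
class PhaseEvent:
    # Field names other than status are assumptions for this sketch.
    phase: str
    status: str
    error: Optional[str] = None


class Publisher(Protocol):
    def publish(self, topic: str, body: bytes) -> None: ...


def publish_event(broker: Publisher, event: PhaseEvent, topic: str = "phase.done") -> None:
    """Serialize the event as JSON and hand it to whatever broker client is in use."""
    broker.publish(topic, json.dumps(asdict(event)).encode("utf-8"))


class InMemoryBroker:
    """Test double that records what would have been sent."""

    def __init__(self) -> None:
        self.sent: list[tuple[str, bytes]] = []

    def publish(self, topic: str, body: bytes) -> None:
        self.sent.append((topic, body))
```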

Customizing the approval gate (Option 2)

Edit _format_plan_summary() in orchestrator_hitl.py to change what the human sees. Add new options to approve_plan() if you want finer-grained control (e.g., approve individual tasks).

License

MIT
