izzygld/Multi-Agent-Orchestrator-Workflow

Multi-Agent Orchestrator Workflow

A production-ready pattern for splitting AI coding agent responsibilities across different models — fast models handle execution, powerful models handle reasoning.

┌─────────────────────────────────────────────────────────────────┐
│                      ORCHESTRATOR (Router)                      │
│                    (lightweight, stateless)                     │
└─────────────────────────────────────────────────────────────────┘
         │                    │                    │
         ▼                    ▼                    ▼
┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐
│  PHASE 1        │  │  PHASE 2        │  │  PHASE 3        │
│  Context Agent  │  │  Planner Agent  │  │  Impl Agents    │
│  (fast model)   │  │  (large model)  │  │  (fast model)   │
│  Claude Haiku   │  │  Claude Opus    │  │  Claude Sonnet  │
└────────┬────────┘  └────────┬────────┘  └────────┬────────┘
         │                    │                    │
         ▼                    ▼                    ▼
┌─────────────────────────────────────────────────────────────────┐
│                    SHARED FILE MEMORY                           │
│  context.json │ plan.json │ impl_report.json │ review.json     │
└─────────────────────────────────────────────────────────────────┘
         │                    │                    │
         └────────────────────┼────────────────────┘
                              ▼
                    ┌─────────────────┐
                    │  PHASE 4        │
                    │  Review Agent   │
                    │  (large model)  │
                    │  Claude Opus    │
                    └─────────────────┘

Why This Pattern?

Most AI coding workflows use a single model for everything. That's wasteful:

| Task | Needs | Best Model |
| --- | --- | --- |
| Scanning files, gathering context | Speed | Haiku (fast, cheap) |
| Strategic planning, architecture | Deep reasoning | Opus (powerful) |
| Writing code from a clear spec | Speed + competence | Sonnet (balanced) |
| Reviewing for bugs & drift | Deep reasoning | Opus (powerful) |

This orchestrator routes each phase to the right model tier, keeping costs low and quality high.


Repository Structure

.
├── orchestrator.py              # Main router — thin dispatch table, plan locking
├── orchestrator_event_driven.py # Option 1: Event-driven dispatch (async)
├── orchestrator_hitl.py         # Option 2: Human-in-the-loop plan approval
├── schemas.py                   # Pydantic contracts between agents
├── prompts/
│   ├── context.txt              # System prompt for the context agent
│   ├── plan.txt                 # System prompt for the planner agent
│   ├── implement.txt            # System prompt for implementation agents
│   └── review.txt               # System prompt for the review agent
├── .github/
│   └── copilot-instructions.md  # Teaches VS Code Copilot the full workflow
├── .gitignore
└── README.md

How It Works

The Four Phases

Phase 1 — Context Gathering (Haiku)

A fast model scans the codebase and produces a structured JSON summary: relevant files, dependencies, conventions, constraints, and scope boundaries. This context is saved to .agent_memory/context.json so no other agent needs to re-explore.

Phase 2 — Planning (Opus)

A powerful model reads the context and creates a detailed implementation plan with discrete tasks, file ownership boundaries, acceptance criteria, and parallelization groups. The plan is saved to .agent_memory/plan.json and a git checkpoint is created.

Phase 3 — Implementation (Sonnet)

Fast models execute the plan. Tasks are grouped for parallel execution — each agent only touches its assigned files to prevent conflicts. Results are aggregated into .agent_memory/impl_report.json.
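One way to picture grouped execution is the sketch below, which runs the groups from `parallel_groups` in order and the tasks inside each group concurrently on a thread pool. The `run_groups` name and the `run_task` callback are hypothetical; this is an illustration of the idea, not the code in orchestrator.py.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_groups(parallel_groups: List[List[str]],
               run_task: Callable[[str], dict]) -> Dict[str, dict]:
    """Run groups one after another; tasks inside a group run concurrently."""
    results: Dict[str, dict] = {}
    for group in parallel_groups:
        # max(1, ...) keeps the pool valid even if a group is empty.
        with ThreadPoolExecutor(max_workers=max(1, len(group))) as pool:
            # pool.map preserves input order, so zip pairs each id with its result.
            for task_id, result in zip(group, pool.map(run_task, group)):
                results[task_id] = result
    return results
```

A later group starts only after every task in the previous group has returned, which is what lets a task in a later group depend on an earlier one.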

Plan locking: The plan file is set to read-only (chmod 0444) for the entire duration of Phase 3. Implementers can read the plan but cannot mutate it — preventing plan drift during parallel execution. Write access is restored automatically when implementation finishes (or on error).
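The locking behavior described above can be sketched as a context manager. This is a minimal illustration assuming a POSIX filesystem; `locked_plan` is a hypothetical name, and the real orchestrator may structure this differently.

```python
import os
import stat
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def locked_plan(plan_path: Path):
    """Make the plan read-only for the duration of the block, then restore its mode."""
    original_mode = stat.S_IMODE(plan_path.stat().st_mode)
    os.chmod(plan_path, 0o444)  # read-only for owner, group, and others
    try:
        yield plan_path
    finally:
        # Runs on normal exit and on error, so write access always comes back.
        os.chmod(plan_path, original_mode)
```

Usage would look like `with locked_plan(Path(".agent_memory/plan.json")): ...` around the implementation phase. Note that a process running as root can still write to a 0444 file, so the lock guards against accidental edits rather than hostile ones.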

Phase 4 — Review (Opus)

A powerful model reviews all changes against the plan and context. It checks for correctness, plan drift, code quality, and security issues. Based on the verdict, the orchestrator either merges, re-implements, or re-plans.

Feedback Loops

The review phase outputs a recommended_action that drives the orchestrator's next move:

| Action | Effect |
| --- | --- |
| merge | Workflow complete; all phases passed |
| fix_and_re_review | Clears impl + review, re-runs Phase 3 → 4 |
| re_plan | Clears plan + impl + review, re-runs Phase 2 → 3 → 4 |
| escalate_to_human | Halts for manual intervention |
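The mapping from action to cleared files can be sketched as a small lookup. The `FILES_TO_CLEAR` table and the `apply_review_action` helper are hypothetical names used for illustration; only the action names and the files they clear come from the list above.

```python
from pathlib import Path

# Files to clear for each recommended_action.
FILES_TO_CLEAR = {
    "merge": [],
    "fix_and_re_review": ["impl_report.json", "review.json"],
    "re_plan": ["plan.json", "impl_report.json", "review.json"],
    "escalate_to_human": [],
}


def apply_review_action(action: str, memory_dir: Path) -> bool:
    """Clear stale phase outputs; return True if the workflow should keep running."""
    if action not in FILES_TO_CLEAR:
        raise ValueError(f"unknown recommended_action: {action!r}")
    for name in FILES_TO_CLEAR[action]:
        (memory_dir / name).unlink(missing_ok=True)
    # merge ends the workflow; escalate_to_human halts it for a person.
    return action in ("fix_and_re_review", "re_plan")
```

Because the orchestrator decides what to run from which files exist, deleting a phase's output is enough to make that phase run again on the next pass.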

Quick Start

⭐ Recommended: Use VS Code Copilot — no API key, no setup, just open and go.


🚀 Option A — VS Code Copilot (no API key needed)

This is the fastest way to use the orchestrator. No API keys. No pip install. No environment variables. Just open the repo in VS Code and start talking to Copilot.

3 Steps to Get Running

| Step | What to do |
| --- | --- |
| 1 | Open this repo in VS Code |
| 2 | Confirm you have GitHub Copilot installed and active |
| 3 | Done. The .github/copilot-instructions.md file is picked up automatically; no configuration needed |

Running the Workflow

  1. Open Copilot Chat — press Cmd+Shift+I (macOS) or Ctrl+Shift+I (Windows/Linux)
  2. Switch to Agent mode (click the mode dropdown at the top of the chat panel)
  3. Give it a task:
Refactor the authentication module

Copilot will execute the full 4-phase workflow automatically:

Phase 1  →  Explore codebase             →  saves .agent_memory/context.json
Phase 2  →  Create implementation plan   →  saves .agent_memory/plan.json
Phase 3  →  Implement each task          →  saves .agent_memory/impl_report.json
Phase 4  →  Review all changes           →  saves .agent_memory/review.json

What Happens Under the Hood

  • Copilot reads .github/copilot-instructions.md on every conversation
  • The instructions teach it to follow Phases 1–4 in strict order
  • Each phase produces structured JSON saved to .agent_memory/
  • The plan is treated as read-only during implementation (no drift)
  • You can inspect any phase by opening the JSON files directly

Pro Tips

| Tip | Example prompt |
| --- | --- |
| Approve the plan before impl starts | "Show me the plan before implementing" |
| Re-run a single phase | "Re-run the review phase on the current impl_report.json" |
| Narrow scope mid-flight | "Only implement tasks T1 and T3 from the plan" |
| Get a status check | "What phase are we on? Summarize progress so far." |

The prompt files in prompts/ are the fine-grained agent system prompts. The Copilot instructions in .github/copilot-instructions.md are derived from them — same workflow, no code required.


Option B — Python Orchestrator (requires Anthropic API key)

Use this if you want programmatic control, CI integration, or to run the orchestrator outside of VS Code.

Prerequisites

Python 3 with pip, and an Anthropic API key.

Installation

# Clone the repo
git clone https://github.com/izzygld/Multi-Agent-Orchestrator-Workflow.git
cd Multi-Agent-Orchestrator-Workflow

# Install dependencies
pip install pydantic anthropic

# Set your API key
export ANTHROPIC_API_KEY="sk-ant-..."

Important: The call_agent() function in orchestrator.py is currently a stub that returns "{}". To go live, uncomment the Anthropic API client code inside it.

Usage

# Run the full workflow (auto-detects which phase to start from)
python orchestrator.py --task "Refactor authentication module"

# Run a specific phase
python orchestrator.py --phase context --task "Add pagination to the API"
python orchestrator.py --phase plan
python orchestrator.py --phase implement
python orchestrator.py --phase review

# Reset all memory and start fresh
python orchestrator.py --reset --task "New task description"

Configuration

Key settings live at the top of orchestrator.py:

| Setting | Default | Description |
| --- | --- | --- |
| MEMORY_DIR | .agent_memory/ | Where phase outputs are stored |
| CONTEXT_STALENESS_HOURS | 4 | Hours before context is considered stale and re-gathered |
| ModelTier.FAST | claude-3-5-haiku-20241022 | Model for context + implementation |
| ModelTier.BALANCED | claude-sonnet-4-20250514 | Model for complex implementation |
| ModelTier.POWERFUL | claude-opus-4-20250514 | Model for planning + review |
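As a rough illustration of how the model settings might fit together, the sketch below defines a `ModelTier` enum with the defaults listed above and a hypothetical `pick_model()` helper. The helper, the phase strings (borrowed from the `--phase` flag), and the rule that only tasks with a `complex` complexity go to the balanced tier are assumptions, not necessarily what orchestrator.py does.

```python
from enum import Enum
from typing import Optional


class ModelTier(Enum):
    # Values are the defaults listed in the settings above.
    FAST = "claude-3-5-haiku-20241022"
    BALANCED = "claude-sonnet-4-20250514"
    POWERFUL = "claude-opus-4-20250514"


def pick_model(phase: str, complexity: Optional[str] = None) -> ModelTier:
    """Hypothetical helper: choose a tier for a phase and task complexity."""
    if phase in ("plan", "review"):
        return ModelTier.POWERFUL
    if phase == "implement" and complexity == "complex":
        return ModelTier.BALANCED  # assumption: only complex tasks are upgraded
    if phase in ("context", "implement"):
        return ModelTier.FAST
    raise ValueError(f"unknown phase: {phase!r}")
```

Keeping the model IDs in one enum means a model upgrade is a one-line change rather than a search through every phase handler.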

Schemas (Inter-Agent Contracts)

Every phase produces structured JSON validated by Pydantic. This eliminates ambiguity between agents.

ContextOutput — Phase 1 output

{
    "timestamp": "...",
    "task_description": "...",
    "relevant_files": [{"path": "...", "summary": "...", "relevance": "high", "line_count": 42}],
    "dependencies": [{"name": "...", "version": "...", "usage": "..."}],
    "existing_patterns": ["Uses dependency injection", "..."],
    "constraints": ["Must support Python 3.9+"],
    "files_in_scope": ["src/auth.py"],
    "files_out_of_scope": ["tests/unrelated.py"]
}
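The `timestamp` field is what makes the CONTEXT_STALENESS_HOURS setting workable. A minimal sketch of such a check follows, assuming the timestamp is an ISO 8601 string (the format is not specified here) and that a timestamp without a UTC offset means UTC; `is_context_stale` is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

CONTEXT_STALENESS_HOURS = 4  # default from the Configuration section


def is_context_stale(timestamp: str, now: Optional[datetime] = None) -> bool:
    """Return True when the context is older than CONTEXT_STALENESS_HOURS."""
    created = datetime.fromisoformat(timestamp)
    if created.tzinfo is None:
        created = created.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    now = now or datetime.now(timezone.utc)
    return now - created > timedelta(hours=CONTEXT_STALENESS_HOURS)
```

The `now` parameter exists only so the check can be tested with a fixed clock.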

PlanOutput — Phase 2 output

{
    "timestamp": "...",
    "plan_version": 1,
    "approach_summary": "...",
    "risks": ["Risk 1: ..."],
    "tasks": [
        {
            "id": "T1",
            "title": "...",
            "description": "...",
            "files_to_modify": ["src/auth.py"],
            "files_to_create": [],
            "depends_on": [],
            "acceptance_criteria": ["Criterion 1"],
            "estimated_complexity": "moderate"
        }
    ],
    "parallel_groups": [["T1", "T2"], ["T3"]]
}
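Since parallel safety depends on the plan itself, it can be worth checking a plan before running it. The sketch below works on a plain dict shaped like the example above rather than on the Pydantic models in schemas.py; `check_parallel_groups` is a hypothetical helper, and it assumes groups run in the order listed.

```python
from typing import Dict, List


def check_parallel_groups(plan: dict) -> List[str]:
    """Return a list of problems; an empty list means the groups look parallel-safe."""
    tasks: Dict[str, dict] = {t["id"]: t for t in plan["tasks"]}
    problems: List[str] = []
    finished: set = set()
    for group in plan["parallel_groups"]:
        owner_of: Dict[str, str] = {}  # file path -> task that claimed it in this group
        for task_id in group:
            task = tasks[task_id]
            for path in task["files_to_modify"] + task["files_to_create"]:
                if path in owner_of:
                    problems.append(f"{path} is touched by both {owner_of[path]} and {task_id}")
                owner_of[path] = task_id
            for dep in task["depends_on"]:
                if dep not in finished:
                    problems.append(f"{task_id} depends on {dep}, which has not finished yet")
        finished.update(group)  # tasks count as finished once their whole group is done
    return problems
```

Two tasks may touch the same file as long as they sit in different groups; only overlap inside one group is reported.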

TaskResult — Phase 3 output (per task)

{
    "task_id": "T1",
    "status": "completed",
    "changes": [{"path": "...", "change_type": "modified", "summary": "...", "lines_added": 25, "lines_removed": 10}],
    "notes": "...",
    "blockers": []
}

ReviewOutput — Phase 4 output

{
    "timestamp": "...",
    "verdict": "approved",
    "summary": "...",
    "issues": [{"severity": "minor", "file": "...", "line": 42, "description": "...", "suggested_fix": "..."}],
    "plan_drift_detected": false,
    "drift_details": null,
    "recommended_action": "merge"
}
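To show what enforcing this contract involves, here is a standard-library sketch that parses a raw review reply and rejects it if keys are missing or the action is unknown. The repository uses Pydantic for this; `parse_review` and the two constant names are hypothetical, and the key set is taken from the example above.

```python
import json

REQUIRED_KEYS = {"timestamp", "verdict", "summary", "issues",
                 "plan_drift_detected", "drift_details", "recommended_action"}
KNOWN_ACTIONS = {"merge", "fix_and_re_review", "re_plan", "escalate_to_human"}


def parse_review(raw: str) -> dict:
    """Parse a review agent reply and reject it if the contract is not met."""
    review = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(review, dict):
        raise ValueError("review must be a JSON object")
    missing = REQUIRED_KEYS - review.keys()
    if missing:
        raise ValueError(f"review is missing keys: {sorted(missing)}")
    if review["recommended_action"] not in KNOWN_ACTIONS:
        raise ValueError(f"unknown recommended_action: {review['recommended_action']!r}")
    return review
```

Under a check like this, the `"{}"` returned by the call_agent() stub would be rejected for missing keys, which is a quick way to notice that the stub is still in place.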

Key Design Principles

  1. Stateless orchestrator — The router reads shared memory (JSON files), decides the next phase, and dispatches via a lookup table. No domain logic leaks into the main loop.

  2. Structured handoffs — Every phase outputs well-defined JSON. Agents never pass free-text between each other.

  3. Immutable plan during execution — The plan file is locked read-only while implementers run. This is enforced at the filesystem level (chmod 0444), not just by convention.

  4. Parallel-safe implementation — The plan defines non-overlapping file scopes so multiple agents can implement simultaneously without conflicts.

  5. Rollback checkpoints — Git commits are created after the plan and implementation phases, making it easy to revert if something goes wrong.

  6. Right model for the job — Fast models (Haiku) for mechanical tasks, powerful models (Opus) for tasks requiring deep reasoning.


Shared Memory

All inter-agent state lives in the .agent_memory/ directory:

| File | Written By | Read By |
| --- | --- | --- |
| context.json | Context Agent | Planner, Implementers, Reviewer |
| plan.json | Planner Agent | Implementers, Reviewer |
| impl_report.json | Implementation Agents | Reviewer |
| review.json | Review Agent | Orchestrator (for next-action decision) |

The orchestrator never passes agent outputs directly — each phase reads from and writes to disk. This keeps context windows lean and makes debugging easy (just inspect the JSON files).
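Because all state is on disk, the next phase can be inferred from which files exist. The sketch below is a simplified stand-in for `determine_next_phase()`: it ignores context staleness and the review's recommended action, and the `PHASE_OUTPUTS` table is a hypothetical name.

```python
from pathlib import Path
from typing import Optional

# Phase name -> the file that phase writes, in workflow order.
PHASE_OUTPUTS = [
    ("context", "context.json"),
    ("plan", "plan.json"),
    ("implement", "impl_report.json"),
    ("review", "review.json"),
]


def determine_next_phase(memory_dir: Path) -> Optional[str]:
    """Return the first phase whose output file is missing, or None if all exist."""
    for phase, filename in PHASE_OUTPUTS:
        if not (memory_dir / filename).exists():
            return phase
    return None
```

This is also why the router can stay stateless: a crashed run resumes from the same place simply by being started again.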


Connecting to the Anthropic API

The call_agent() stub in orchestrator.py shows the structure. To go live:

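from anthropic import Anthropic

client = Anthropic()  # Reads ANTHROPIC_API_KEY from the environment

def call_agent(model, system_prompt, user_message, tools=None):
    """Send one user message to the given model tier and return its text reply."""
    # Only pass `tools` when some were supplied, instead of sending an empty list.
    kwargs = {"tools": tools} if tools else {}
    response = client.messages.create(
        model=model.value,  # `model` is a ModelTier member; .value is the model ID
        max_tokens=8192,
        system=system_prompt,
        messages=[{"role": "user", "content": user_message}],
        **kwargs,
    )
    # A reply can mix text and tool_use blocks; tool_use blocks have no .text,
    # so join the text blocks rather than assuming content[0] is text.
    return "".join(block.text for block in response.content if block.type == "text")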

Set your API key:

export ANTHROPIC_API_KEY="sk-ant-..."

Dispatch Options

The repo ships three orchestrator variants. Pick the one that fits your workflow.

Default — Synchronous Polling (orchestrator.py)

Sequential phase loop with a dispatch table. Simplest to debug.

python orchestrator.py --task "Refactor authentication module"

Option 1 — Event-Driven Dispatch (orchestrator_event_driven.py)

Each agent publishes a phase.done event; the orchestrator subscribes and triggers the next phase. Tasks within a parallel group run concurrently via asyncio.gather.

When to use: You want resilience or async execution, or you plan to swap in a real event bus (Redis Pub/Sub, SQS, NATS).

python orchestrator_event_driven.py --task "Add caching layer"
python orchestrator_event_driven.py --task "..." --reset

Key differences from the default:

| Feature | Default | Event-Driven |
| --- | --- | --- |
| Execution model | Synchronous while True loop | Async event subscriptions |
| Parallel impl tasks | Sequential (placeholder) | asyncio.gather per group |
| Failure handling | Exception propagates up | PhaseEvent.status = "failed" with error context |
| Extensibility | Add elif branches | Subscribe new handlers to phase.done |

The event bus is an in-process EventBus class. To go distributed, replace it with your broker of choice — the PhaseEvent dataclass stays the same.
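For readers who have not opened orchestrator_event_driven.py, an in-process bus of this kind can be very small. The sketch below is an illustration only: the `PhaseEvent` fields other than `status`, the `"completed"` default, and the `subscribe`/`publish` method names are assumptions and may not match the repository's classes.

```python
import asyncio
from collections import defaultdict
from dataclasses import dataclass
from typing import Awaitable, Callable, DefaultDict, List, Optional


@dataclass
class PhaseEvent:
    phase: str                   # hypothetical field
    status: str = "completed"    # hypothetical default; "failed" signals an error
    error: Optional[str] = None  # hypothetical field for error context


Handler = Callable[[PhaseEvent], Awaitable[None]]


class EventBus:
    """Minimal in-process publish/subscribe bus."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers[topic].append(handler)

    async def publish(self, topic: str, event: PhaseEvent) -> None:
        # Run every subscriber for this topic concurrently.
        await asyncio.gather(*(h(event) for h in self._handlers[topic]))
```

A handler subscribed to `phase.done` would inspect `event.phase` and `event.status` and start the next phase, so adding behavior means adding a subscriber rather than editing a central loop.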


Option 2 — Human-in-the-Loop Gates (orchestrator_hitl.py)

After the plan phase, the workflow pauses and surfaces a formatted plan summary for human approval. The reviewer can approve, request edits, or reject.

When to use: You want a safety net before burning tokens on implementation — especially useful for large or risky changes.

# Interactive — will prompt for approval
python orchestrator_hitl.py --task "Migrate database schema"

# Auto-approve (for CI or scripted runs)
python orchestrator_hitl.py --task "..." --auto-approve

# Reset and start fresh
python orchestrator_hitl.py --task "..." --reset

The approval gate shows:

  • Approach summary and risks
  • Every task with complexity badge, file scope, and acceptance criteria
  • Parallel execution order

Options at the gate:

| Key | Action |
| --- | --- |
| a | Approve: lock the plan and proceed to implementation |
| e | Edit: exit so you can modify plan.json by hand, then re-run |
| r | Reject: abort the workflow entirely |
| v | View: dump the full raw JSON for inspection |
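The gate logic amounts to a prompt loop over those four keys. A minimal sketch follows; `approval_gate`, `GATE_ACTIONS`, and the injectable `ask`/`show` callbacks are hypothetical and are not the `approve_plan()` function in orchestrator_hitl.py.

```python
from typing import Callable

# a, e, and r end the gate; v only displays the plan and asks again.
GATE_ACTIONS = {"a": "approve", "e": "edit", "r": "reject"}


def approval_gate(raw_plan_json: str,
                  ask: Callable[[str], str] = input,
                  show: Callable[[str], None] = print) -> str:
    """Prompt until the user approves, edits, or rejects; 'v' shows the plan first."""
    while True:
        key = ask("[a]pprove / [e]dit / [r]eject / [v]iew: ").strip().lower()
        if key == "v":
            show(raw_plan_json)
        elif key in GATE_ACTIONS:
            return GATE_ACTIONS[key]
        else:
            show(f"Unrecognized choice: {key!r}")
```

Passing `ask` and `show` as parameters keeps the gate testable and makes an auto-approve mode trivial: supply an `ask` that always returns `"a"`.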

Extending the Pattern

Adding a new phase

  1. Add the phase to the Phase enum in orchestrator.py
  2. Create a system prompt in prompts/
  3. Define the output schema in schemas.py
  4. Write a run_<phase>_phase() handler
  5. Add it to the PHASE_HANDLERS dispatch table
  6. Update determine_next_phase() with the new state transition
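The steps above can be sketched with a hypothetical `security_scan` phase. The enum values, the handler signature, and the `dispatch` helper are assumptions made for illustration; check orchestrator.py for the real shapes before copying this.

```python
from enum import Enum
from typing import Callable, Dict


class Phase(Enum):
    CONTEXT = "context"
    PLAN = "plan"
    IMPLEMENT = "implement"
    REVIEW = "review"
    SECURITY_SCAN = "security_scan"  # step 1: hypothetical new phase


def run_security_scan_phase(task: str) -> str:
    """Step 4: placeholder handler; a real one would call an agent and save JSON."""
    return f"security_scan finished for: {task}"


# Step 5: register the handler. Existing handlers are omitted in this sketch.
PHASE_HANDLERS: Dict[Phase, Callable[[str], str]] = {
    Phase.SECURITY_SCAN: run_security_scan_phase,
}


def dispatch(phase: Phase, task: str) -> str:
    """Look up the handler for a phase and run it."""
    return PHASE_HANDLERS[phase](task)
```

Steps 2, 3, and 6 (the prompt file, the output schema, and the state transition) are not shown here.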

Swapping the event bus (Option 1)

Replace the EventBus class in orchestrator_event_driven.py with your broker client. The PhaseEvent dataclass is the only contract — publish it as JSON to any topic/queue.
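Publishing the event as JSON can be as simple as a pair of helpers around `dataclasses.asdict`. The `PhaseEvent` fields below are assumed for illustration, and `to_message`/`from_message` are hypothetical names; the broker client call itself is left out because it depends on the broker you choose.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class PhaseEvent:
    phase: str                   # hypothetical field
    status: str                  # e.g. "failed" when a phase errors
    error: Optional[str] = None  # hypothetical field for error context


def to_message(event: PhaseEvent) -> bytes:
    """Serialize an event to UTF-8 JSON bytes, ready to hand to a broker client."""
    return json.dumps(asdict(event)).encode("utf-8")


def from_message(payload: bytes) -> PhaseEvent:
    """Rebuild the event on the consumer side."""
    return PhaseEvent(**json.loads(payload.decode("utf-8")))
```

Keeping the dataclass flat (strings and None only) is what makes this round trip lossless; nested objects would need their own conversion.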

Customizing the approval gate (Option 2)

Edit _format_plan_summary() in orchestrator_hitl.py to change what the human sees. Add new options to approve_plan() if you want finer-grained control (e.g., approve individual tasks).


License

MIT
