Cross-installation agent peering with built-in cost limits, circuit breaker, signed envelopes, and (as of alpha.14) opt-in WireGuard mesh layer governed by federation trust.
This guide walks through what federation is, when to use it, and how to set it up. For the architectural backstory and per-phase release notes see the companion gist.
Federation lets two or more Ruflo installations — your mac, a server, a teammate's laptop — discover each other, exchange signed manifests, and send messages between them with bounded cost and per-peer trust gates. Key properties:
- Ed25519 identity — each node holds a private key; peers exchange Ed25519-signed manifests. No central directory.
- Five-level trust ladder —
UNTRUSTED → VERIFIED → ATTESTED → TRUSTED → PRIVILEGED. Each level unlocks a wider set of operations (discovery,send,share-context,remote-spawn, …). - Per-peer budget + circuit breaker — bounded tokens/USD per peer. Sustained failures auto-SUSPEND; further failures EVICT. ADR-097.
- PII pipeline + audit trail — every cross-peer envelope passes through PII detection. Every state transition is auditable.
- Real wire transport — WSS with permessage-deflate compression, optional cert pinning, stream multiplexing. ADR-104.
- Optional WG mesh layer — opt-in opaque packet-layer reachability that follows federation trust changes. Compromised peer auto-isolated at L3 when the breaker fires. ADR-111.
| Use case | Fit |
|---|---|
| Two laptops collaborating on a project, want bounded cost sharing + audit | ✅ |
| Personal home server agent ↔ travel laptop | ✅ |
| Team of 5 engineers sharing memory/skills across machines | ✅ |
| Mobile / Windows / sub-50 peers with NAT issues | ✅ over Tailscale + federation |
| Public-internet exposed agent endpoints | ✅ with TLS cert pinning (ADR-107) |
| Internal HR/finance multi-agent workflow with strict access tiers | ✅ with PRIVILEGED gating + audit |
| Replacing Slack/Discord — no | ❌ federation is for agent-to-agent, not human chat |
| Untrusted-internet messaging without identity vetting | ❌ — trust ladder must be bootstrapped out-of-band |
npx ruflo@latest # if you don't have it yet
npx ruflo plugins install @claude-flow/plugin-agent-federationOr directly via npm:
npm i @claude-flow/plugin-agent-federation@latest # currently 1.0.0-alpha.14npx claude-flow@v3alpha agent spawn -t federation --name fed-1Or via the MCP tool federation_init:
{
"tool": "federation_init",
"params": {
"nodeId": "my-mac",
"endpoint": "ws://my-mac.tailnet:9100",
"agentTypes": ["coder", "tester"]
}
}This generates an Ed25519 keypair (persisted to .claude-flow/federation/keys-<nodeId>.json, mode 0600), publishes a signed manifest, and starts the discovery service.
{
"tool": "federation_join",
"params": {
"endpoint": "ws://other-host.tailnet:9100"
}
}The handshake exchanges manifests, verifies Ed25519 signatures, and establishes the initial trust level (UNTRUSTED until the trust-evaluator records enough successful interactions to promote).
{
"tool": "federation_send",
"params": {
"targetNodeId": "other-host",
"messageType": "agent-handoff",
"payload": { "task": "Investigate the failing integration test" }
}
}Budget and trust gates apply: if the peer is below ATTESTED, only discovery/status/ping go through. If the breaker has the peer SUSPENDED, the send short-circuits with PEER_SUSPENDED.
| Tool | What it does | Trust gate |
|---|---|---|
federation_init |
Initialize this node | — |
federation_join |
Join a peer by endpoint | — |
federation_peers |
List discovered peers | — |
federation_send |
Send a typed message to a peer | per-peer (varies) |
federation_query |
Synchronous query → response | ATTESTED+ |
federation_status |
Current node + peer trust summary | — |
federation_trust |
View / adjust trust levels | operator |
federation_audit |
Read audit log | operator |
federation_breaker_status |
Per-peer state, when changed, why | — |
federation_evict |
Operator manual evict | operator |
federation_reactivate |
Operator manual reactivate | operator |
federation_report_spend |
Report cost of a completed call | integrator |
federation_consensus |
Federated proposal across peers | varies |
federation_wg_status |
(ADR-111) Per-peer mesh state | — |
federation_wg_attest |
(ADR-111) Operator-signed witness entry | operator |
federation_wg_keyrotate |
(ADR-111) Rotate WG keypair | operator + confirm:true |
| Level | Capabilities (federation) | WG reachability (if ADR-111 active) |
|---|---|---|
UNTRUSTED |
discovery |
Excluded from mesh — drop all |
VERIFIED |
+ status, ping |
Discovery port (9100) only |
ATTESTED |
+ send, receive, query-redacted |
+ federation messaging (9101-9199) |
TRUSTED |
+ share-context, collaborative-task |
+ ssh (22), services (80/443) |
PRIVILEGED |
+ full-memory, remote-spawn |
Full mesh |
Trust is earned via repeated successful interactions (the TrustEvaluator tracks score + interaction count). Promotion thresholds are documented in domain/entities/trust-level.ts.
If a peer's failure ratio or cost spend exceeds the policy:
ACTIVE → SUSPENDED → EVICTED
↑ │
└─────────operator-only─────────┘
- SUSPEND: peer's outbound sends short-circuit (
PEER_SUSPENDED). Existing sessions continue; new sends rejected. Auto-eviction on continued failures. - EVICT: peer's outbound sends short-circuit (
PEER_EVICTED). Session terminated. Operator must explicitly reactivate. - The breaker does not auto-reactivate. The integrator's health probe is responsible for confirming the peer is healthy and calling
federation_reactivate.
Policy is tunable; defaults are conservative (50+ samples needed before suspend, high failure ratio threshold).
Federation today treats network connectivity as the integrator's problem (Tailscale, LAN, wss://+pinning). That works but trust changes don't propagate to the L3 layer — a peer EVICTED in federation stays in the tailnet until an admin manually removes it.
ADR-111 closes that gap with an optional in-tree WG mesh:
- WG keypair generated alongside the federation key
- Mesh IP derived deterministically from
nodeId(sha256 →10.50.0.0/16host portion, with collision-handling probe loop) - WG identity published inside the same Ed25519-signed manifest
- Federation breaker SUSPEND →
wg set ... allowed-ips ""(soft-block at L3) - Federation breaker EVICT →
wg set ... remove(terminal) - Operator reactivate → AllowedIPs restored
- Each mutation entered into an append-only Ed25519-signed witness chain
Phases shipped in alpha.14:
- Phase 1 — Manifest extension + key generation
- Phase 2 —
WgMeshService(no shell — emits configs + commands) - Phase 3 — Coordinator/breaker wiring
- Phase 4 — Firewall projection (
nftables/pf) — PR #1895 - Phase 5 — Witness attestation chain — PR #1895
- Phase 6 — Operator MCP tools — PR #1895
Phase 7 — operator-mediated:
See docs/federation/phase7-mesh-bringup.md for the cross-OS bringup procedure (mac ↔ ruvultra over Tailscale).
Set config.wgMesh: true in your federation plugin config, then run the staging helper:
node v3/@claude-flow/plugin-agent-federation/scripts/phase7-stage.mjs \
<localNodeId> <peerNodeId> <peerPubkey> <peerMeshIP> <peerEndpoint>The script generates /tmp/adr-111-stage/:
wg-key-<nodeId>.json(mode 0600 — your private WG key)ruflo-fed.conf(the wg-quick interface config)ruflo-fed.nftorruflo-fed.pf(firewall projection)
After review, the operator manually activates:
sudo install -m 0600 /tmp/adr-111-stage/ruflo-fed.conf /etc/wireguard/ruflo-fed.conf
# Linux:
sudo nft -f /tmp/adr-111-stage/ruflo-fed.nft
# macOS:
sudo pfctl -a ruflo-fed -f /tmp/adr-111-stage/ruflo-fed.pf
sudo wg-quick up ruflo-fedclaude -p (print/pipe mode) can drive federation MCP tools non-interactively. Each invocation processes its prompt and exits — for a persistent federation listener, run a long-lived MCP server instead.
On the originating host (mac mini):
claude -p --model haiku --max-budget-usd 0.20 --output-format text \
"Federation MCP tools to verify cross-machine peering health: name 3. One line."
# → Three Federation MCP tools for peering health verification:
# federation-init (keypair generation), federation-status (peers/trust/metrics),
# federation-audit (compliance filtering).On the peer host (ruvultra) — requires interactive /login first since claude -p reads stored credentials:
# First time only — operator-mediated:
claude # then /login
# After that:
claude -p --model haiku --max-budget-usd 0.20 \
"Run federation_status MCP and report peer count."Workflow for cross-machine task handoff:
# Host A — kick off a federated task:
claude -p --model sonnet --output-format json --resume <session> \
"Use federation_send to dispatch this task to ruvultra: analyze the failing test"
# Host B (ruvultra) — receive + work:
claude -p --resume <session> "Continue handling the federated task"The federation plugin handles signing, PII gating, breaker, and audit on every send. The two claude -p invocations don't share state directly — they communicate exclusively through the federation envelope channel.
- Replacement for Slack/Discord. Federation moves agent envelopes, not human chat.
- Public internet without identity vetting. Trust ladder bootstrapping is your responsibility.
- NAT traversal magic. Use Tailscale (or Headscale) for that; federation rides on top.
- A general-purpose RPC framework. It's specifically for AI agents with cost-aware budgeting.
| What | Path |
|---|---|
| Plugin source | v3/@claude-flow/plugin-agent-federation/src/ |
| Tests | v3/@claude-flow/plugin-agent-federation/__tests__/ |
| ADRs | v3/docs/adr/ADR-{097,104,105,106,107,109,110,111}-*.md |
| Phase 7 staging script | v3/@claude-flow/plugin-agent-federation/scripts/phase7-stage.mjs |
| Witness signing | plugins/ruflo-core/scripts/witness/ |
| Version | What landed |
|---|---|
1.0.0-alpha.9 |
First user-visible release — see announcement gist |
1.0.0-alpha.10 |
ADR-097 Phases 2.a-4 + ADR-104 transport + ADR-109 inbound dispatcher |
1.0.0-alpha.11-12 |
ADR-109 sig verify, ADR-104 compression, ADR-107 TLS cert pinning |
1.0.0-alpha.13 |
ADR-104 stream multiplexing + ADR-110 MemorySpendReporter |
1.0.0-alpha.14 |
ADR-111 Phases 1-3 (WG mesh foundation) |
1.0.0-alpha.15 (in flight) |
ADR-111 Phases 4-6 (firewall + witness + MCP tools) — PR #1895 |
- ADR-097 — budget + circuit breaker
- ADR-104 — WSS transport + multiplexing
- ADR-105 — state snapshot/replay
- ADR-106 — discovery mechanisms
- ADR-107 — TLS + cert pinning
- ADR-109 — inbound dispatch + sig verify
- ADR-110 — production SpendReporter
- ADR-111 — WG mesh layer
- Issues: https://github.com/ruvnet/ruflo/issues
- Tracking issue (ADR-111): #1879
- Federation gist (current through alpha.14): https://gist.github.com/ruvnet/3b5111a2ea7e450ff262ce96e88560bf
- ADR-111 deep-dive gist: https://gist.github.com/ruvnet/c640fc71c7a6ced37908e645d5db84c5