
feat(pilotctl): agent-first CLI overhaul — bounded output, filters, styling#247

Merged
TeoSlayer merged 6 commits into main from feat/pilotctl-agent-first-cli
Jun 16, 2026
Conversation

@TeoSlayer

Owner

Summary

  • Agents can finally use the CLI without shell-grubbing. pilotctl --json inbox went from a 23 MB all-messages dump to a bounded 3 KB reply; --latest replaces the jq -r '.data' "$(ls -1t ~/.pilot/inbox/*.json | head -1)" workaround the SKILL.md used to teach.
  • Bounded + filterable everywhere: inbox and received get --limit/--since/--latest/read <id>/--clear --before; trust gets --limit/--search (newest-first, one-way trust flagged); peers shows a summary + exceptions-only view (immediately surfaced 3 unencrypted peers invisible in the old 678-row noise table), with colorized --all.
  • Real bug fixed: daemon status no longer reports stopped (pid 0) from a stale PID file while printing live socket data — the socket is now the source of truth, with a dim "pid file stale" note.
  • Faster failure + hints: ping now defaults to a 5s timeout (was 30s; total failure ~10s instead of ~40s) and prints a relay-convergence hint on failure; handshake/approve/untrust outcome messages carry the next command to run.
  • Silent waits animated: send-message --wait, ping, bench, traceroute show a self-erasing waiting… Ns line on stderr — TTY-only, hard no-op for pipes and --json, race-detector clean.
  • New style layer (cmd/pilotctl/style.go): semantic ANSI (statusDot, bold/dim/accent) gated on TTY + NO_COLOR/PILOT_NO_COLOR/TERM=dumb. Piped output stays grep-safe; --json is byte-stable except documented additions ("to" resolved-address on send-message/ping, "total"/"shown" counters).
  • Plus: config prints aligned key-value pairs (no more raw JSON shown to humans), skills status shows per-tool status dots with --verbose detail, updates wraps at word boundaries, network list drops the always-empty MEMBERS column and adds an "admin only" footnote, and context <command> returns one command's spec (18 KB → ~440 B).

Test plan

  • go test ./cmd/pilotctl/ -count=1 green (presentation assertions updated to new contract; all plain-text, no ANSI — tests pipe stdout so color is off)
  • New tests: wait-progress lifecycle (incl. -race), received --limit/--since/--clear --before, context <command> found/not-found, send-message/ping JSON "to" field
  • gofmt + go vet clean; pre-commit hooks pass
  • Live-verified against a running daemon (678 peers): inbox/peers/trust/info/daemon-status/skills-status/config/updates/network-list rendering via pseudo-TTY, JSON shapes via pipe
  • Reviewer: eyeball script -q /dev/null pilotctl <cmd> rendering on your terminal theme

🤖 Generated with Claude Code

@TeoSlayer TeoSlayer requested a review from Alexgodoroja as a code owner June 11, 2026 10:13
teovl and others added 6 commits June 16, 2026 11:25
…tyling

Inbox was unusable for agents (23 MB --json dumps, oldest-first, 80-char
mid-token truncation) and several commands shared the same disease. This
makes every high-traffic command bounded, filterable, non-interactive,
and visually scannable, without breaking --json consumers.

- inbox: newest-first, default --limit 10, --latest/--from/--since/
  --full/read <id>/--clear --before; --json bounded (23 MB -> 3 KB)
- received: same flag surface ported (mtime-ordered; sender metadata
  unavailable in dataexchange filenames)
- peers: summary + exceptions-only view (surfaces unencrypted peers),
  colorized --all, --limit/--search
- trust: newest-first, --limit 20, --search, one-way trust flagged
- daemon status: fix contradiction (stale PID file reported "stopped"
  while live socket data printed); socket is now the source of truth
- info: grouped identity/network/traffic/skills layout
- ping: 5s default timeout (was 30s), relay-convergence hint on failure
- send-message/ping --json: add "to" resolved-address field
- send-message --wait, ping, bench, traceroute: animated elapsed line on
  stderr (TTY-only, erased on completion, no-op for pipes/--json)
- config/skills status/updates/network list: aligned key-value, per-tool
  status dots, word-boundary wrap, dead MEMBERS column dropped
- handshake/approve/untrust: next-step hints
- context <command>: single-command spec (18 KB -> ~440 B)

New style layer (cmd/pilotctl/style.go): semantic ANSI helpers gated on
TTY + NO_COLOR/PILOT_NO_COLOR/TERM=dumb; tests pipe stdout so assertions
stay plain-text.
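
The gating rule for the style layer can be sketched as a pure function. This is an illustration under assumptions, not the contents of style.go: `colorEnabled`, `bold`, and `stdoutIsTTY` are hypothetical names, PILOT_NO_COLOR is assumed to follow the same "any non-empty value disables color" convention as NO_COLOR, and the character-device check is a stdlib-only approximation of TTY detection.

```go
package main

import (
	"fmt"
	"os"
)

// colorEnabled reports whether ANSI styling should be emitted.
// getenv is injected so the rule can be tested without touching the real environment.
func colorEnabled(isTTY bool, getenv func(string) string) bool {
	if !isTTY {
		return false // pipes and files always get plain text
	}
	if getenv("NO_COLOR") != "" || getenv("PILOT_NO_COLOR") != "" {
		return false
	}
	return getenv("TERM") != "dumb"
}

// bold wraps s in the ANSI bold sequence only when styling is enabled.
func bold(s string, enabled bool) string {
	if !enabled {
		return s
	}
	return "\x1b[1m" + s + "\x1b[0m"
}

// stdoutIsTTY approximates TTY detection with the standard library only.
// It also returns true for other character devices such as /dev/null.
func stdoutIsTTY() bool {
	fi, err := os.Stdout.Stat()
	if err != nil {
		return false
	}
	return fi.Mode()&os.ModeCharDevice != 0
}

func main() {
	on := colorEnabled(stdoutIsTTY(), os.Getenv)
	fmt.Println(bold("peers", on))
}
```

Because the helpers return the input unchanged when styling is off, tests that capture stdout through a pipe can assert on plain strings.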

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

received gained --limit, --since, --clear --before in the CLI overhaul.
The cli-reference-check gate caught the stale summary line.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Fixes the "send-file hangs ~120s then EOFs" report originally filed in
BUG-updater-version-skew.md. Two distinct things ship here:

1. cmdSendFile reliability (M0 of the reliable-file-transfer proposal):
   - --timeout flag (default 90s) bounds the ACK wait. Was unbounded
     before, hitting SO_KEEPALIVE around 120s with an opaque EOF.
     On expiry we close the conn (unblocks the read goroutine), then
     surface a clear hint pointing at pilotctl ping <peer>.
   - Progress line on stderr via startWaitProgress, gated on TTY +
     not --json so agents don't see control chars.
   - Result JSON now carries elapsed_ms and throughput_mbps.
   - Receiver "ERR …" ACK already errored; tightened the hint.
   - parseFlags split for --timeout: positional args come from `pos`.

2. Docs:
   - BUG-updater-version-skew.md rewritten end-to-end. The RSS-stale
     mechanism the original claimed is wrong (updater hits the GitHub
     API directly per updater.go:247); the real cause is that
     pilot-updater ships but is never started (no launchd/systemd unit,
     not embedded in daemon). Also documents the missing pilot-gateway
     binary in v1.11.0 (confirmed by `tar tzf`).
   - PROPOSAL-reliable-file-transfer.md is the new home for the
     real fix — TypeFileStream wire type, INIT/CHUNK/ACK/DONE/ABORT/RESUME
     state machine, sliding-window backpressure, end-to-end SHA-256,
     resume protocol, backward-compat fallback to TypeFile. Six
     milestones; M0 ships in this commit.

No wire-format change. No new deps. Full pilotctl suite green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

When two peers are both behind NAT (e.g. Mac home-NAT ↔ GCP VM stateful
conntrack), the direct PILA key-exchange frame never lands, and the
tunnel only reconverges after slow blackhole detection flips the peer to
relay mode — measured 28s–3min on the canonical Mac↔VM rig, far longer
than the dial/send timeouts, so send-file/send-message time out and the
crypto state desyncs.

sendKeyExchangeToNode now ALSO pushes the key-exchange via the beacon
relay whenever the peer is not yet relay-flagged and a beacon is
available. The relay copy converges in ~1 RTT. It is a no-op once the
peer is relay-flagged (the primary send already went via relay), and
relayProbeLoop keeps probing direct so a genuine direct path still
upgrades the peer out of relay. Best-effort: a failed relay copy falls
back to the existing slow path.

Adds routing.SendRelayFrame (forced-relay send primitive, ignores the
per-peer relay flag and blackhole heuristic) and the ClearRekeyGaveUp /
ClearLastRekeyReq rekey-state shims.

Verified on the canonical Mac↔VM dual-NAT rig:
- G2 liveness: idle 5min (and 90min) then small msg arrives, no reset.
- Small msg ACK in ~0.42s (was 28s–3min).
- 64KB send-file byte-perfect (sha256 match), incl. from a cold daemon
  restart (fresh in-memory peer table) — tunnel re-converges in ~12s.
- No regressions: 0 panics, 0 relay-copy failures on either end.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… prefer-direct

Make P2P transfers actually go direct (and stay direct) across NAT, and
ship large files reliably.

- daemon.go: rewrite relayProbeLoop → tryDirectUpgrade. The old loop sent
  a one-way SendDirectProbe every 5 min, which a stateful NAT/firewall
  always drops (no conntrack pinhole). Now it fires a beacon-coordinated
  RequestHolePunch to open the pinhole on both NATs, then pushes encrypted
  probes at the peer's REAL address so the peer's ClearRelayOnDirect
  promotes the path. Unpins blackhole-pinned (non-relay-only) peers,
  resolves fresh when uncached, and runs every 15 s (was 5 min).
- tunnel.go: add SendDirectProbeTo — encrypted probe to an explicit real
  address (the upgrade primitive; the stored peers[] entry for a relay
  peer is the beacon placeholder).
- ipc.go: handlePreferDirect — drop tunnel + cached resolution so the next
  dial re-runs resolve + punch; unpin relay.
- pilotctl/main.go: send-file streams by default (TypeFileStream) and
  falls back to single-frame TypeFile when the peer never sends an
  INIT-ACK (back-compat); --no-stream forces legacy; reports
  transport/sha256/throughput. Adds `prefer-direct` command +
  --prefer-direct/--timeout flags.

Verified on Mac↔GCP-VM (true dual-NAT) and a fresh throwaway VM: tunnel
goes relay=False via hole-punch in ~8 s through a default-deny firewall,
holds direct through a 50 MB transfer (no flip), byte-perfect sha256,
~7-15× the relay throughput. Survives a cold restart of both ends.

go.mod points common/dataexchange at branch commits (pseudo-versions)
pending their tagged releases; the version-bump to proper tags happens at
release time (v1.11.1), which is intentionally held for review.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…iew)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@TeoSlayer TeoSlayer force-pushed the feat/pilotctl-agent-first-cli branch from 073da91 to 3909733 Compare June 16, 2026 08:27
@TeoSlayer TeoSlayer merged commit e4e1a5b into main Jun 16, 2026
9 checks passed
@TeoSlayer TeoSlayer deleted the feat/pilotctl-agent-first-cli branch June 16, 2026 08:34