Skip to content

Make the monitor mode and delivery.sh set work under Claude Code's sandboxed Bash tool#106

Merged
fujibee merged 3 commits into
fujibee:mainfrom
tatsuya6502:cc-sandbox
Jun 14, 2026
Merged

Make the monitor mode and delivery.sh set work under Claude Code's sandboxed Bash tool#106
fujibee merged 3 commits into
fujibee:mainfrom
tatsuya6502:cc-sandbox

Conversation

@tatsuya6502

Copy link
Copy Markdown
Contributor

Thank you for creating agmsg! This PR fixes two issues that arise when using it with Claude Code's sandbox enabled.

Note that I used LLMs to investigate and fix these issues, as well as writing this PR description. The models I used were the followings:

  • GLM-5.1 and GLM-5.2 from Z.ai (for investigation and fixing)
  • Claude Opus 4.8 from Anthropic (for testing on a Linux VM)

I used Claude Code as the harness.

All code changes were reviewed and tested by me.

Summary

When Claude Code (CC) runs its Bash tool with the sandbox enabled (macOS sandbox-exec / Seatbelt; Linux seccomp-bpf), two categories of agmsg operations break:

  • monitor deliverywatch.sh (launched via the Monitor tool) runs inside the sandbox and can't resolve its DB path, validate a prior watcher, or (without a one-line user config) write its pidfile.
  • delivery.sh set <mode> — invoked from a sandboxed Bash tool (e.g. via /agmsg), it aborts before writing settings.local.json.

This PR fixes both, and documents the single required user-side setting.

Why it breaks

Context Sandboxed? Scripts that run here
Hooks (SessionStart / Stop / SessionEnd) No (outside) session-start.sh, session-end.sh, check-inbox.sh
Monitor tool (persistent bash) Yes (inside) watch.sh
Bash tool (ad-hoc, incl. /agmsg) Yes (inside) delivery.sh set …, agmsg send …

The sandbox allows reads anywhere, writes only inside an allowlist, and blocks ps. (It also blocks kill -0 for arbitrary PIDs — see Known limitations.) Only the "inside sandbox" rows are affected.

Changes

1. scripts/lib/storage.sh — DB path when BASH_SOURCE[0] is empty

CC's Bash tool runs commands via pipe/eval, so BASH_SOURCE[0] is empty inside sourced functions. agmsg_storage_dir() used it to derive the DB path and fell back to CWD, resolving messages.db under the project dir instead of under the skill.

Fix: add a SKILL_DIR env-var fallback. Every script that sources storage.sh already resolves SKILL_DIR from $0 (which is populated when the script is invoked as a command), so this needs no caller changes. Resolution order is now AGMSG_STORAGE_PATHBASH_SOURCE[0]SKILL_DIR → error.

2. scripts/watch.sh — graceful ps fallback

watch.sh validates a prior watcher's cmdline via ps -o args= before killing it (pid-recycling defense, #66). Under the sandbox ps is blocked, which crashed the branch. Now it treats empty ps output as "ps unavailable" and proceeds without the cmdline check.

⚠️ Caveat (see Known limitations): the gate guarding this block is kill -0 "$prev_pid", which is also blocked under the sandbox — so in practice the whole dedup is skipped and a superseded watcher can be orphaned (minor). Documented, not fixed here.

3. scripts/delivery.sh + scripts/release/sync-version.sh — anchor mktemp to $TMPDIR

delivery.sh set … runs under set -euo pipefail. Its settings-injection helpers (strip_agmsg_event_file, add_event_entry_file, prune_empty_hooks_file, apply_settings) each call bare mktemp. On macOS, BSD mktemp with no template resolves the temp dir via confstr(_CS_DARWIN_USER_TEMP_DIR) to /var/folders/.../T/ and ignores the $TMPDIR env var. CC's sandbox blocks that path, so:

$ mktemp
mktemp: mkstemp failed on /var/folders/1j/…/T/tmp.IoANuLvaKm: Operation not permitted

That aborts the entire set before it ever reaches the settings.local.json write. (The hooks file itself is innocent — it lives inside the project and is writable.)

Fix: anchor every mktemp to the sandbox temp dir. GNU mktemp (Linux) already honors $TMPDIR, so the explicit form is portable and strictly safer there too.

tmp=$(mktemp)                                    # before
tmp=$(mktemp "${TMPDIR:-/tmp}/agmsg.XXXXXX")     # after

4. README.md + SKILL.md — document the required allowWrite setting

Monitor mode needs to write pidfiles and SQLite WAL/SHM under ~/.agents/skills/agmsg/, which is outside the project dir. Users with the sandbox enabled must allowlist it. Both docs now describe this and the BASH_SOURCE handling.

User-facing requirement

Users with Claude Code's sandbox enabled need to add (any settings scope):

{
  "sandbox": {
    "filesystem": {
      "allowWrite": ["~/.agents/skills/agmsg/"]
    }
  }
}

Future improvement (not in this PR): delivery.sh set monitor could auto-inject this into .claude/settings.local.json so no manual config is needed.

Testing

Platform Sandbox Result
macOS (Apple Silicon) sandbox-exec (Seatbelt) monitor delivery end-to-end; delivery.sh set monitor|turn|off all write settings.local.json
Linux VM seccomp-bpf monitor delivery verified end-to-end

Known limitations

There are two minor issues remaining under the sandbox.

  • delivery.sh status misreports live watchers as dead. do_status classifies pidfiles with kill -0 "$pid", which the sandbox blocks for any pid ≠ $$. A genuinely-running watcher is reported 0 alive, 1 stale. Same mechanism as the ps block; reproduces on macOS and Linux.
  • watch.sh prior-watcher dedup is skipped under the sandbox (the kill -0 gate in change Hook UI message semantics (Claude Code and Codex) #2), so a superseded watch.sh can be orphaned until it exits on its own. Minor; the new watcher still takes over via the pidfile.

Both stem from kill -0/ps being unavailable in-sandbox. The right fix is a design choice — a heartbeat file (watch.sh touches run/watch.<sid>.heartbeat each poll cycle; liveness checks read its mtime) is the leading candidate, vs. degrading gracefully ("N pidfiles present, liveness unverifiable"). Left for a follow-up PR because it's a behavior decision best made by the maintainer. Happy to open a separate PR for it.

tatsuya6502 and others added 3 commits June 12, 2026 08:07
When Claude Code's sandbox is enabled, monitor mode fails because:

- Filesystem writes are restricted to the project directory, blocking
  pidfile and SQLite WAL writes under ~/.agents/skills/agmsg/
- BASH_SOURCE[0] is empty inside sourced functions (pipe/eval context),
  breaking storage.sh path resolution
- ps is blocked, causing watch.sh to crash on cmdline validation

Fixes:
- storage.sh: fall back to SKILL_DIR (set by caller from $0) when
  BASH_SOURCE[0] is empty
- watch.sh: handle empty ps output gracefully, skip cmdline validation
  when ps is unavailable
- README.md + SKILL.md: document allowWrite sandbox config for users
The settings-injection helpers in delivery.sh (strip_agmsg_event_file,
add_event_entry_file, prune_empty_hooks_file, apply_settings) and the
release helper release/sync-version.sh all call bare `mktemp`. On macOS,
BSD mktemp with no template resolves the temp dir via
confstr(_CS_DARWIN_USER_TEMP_DIR) to /var/folders/.../T/ and ignores the
$TMPDIR env var. Under Claude Code's sandbox that path is not writable, so
mktemp fails with "Operation not permitted"; combined with `set -euo pipefail`,
this aborted `delivery.sh set` before it ever wrote settings.local.json.

Anchor every mktemp to ${TMPDIR:-/tmp} so the temp file lands in the
sandbox-writable temp dir. GNU mktemp (Linux) already honors $TMPDIR, so the
explicit form is portable and strictly safer there too.

Verified under the macOS sandbox: `delivery.sh set monitor|turn|off` now
complete and write settings.local.json.
…d template

The sandbox compatibility section added in 0d9b2a3 only landed in the
repo-root SKILL.md (the marketplace-plugin skill source). install.sh
generates the installed Claude Code command from templates/cmd.claude-code.md
and never reads root SKILL.md, so ./install.sh users never saw the
allowWrite guidance.

Port the section into cmd.claude-code.md, adapting paths to the
__SKILL_NAME__ placeholder so custom command names resolve correctly.
@fujibee

fujibee commented Jun 14, 2026

Copy link
Copy Markdown
Owner

Thanks for the thorough writeup and fix, @tatsuya6502 — the sandbox breakdown (hooks run outside, Monitor/Bash-tool run inside) is accurate, and the storage.sh SKILL_DIR fallback plus the mktemp$TMPDIR anchoring address real failures. Merging this.

One trivial, non-blocking follow-up: in scripts/lib/storage.sh the header comment lists the resolution order as AGMSG_STORAGE_PATH → SKILL_DIR → BASH_SOURCE, but the code checks BASH_SOURCE[0] first and only falls back to SKILL_DIR (if BASH_SOURCE … elif SKILL_DIR). Worth aligning the comment's 2/3 ordering with the code so the documented order isn't inverted.

Heads-up that we tend to merge contributions quickly, so follow-ups are welcome as separate PRs. Thanks again!

@fujibee fujibee merged commit f4a721e into fujibee:main Jun 14, 2026
fujibee added a commit that referenced this pull request Jun 14, 2026
…114 review)

Addresses aggie-co's review: busy_timeout fixes concurrent writes to an
existing DB, but a leader fanning out as the FIRST write to a fresh/override
store (the #106 sandbox case) still dropped — init-db.sh raced its own
initialization.

- init-db.sh: run idempotently and unconditionally (drop the file-existence
  guard) with CREATE TABLE/INDEX IF NOT EXISTS, so concurrent initializers
  no-op instead of aborting on "already exists". Set journal_mode=WAL
  best-effort (the mode change wants exclusive access and can hit "database is
  locked" on a brand-new DB even with a busy_timeout; it's an optimization, and
  whichever initializer wins makes it stick).
- send.sh: retry the INSERT once after re-running init when the first attempt
  fails — covers the window where a process sees the DB file before the winning
  initializer has committed the schema ("no such table").

Test: a 10-way concurrent fan-out to a FRESH (uninitialized) override store all
lands (was 0/10, then 8-9/10 with IF NOT EXISTS alone, now 10/10). Full suite
green (236).
fujibee added a commit that referenced this pull request Jun 15, 2026
…no longer drop (SQLITE_BUSY) (#115)

* fix(storage): busy_timeout on all DB connections to survive concurrent writes (#114)

send.sh (and every other sqlite3 caller) opened the DB with the default
busy_timeout=0, so concurrent writers failed immediately with SQLITE_BUSY
instead of waiting. A leader fanning a job out to N members hit this every
time — only one write landed, the rest errored, and since send.sh just exits
non-zero the dropped messages were silently lost (they never reached the DB).
WAL allows readers + one writer but does not let writers run concurrently.

Add agmsg_sqlite() to lib/storage.sh — sqlite3 with a `.timeout` so a writer
waits for the lock instead of failing — and route every DB-backed call site
(send/inbox/check-inbox/history/rename/rename-team/watch/init-db) through it.
In-memory JSON parsing (sqlite3 :memory:) is untouched (no file lock).

Uses the `.timeout` dot-command, not `PRAGMA busy_timeout=N`: the PRAGMA
returns its value as a row that sqlite3 would print, corrupting every SELECT
(and the watch stream). `.timeout` sets the same timeout silently.

Tests: a 10-way concurrent send fan-out all lands (no SQLITE_BUSY); agmsg_sqlite
output is not polluted. Full suite green (235).

* fix(storage): make first-write init race-safe for concurrent fan-out (#114 review)

Addresses aggie-co's review: busy_timeout fixes concurrent writes to an
existing DB, but a leader fanning out as the FIRST write to a fresh/override
store (the #106 sandbox case) still dropped — init-db.sh raced its own
initialization.

- init-db.sh: run idempotently and unconditionally (drop the file-existence
  guard) with CREATE TABLE/INDEX IF NOT EXISTS, so concurrent initializers
  no-op instead of aborting on "already exists". Set journal_mode=WAL
  best-effort (the mode change wants exclusive access and can hit "database is
  locked" on a brand-new DB even with a busy_timeout; it's an optimization, and
  whichever initializer wins makes it stick).
- send.sh: retry the INSERT once after re-running init when the first attempt
  fails — covers the window where a process sees the DB file before the winning
  initializer has committed the schema ("no such table").

Test: a 10-way concurrent fan-out to a FRESH (uninitialized) override store all
lands (was 0/10, then 8-9/10 with IF NOT EXISTS alone, now 10/10). Full suite
green (236).
fujibee added a commit that referenced this pull request Jun 15, 2026
- Direct script section now shows the `git clone` + `cd` it assumed, and notes
  it's the path that tracks latest main.
- Community: credit @lucianlamp (native Windows PowerShell helpers, #103) and
  @tatsuya6502 (sandboxed Bash tool support, #106) — merged but uncredited.
fujibee added a commit that referenced this pull request Jun 15, 2026
* docs(readme): document npm and Claude Code plugin install paths

Adds two install paths the README didn't cover after the PH-launch
rework:

- **npm / npx** — published since #89 via npm Trusted Publisher (OIDC)
  with SLSA provenance. `npx agmsg` is the lowest-friction path for
  Node-having users.
- **Claude Code plugin marketplace** — `/plugin marketplace add fujibee/agmsg`
  + `/plugin install agmsg@fujibee-agmsg` + `/reload-plugins` + `/agmsg`.
  Verified end-to-end against a fresh Debian-based Claude Code container
  today: the in-CC slash command flow runs the SKILL.md Step 0 bootstrap
  (added in #85) and lands on the same `~/.agents/skills/agmsg/`
  runtime as the direct-script install.

Also surfaces the `bash + sqlite3` prerequisite at the top of Quick Start.
The dogfood revealed that minimal Linux images (Debian slim, etc.) don't
include sqlite3 by default; the bootstrap installer surfaces a clear
error, but it's worth flagging up front. macOS users are unaffected.

The Install section is restructured into subsections (npm, plugin
marketplace, direct script) so each path stands on its own. A note in
the direct-script subsection clarifies that `--cmd` / `--agent-type`
flags are direct-script only — the other paths always install as `agmsg`
with auto-detected agent type.

* docs(readme): note which install path tracks main vs tagged releases

git clone / setup.sh install from main (always current); the npm package and
Claude Code plugin are cut from tagged releases and can lag. Point readers at
`/agmsg version` to see exactly what they're running (#117 provenance).

* docs(readme): add clone step to Direct script; credit new contributors

- Direct script section now shows the `git clone` + `cd` it assumed, and notes
  it's the path that tracks latest main.
- Community: credit @lucianlamp (native Windows PowerShell helpers, #103) and
  @tatsuya6502 (sandboxed Bash tool support, #106) — merged but uncredited.
fujibee added a commit that referenced this pull request Jun 15, 2026
Wire up changelog generation and fold it into the release pipeline so a single
command does packaging + changelog.

- cliff.toml: git-cliff config, Keep a Changelog output from Conventional
  Commits. Non-conventional squash subjects are dropped except two headline
  community PRs (native Windows #103, sandbox #106) captured by subject.
- CHANGELOG.md: bootstrapped from history; everything since v1.0.3 sits under
  [Unreleased] until the next release promotes it.
- scripts/release/cut-release.sh <version>: bump VERSION + sync derived files +
  regenerate CHANGELOG.md + open a release PR + auto-merge on green + tag. PR-
  based because main now requires status checks (direct push is rejected).
- release.yml: generate the GitHub Release notes for the tag with git-cliff
  (replaces the bare "see commit history"); checkout fetch-depth: 0 for it.
- package.json: ship CHANGELOG.md in the npm tarball.
- RELEASING.md: document the new one-shot flow.

No VERSION bump here — this only adds the tooling.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants