Add mode:"mask" for env vars — sentinel in sandbox, real value injected at the proxy by elhajjj · Pull Request #318 · anthropic-experimental/sandbox-runtime

elhajjj · 2026-06-18T22:30:14Z

Problem

With mode: "deny" (#289), the sandboxed process can't read a credential at all — which means tools that need to authenticate (e.g. gh, aws) simply fail inside the sandbox. The useful primitive is: the sandbox sees a fake, and the trusted proxy substitutes the real value on egress to declared destinations — so tools work end-to-end without the sandbox ever holding the real secret.

What this adds

Env vars only; files come in a follow-up.

mode: "mask" for credentials.envVars. The sandbox sees fake_value_<uuid>; a per-session in-memory sentinel registry in SandboxManager holds the fake↔real map (never on disk, never logged).
Per-credential injectHosts (optional narrowing): the proxy only swaps a credential's sentinel for its real value when the destination matches that credential's injectHosts. If unset, defaults to network.allowedDomains — every reachable host. Each entry is validated as semantically reachable (wildcard-aware).
A request-mutation seam (MutateForwardedHeaders) threaded through the proxy; header injection runs on the TLS-terminated path only by default. The plaintext path passes the sentinel through unchanged (fails closed) unless credentials.allowPlaintextInject: true.
network.tlsTerminate is required when masking (or allowPlaintextInject as an explicit escape hatch) — substitution only happens where the proxy can see and re-encrypt the request.
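
To make the registry and the proxy-side swap concrete, here is a minimal sketch. The names `SentinelRegistry`, `mask`, and `substituteHeaders` are hypothetical and are not taken from the PR's code; only the `fake_value_<uuid>` shape and the per-credential host gating come from the description above. Host matching is exact-only here for brevity, whereas the PR describes wildcard-aware matching.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical illustration only; the PR keeps its registry inside SandboxManager.
interface MaskedCredential {
  name: string;          // env var name, e.g. "GH_TOKEN"
  real: string;          // real secret, held in memory only
  injectHosts: string[]; // hosts where the swap is permitted (exact match in this sketch)
}

class SentinelRegistry {
  private bySentinel = new Map<string, MaskedCredential>();

  // Register a credential and return the sentinel the sandbox will see.
  mask(cred: MaskedCredential): string {
    const sentinel = `fake_value_${randomUUID()}`;
    this.bySentinel.set(sentinel, cred);
    return sentinel;
  }

  // Replace sentinels in header values, but only for credentials whose
  // injectHosts include the destination host. Other sentinels pass through.
  substituteHeaders(
    host: string,
    headers: Record<string, string>,
  ): Record<string, string> {
    const out: Record<string, string> = {};
    for (const [key, value] of Object.entries(headers)) {
      let next = value;
      for (const [sentinel, cred] of this.bySentinel) {
        if (cred.injectHosts.includes(host)) {
          next = next.split(sentinel).join(cred.real);
        }
      }
      out[key] = next;
    }
    return out;
  }
}
```

Keying the swap on the sentinel rather than on the header name means a sentinel sent to a host outside its own credential's list is left untouched, which matches the fail-closed behaviour described above.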

Config example

{
  "network": {
    "allowedDomains": ["api.github.com", "registry.npmjs.org"],
    "deniedDomains": [],
    "tlsTerminate": {}
  },
  "filesystem": { "denyRead": [], "allowWrite": ["."], "denyWrite": [] },
  "credentials": {
    "envVars": [
      { "name": "GH_TOKEN",  "mode": "mask", "injectHosts": ["api.github.com"] },
      { "name": "NPM_TOKEN", "mode": "mask" }
    ]
  }
}

GH_TOKEN's sentinel is swapped only at api.github.com; NPM_TOKEN (no injectHosts) defaults to every allowedDomain. Sending one credential's sentinel to the other's host passes the fake through unchanged.
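
The defaulting rule in this example can be sketched as a one-line resolution step. `EnvVarRule` and `effectiveInjectHosts` are hypothetical names used for illustration, and `mode` is typed loosely as a string because the full set of modes is not listed here.

```typescript
// Hypothetical shape of one credentials.envVars entry.
interface EnvVarRule {
  name: string;
  mode: string;
  injectHosts?: string[];
}

// An unset injectHosts falls back to every allowed domain;
// an explicit list narrows the credential to just those hosts.
function effectiveInjectHosts(rule: EnvVarRule, allowedDomains: string[]): string[] {
  return rule.injectHosts ?? allowedDomains;
}

const allowedDomains = ["api.github.com", "registry.npmjs.org"];
const rules: EnvVarRule[] = [
  { name: "GH_TOKEN", mode: "mask", injectHosts: ["api.github.com"] },
  { name: "NPM_TOKEN", mode: "mask" },
];

for (const rule of rules) {
  console.log(rule.name, effectiveInjectHosts(rule, allowedDomains));
}
```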

Transport safety

Substitution runs only after the allow decision and only on the TLS-terminated path (forwardUpstream); httpsRequest uses the default rejectUnauthorized: true, so a cert-verification failure means the substituted headers never leave the host. The SOCKS path and non-TLS CONNECT are opaque tunnels — never substituted. No proxy log line emits header values. The real value never appears in the wrapped command (visible to ps); only the sentinel does.
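
The path rules above amount to a small decision table. The sketch below encodes it with a hypothetical `ProxyPath` type and `maySubstitute` function; neither name appears in the PR, and the real proxy makes this decision structurally (by where the mutation seam is wired in) rather than through a single function.

```typescript
// Hypothetical labels for the ways a request can traverse the proxy.
type ProxyPath = "tls-terminated" | "plaintext-http" | "opaque-connect" | "socks";

function maySubstitute(
  path: ProxyPath,
  requestAllowed: boolean,
  allowPlaintextInject: boolean,
): boolean {
  // Substitution is only considered after the allow decision.
  if (!requestAllowed) return false;
  switch (path) {
    case "tls-terminated":
      return true; // proxy can see and re-encrypt the request
    case "plaintext-http":
      return allowPlaintextInject; // explicit escape hatch only
    case "opaque-connect":
    case "socks":
      return false; // opaque tunnels are never substituted
  }
}
```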

Builds on #289

Extends the credentials config block introduced in #289 (now merged). This branch was stacked on 188f9a1 (the head of #289), so the diff here is exactly the masking layer on top of mode: "deny".

Evidence it works

43 new tests: 29 in test/sandbox/credential-mask.test.ts (new file) + 14 in test/config-validation.test.ts. test/sandbox/credential-deny.test.ts is unchanged but kept green as a regression check.
Full local suite: 442 pass / 121 skip; the only 5 failures are the pre-existing apply-seccomp tests that need the precompiled BPF binary built locally — unrelated to this change.
npx tsc --noEmit and eslint on all changed files: clean.
Manually exercised end-to-end on Linux: curl through bwrap → srt proxy → local upstream receives the real value when the host matches injectHosts, and the sentinel when it doesn't (see "How to test" below for the exact run).

Notes for reviewers

New shared module src/sandbox/domain-pattern.ts — matchesDomainPattern moved out of sandbox-manager.ts so the schema validator can use the same wildcard semantics for the injectHosts ⊆ allowedDomains check.
The superRefine wrapper on SandboxRuntimeConfigSchema re-indented ~70 unchanged lines in sandbox-config.ts — view the diff with whitespace hidden.
updateConfig can't enable masking mid-session if initialize() ran without a credentials block (the proxy's mutateHeaders seam is bound at init, like other proxy options). Adding masking at runtime requires reset() + initialize().
The SandboxManager-level e2e tests use allowPlaintextInject because there's no test-CA seam at the manager level; the TLS path is fully tested at the createHttpProxyServer level (real curl → CONNECT → terminate → real HTTPS upstream).
Scoped wildcards (e.g. injectHosts: ['*.github.com']) are accepted — the credential goes to any matching subdomain. Per-credential, so it's a conscious per-token choice; flag if exact-only is preferred.
Files still reject mode: "mask" (follow-up); body-carried credentials aren't substituted (header values only).

How to test (Linux)

Drives the srt CLI directly: a real bwrap sandbox, the srt-managed HTTP proxy, and a tiny local upstream that records what it received. Uses allowPlaintextInject so a plain-HTTP upstream works without a test CA.

Setup

# Point this at your checkout of this branch and build it.
SRT_DIR=/path/to/sandbox-runtime
(cd "$SRT_DIR" && npm run build)
SRT="node $SRT_DIR/dist/cli.js"

WORK=$(mktemp -d)
LOG="$WORK/upstream.log"
: > "$LOG"

# Throwaway "real" credential values — never real secrets.
GH_REAL="real-gh-secret-$$-$RANDOM"
NPM_REAL="real-npm-secret-$$-$RANDOM"

# Two hostnames that both resolve to loopback on the HOST side (where the
# srt proxy runs and dials the upstream). They are distinct strings to the
# proxy's exact-match host gating, which is what injectHosts checks.
HOST_A=localhost
HOST_B=127.0.0.1

# Tiny upstream: logs "<tag>|<authorization>" per request.
cat > "$WORK/upstream.js" <<'EOF'
const http = require('http');
const fs = require('fs');
const log = process.argv[2];
const s = http.createServer((req, res) => {
  const tag = (req.url || '/').slice(1);
  const auth = req.headers['authorization'] || '';
  fs.appendFileSync(log, tag + '|' + auth + '\n');
  res.writeHead(200, {'content-type': 'text/plain'});
  res.end('ok\n');
});
s.listen(0, '127.0.0.1', () => {
  process.stdout.write(String(s.address().port) + '\n');
});
EOF

exec {UPFD}< <(node "$WORK/upstream.js" "$LOG")
UPSTREAM_PID=$!
read -r -u "$UPFD" PORT
trap 'kill "$UPSTREAM_PID" 2>/dev/null; rm -rf "$WORK"' EXIT

# Helpers
FS_BLOCK='"filesystem": { "denyRead": [], "allowWrite": ["."], "denyWrite": [] }'
write_cfg() {
  cat > "$1" <<EOF
{
  "network": { "allowedDomains": $2, "deniedDomains": [] },
  $FS_BLOCK,
  "credentials": $3
}
EOF
}
run_srt() {
  OUT=$(GH_TOKEN="$GH_REAL" NPM_TOKEN="$NPM_REAL" $SRT --settings "$1" -c "$2" 2>&1)
  RC=$?
}
# curl helper run INSIDE the sandbox. The bwrap child gets HTTP_PROXY set
# but also NO_PROXY=localhost,127.0.0.1,… — and curl honours no_proxy even
# when --proxy is passed. Clearing both forces loopback URLs through srt's
# proxy (the only egress path under --unshare-net).
incurl() {
  printf 'no_proxy= NO_PROXY= curl -sS --max-time 10 --proxy "$http_proxy" -H "Authorization: Bearer $%s" "http://%s:%s/%s"' \
    "$3" "$1" "$PORT" "$2"
}
upstream_auth() { grep -F "$1|" "$LOG" | tail -n1 | cut -d'|' -f2-; }

1. Sandbox sees sentinel, not the real value

write_cfg "$WORK/s1.json" "[\"$HOST_A\"]" '{
  "envVars": [ { "name": "GH_TOKEN", "mode": "mask" } ],
  "allowPlaintextInject": true
}'
run_srt "$WORK/s1.json" 'printf "GH_TOKEN=%s\n" "$GH_TOKEN"'
echo "rc=$RC"; echo "$OUT"

Expected:

rc=0
GH_TOKEN=fake_value_<uuid>

The real value ($GH_REAL) never appears in $OUT.

2. No injectHosts → defaults to allowedDomains

write_cfg "$WORK/s2.json" "[\"$HOST_A\"]" '{
  "envVars": [ { "name": "GH_TOKEN", "mode": "mask" } ],
  "allowPlaintextInject": true
}'
run_srt "$WORK/s2.json" "$(incurl "$HOST_A" s2 GH_TOKEN)"
echo "rc=$RC"; upstream_auth s2

Expected:

rc=0
Bearer real-gh-secret-<pid>-<rand>

The upstream received the real value; no fake_value_ in the log line.

3. Per-credential injectHosts narrows

write_cfg "$WORK/s3.json" "[\"$HOST_A\", \"$HOST_B\"]" "{
  \"envVars\": [ { \"name\": \"GH_TOKEN\", \"mode\": \"mask\", \"injectHosts\": [\"$HOST_A\"] } ],
  \"allowPlaintextInject\": true
}"
run_srt "$WORK/s3.json" "$(incurl "$HOST_A" s3a GH_TOKEN); $(incurl "$HOST_B" s3b GH_TOKEN)"
echo "rc=$RC"
echo "A: $(upstream_auth s3a)"
echo "B: $(upstream_auth s3b)"

Expected:

rc=0
A: Bearer real-gh-secret-<pid>-<rand>
B: Bearer fake_value_<uuid>

Host A (in injectHosts) got the real value; host B (reachable but not in injectHosts) got the sentinel — fails closed.

4. Anti-laundering: sentinel A sent to credential B's host stays fake

write_cfg "$WORK/s4.json" "[\"$HOST_A\", \"$HOST_B\"]" "{
  \"envVars\": [
    { \"name\": \"GH_TOKEN\",  \"mode\": \"mask\", \"injectHosts\": [\"$HOST_A\"] },
    { \"name\": \"NPM_TOKEN\", \"mode\": \"mask\", \"injectHosts\": [\"$HOST_B\"] }
  ],
  \"allowPlaintextInject\": true
}"
run_srt "$WORK/s4.json" "$(incurl "$HOST_B" s4gh GH_TOKEN); $(incurl "$HOST_B" s4npm NPM_TOKEN)"
echo "rc=$RC"
echo "GH→B:  $(upstream_auth s4gh)"
echo "NPM→B: $(upstream_auth s4npm)"

Expected:

rc=0
GH→B:  Bearer fake_value_<uuid>
NPM→B: Bearer real-npm-secret-<pid>-<rand>

GH_TOKEN's sentinel sent to NPM_TOKEN's host is not swapped; NPM_TOKEN's own sentinel at its own host is.

5. Block-level credentials.injectHosts is rejected

write_cfg "$WORK/s5.json" "[\"$HOST_A\"]" "{
  \"injectHosts\": [\"$HOST_A\"],
  \"envVars\": [ { \"name\": \"GH_TOKEN\", \"mode\": \"mask\" } ],
  \"allowPlaintextInject\": true
}"
run_srt "$WORK/s5.json" 'echo SHOULD-NOT-RUN'
echo "rc=$RC"; echo "$OUT"

Expected: rc≠0, and the output reports Could not load settings with an Unrecognized key error; SHOULD-NOT-RUN never prints. The credentials block is .strict() so a stale block-level injectHosts is refused rather than silently widening every credential.

6. Explicit per-entry injectHosts: [] is rejected

write_cfg "$WORK/s6.json" "[\"$HOST_A\"]" '{
  "envVars": [ { "name": "GH_TOKEN", "mode": "mask", "injectHosts": [] } ],
  "allowPlaintextInject": true
}'
run_srt "$WORK/s6.json" 'echo SHOULD-NOT-RUN'
echo "rc=$RC"; echo "$OUT"

Expected: rc≠0, error contains masked but never injected; SHOULD-NOT-RUN never prints.

7. Masking requires tlsTerminate or allowPlaintextInject

write_cfg "$WORK/s7.json" "[\"$HOST_A\"]" '{
  "envVars": [ { "name": "GH_TOKEN", "mode": "mask" } ]
}'
run_srt "$WORK/s7.json" 'echo SHOULD-NOT-RUN'
echo "rc=$RC"; echo "$OUT"

Expected: rc≠0, error contains requires network.tlsTerminate; SHOULD-NOT-RUN never prints.
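
Scenarios 5 to 7 exercise three validation rules. A minimal sketch of those rules as a plain function follows; the PR implements them in the config schema (a .strict() block plus superRefine), so `validateMasking`, the interfaces, and the exact message wording here are hypothetical. The list of known keys is also an assumption: the real credentials block may accept more keys than the two shown.

```typescript
interface EnvVarRule {
  name: string;
  mode: string;
  injectHosts?: string[];
}

interface CredentialsBlock {
  envVars?: EnvVarRule[];
  allowPlaintextInject?: boolean;
  [extra: string]: unknown; // lets the sketch observe unknown keys
}

interface Config {
  network: { allowedDomains: string[]; tlsTerminate?: object };
  credentials?: CredentialsBlock;
}

// Assumption: only these keys are recognised in this sketch.
const KNOWN_CREDENTIAL_KEYS = ["envVars", "allowPlaintextInject"];

function validateMasking(config: Config): string[] {
  const errors: string[] = [];
  const creds: CredentialsBlock = config.credentials ?? {};

  // Scenario 5: unknown keys (e.g. a block-level injectHosts) are refused.
  for (const key of Object.keys(creds)) {
    if (!KNOWN_CREDENTIAL_KEYS.includes(key)) {
      errors.push(`Unrecognized key: credentials.${key}`);
    }
  }

  const masked = (creds.envVars ?? []).filter((v) => v.mode === "mask");

  // Scenario 6: an explicit empty list means mask-but-never-inject.
  for (const v of masked) {
    if (v.injectHosts !== undefined && v.injectHosts.length === 0) {
      errors.push(`${v.name} is masked but never injected`);
    }
  }

  // Scenario 7: masking needs a path on which substitution can happen.
  const canInject =
    config.network.tlsTerminate !== undefined || creds.allowPlaintextInject === true;
  if (masked.length > 0 && !canInject) {
    errors.push("mode mask requires network.tlsTerminate or credentials.allowPlaintextInject");
  }

  return errors;
}
```

Returning a list of errors rather than throwing on the first one mirrors how schema validators typically report every problem in a config at once.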

How to test (macOS)

Same as the Linux section — only SRT_DIR differs (point it at your checkout). sandbox-exec replaces bwrap automatically; the upstream, incurl, and all seven scenarios are unchanged.

On macOS you can additionally exercise the real TLS path (without allowPlaintextInject) by adding "tlsTerminate": {} to the network block and trusting srt's ephemeral MITM CA in your keychain for the duration of the test. The substitution then runs in forwardUpstream after CONNECT termination instead of on the plaintext path.

…an optional narrowing

When neither per-entry nor block-level injectHosts is set, a masked credential now defaults to network.allowedDomains (injectable at every host the sandbox can reach) instead of failing validation. injectHosts becomes an optional narrowing on top of the network allowlist rather than a required allowlist of its own.

Trade-off: a credential is injectable to every reachable host unless explicitly narrowed. Configs that want a credential confined to a subset of allowedDomains must say so.

An explicit empty injectHosts (per-entry or block-level inherited by a masked entry) is still rejected: mask-but-never-inject is contradictory and almost certainly a config mistake. The subset-of-allowedDomains check on any explicitly-set injectHosts and the tlsTerminate requirement are unchanged.

…ns default

injectHosts now lives only on each masked env-var entry. With no per-entry list, the credential is injected at every host in network.allowedDomains (injectHosts is an optional narrowing). The credentials block is .strict() so a stale block-level injectHosts is rejected rather than silently stripped; stripping it would otherwise widen the credential to every reachable host without the operator noticing.

…ns check

The literal Set.has() check rejected injectHosts: ['api.github.com'] when allowedDomains: ['*.github.com'], even though api.github.com is reachable. The check now asks whether every host an injectHosts entry could match is reachable via allowedDomains: exact entries match against allowed patterns; a wildcard entry *.X requires an allowed wildcard *.Y with Y == X or Y an ancestor of X (an exact allowedDomain can never cover a wildcard). matchesDomainPattern moves from sandbox-manager to a new domain-pattern module so the schema can use it without a circular import.
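
The coverage rule in this commit message can be sketched as below. `matchesPattern` and `isCovered` are hypothetical stand-ins, not the PR's matchesDomainPattern; the sketch assumes that a pattern `*.X` matches any strict subdomain of X (and not X itself) and that matching is case-insensitive.

```typescript
// Assumed semantics: "*.X" matches strict subdomains of X; anything else is exact.
function matchesPattern(host: string, pattern: string): boolean {
  const h = host.toLowerCase();
  const p = pattern.toLowerCase();
  if (p.startsWith("*.")) return h.endsWith("." + p.slice(2));
  return h === p;
}

// Is every host that `entry` could match reachable via allowedDomains?
function isCovered(entry: string, allowedDomains: string[]): boolean {
  const e = entry.toLowerCase();

  // Exact entry: it only needs to match some allowed pattern.
  if (!e.startsWith("*.")) {
    return allowedDomains.some((a) => matchesPattern(e, a));
  }

  // Wildcard entry *.X: needs an allowed wildcard *.Y with Y == X
  // or Y an ancestor of X. An exact allowedDomain never covers it.
  const x = e.slice(2);
  return allowedDomains.some((a) => {
    const al = a.toLowerCase();
    if (!al.startsWith("*.")) return false;
    const y = al.slice(2);
    return x === y || x.endsWith("." + y);
  });
}

console.log(isCovered("api.github.com", ["*.github.com"])); // exact entry under an allowed wildcard
console.log(isCovered("*.github.com", ["github.com"]));     // wildcard entry vs exact allowedDomain
```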

elhajjj added 11 commits June 18, 2026 15:43

Add per-session sentinel registry for credential masking

bec9b56

Accept mode:mask for env vars; add injectHosts and allowPlaintextInject

c6ec3e3

Set masked env vars to a sentinel inside the sandbox

45c6764

Add forwarded-header mutation seam to the proxy

304bd6a

Substitute sentinel for real credential on egress to injectHosts

fe58c2b

Add tests for credential masking (registry, env, proxy injection, e2e)

143ddb3

Gate sentinel substitution per credential, keyed by name

4557ab0

Add per-credential injectHosts with block-level default and validation

e31eb05

elhajjj wants to merge 11 commits into main from elhajj/credential-mask-env
