Security
Internal developer/operator reference (like the ToS-compliance audit, added by PR #646). Exempt from i18n and the feature catalog — not app chrome.
Shepherd wraps each spawned claude agent in a bubblewrap (bwrap)
filesystem/process membrane with three profiles: trusted (no sandbox, the
default), standard (membrane, interactive-only) and autonomous (membrane +
per-spawn network-egress firewall; required for auto=true drain/autopilot).
The egress firewall (slirp4netns + nftables + dnsmasq, shipped in PR #601,
closed #551 — src/egress.ts) confines outbound traffic to
api.anthropic.com + statsig.anthropic.com + the GitHub hosts, and watches
for DNS drops. Egress is keyed to the autonomous profile, not to
attendedness (src/sandbox.ts egressApplies, ~L450).
This note records two residuals the operator has accepted after the audit.
R3 — in-membrane token readability (accepted)
Section titled “R3 — in-membrane token readability (accepted)”The membrane keeps two token surfaces readable to any in-membrane tool call:
~/.claude/.credentials.json— bound RW so OAuth refresh writes back (src/sandbox.ts:296-299,--bind-try); the whole~/.claudedir is--ro-binded atsrc/sandbox.ts:280-282.~/.config/gh— bound RO (the gh token, needed togit push/gh pr create) atsrc/sandbox.ts:318.
--clearenv (src/sandbox.ts:335) strips all inherited env
(ANTHROPIC_API_KEY, CLAUDE_CODE_OAUTH_TOKEN, GH_TOKEN, SHEPHERD_TOKEN,
…), re-setting only HOME/PATH/TERM + non-secret locale vars — so these two
bound files are the only token surfaces left inside the membrane.
Why accepted. A single-uid bwrap membrane has no privilege boundary
between claude and its own tool calls: any file claude reads to authenticate,
an injected tool call can also read. So the tokens the session legitimately needs
stay reachable by a hijacked agent.
Out of scope, not impossible. A narrowing is technically conceivable — e.g.
a nested sub-membrane for the Bash tool that omits the credentials bind, so
claude reads the OAuth token only at startup and tool calls can’t. It was not
pursued: significant nesting complexity, and a full external broker
(intercepting/spoofing claude’s auth to api.anthropic.com) would edge into
the prohibited “third-party harness piloting the account” conduct (audit R6) and
break the unmodified-CLI + subscription-OAuth-refresh stance.
Compensating controls. On the autonomous profile the egress allowlist
(#601) bounds where a leaked token can go; --clearenv keeps env-resident
secrets out of the membrane entirely.
Attended-mode egress coverage
Section titled “Attended-mode egress coverage”Egress confinement is keyed to the autonomous profile, not to whether a human
is watching (src/service.ts ~L757-768, L792-828): the wrap applies iff the
autonomous profile resolves and the fs + egress backends are present,
independent of ctx.auto. Consequences:
- An attended session on the autonomous profile is egress-confined (with an egress-degraded banner if the backend is missing).
- The default
trustedprofile andstandardare filesystem-confined only, never network-confined.
To get attended network confinement, select the autonomous profile — per-repo
in the repo’s Settings panel, or globally via SHEPHERD_SANDBOX_DEFAULT_PROFILE.
R4 — prompt-injection posture
Section titled “R4 — prompt-injection posture”-
Autonomous task agents run
--dangerously-skip-permissions, but behind both the filesystem and the egress membrane.standardauto-spawns are refused outright (src/sandbox.tsautoHoldReason, ~L398-399). -
Unattended reviewers (PR critic + plan-gate) run read-only, not skip-permissions:
--safe-mode --disable-slash-commands --allowedTools Read Grep Glob Bash(git diff *) Bash(git log *) Bash(git show *) Bash(git status) Write --permission-mode dontAsk(src/reviewer-argv.ts:13-109,readonlyReviewerArgv). -
Research is the deliberately egress-UNCONFINED surface. A research session that would resolve to
autonomousis downgraded tostandard(src/service.tsresearchSafeProfileOverride, ~L923-941, warns once), because research needs open web egress (search/fetch + sub-agents) that the autonomous firewall would block. It is operator-created (cannot be auto-drained —standardrefuses auto-spawn) but autopilot-steerable, so it runs unattended in practice (RESEARCH_PROCEED_STEER,src/autopilot.ts:22-27, dispatched at L200). It ingests untrusted web content ontrusted/standardwith the network open, and cangh pr create/ open issues via the bound gh token — so a hijacked research agent has both readable tokens and open egress.Compensating factors: the downgrade is explicit and warns once; research delivers a report PR or GitHub issue only, never a code PR (
src/autopilot.ts:204-206). The residual is accepted.
See also
Section titled “See also”docs/research/claude-anthropic-tos-compliance-audit.md— the full audit (added by PR #646; may not be merged at time of writing).src/egress.ts,src/sandbox.ts,src/service.ts,src/autopilot.ts,src/reviewer-argv.ts.