
How to Secure AI Agents Running in Your Codebase

Letting AI agents write, review, and merge code in your repository is a real security decision. Each agent can read your source, run commands, push branches, and — if you let it — merge to main. The two questions that matter are where your code goes and what an agent is allowed to do when it misbehaves. A hosted agent platform answers the first question badly: your code and context flow through someone else's servers under opaque terms.

Fleet's security posture starts from a different place. It is a single self-hosted, local-first Go binary that stores state in ~/.fleet/fleet.db. An unregistered instance sends nothing to Fleet; a registered instance reports operational metadata and usage metering (agent status, run counts, run time) to the control plane, never your source code. Your code stays on your infrastructure, with two honest exceptions: agents send code to a model backend, and branches and pull requests go to GitHub. The model backend can be the Anthropic API, or Amazon Bedrock / Google Vertex running in your own cloud account, so model traffic need never leave your boundary, though a model endpoint must always be reachable. On top of that local foundation, Fleet adds approval gates for risky actions, a risk model that auto-quarantines agents in a critical state, and least-privilege GitHub access. This guide walks through hardening each layer.

Before you start

  • Fleet installed from `curl -fsSL https://fleetctl.ai/install | sh` on a machine you control
  • Claude Code (`claude`) installed and authenticated
  • A GitHub repository and the `gh` CLI authenticated, or a GitHub App you can install
  • An understanding of which actions in your workflow are irreversible or high-stakes
1

Understand the trust boundary honestly

Fleet is a local process manager. The binary runs on your machine or server and stores everything in a local SQLite database at ~/.fleet/fleet.db. It is local-first: an unregistered instance sends nothing to Fleet, while a registered instance reports operational metadata and usage metering (agent status, run counts, run time) to the control plane, never your source code. Name the outbound paths precisely. Claude Code reaches a model backend: the Anthropic API, or Amazon Bedrock / Google Vertex inside your own cloud. The model can therefore stay within your boundary, but a model endpoint is always required, so Fleet is not a zero-egress air gap. The gh CLI calls the GitHub API. If you connect a work tracker, Fleet polls Linear (api.linear.app) or Jira for issue metadata. That is the real boundary: your source code reaches only the model backend you choose and GitHub; work trackers exchange issue metadata; and the control plane sees operational metadata, not your source.

# What stays local (source of truth):
#   - the Fleet binary and all orchestration logic
#   - ~/.fleet/fleet.db (agents, events, run-time tracking)
#   - your source code never goes to Fleet
# What goes out:
#   - Claude Code   -> model backend (Anthropic API, or Bedrock/Vertex in your cloud)
#   - gh CLI        -> GitHub API
#   - workitems     -> Linear / Jira (issue metadata, only if connected)
#   - control plane -> operational metadata + usage metering (agent status,
#                      run counts, run time), only if registered with the dashboard
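One way to turn this boundary into a reviewable network policy is to check each outbound host against an explicit allowlist. The sketch below is illustrative only and is not part of Fleet: the function name `egress_allowed` and the `ALLOWED_HOSTS` variable are hypothetical. The host list assumes the public Anthropic API and GitHub endpoints plus api.linear.app from the text above; Bedrock, Vertex, Jira, and control-plane hosts depend on your setup and are left for you to add.

```shell
#!/usr/bin/env bash
# Hypothetical egress allowlist check; not a Fleet feature.
# Assumption: these are the only hosts your agents need. Add your own
# Bedrock/Vertex, Jira, or control-plane hosts as required.
ALLOWED_HOSTS="api.anthropic.com api.github.com github.com api.linear.app"

# egress_allowed HOST
# Exits 0 if HOST is on the allowlist, 1 otherwise.
egress_allowed() {
  local host="$1" allowed
  for allowed in $ALLOWED_HOSTS; do
    [ "$host" = "$allowed" ] && return 0
  done
  return 1
}
```

For example, `egress_allowed api.github.com && echo allow || echo deny` prints `allow`, while an unlisted host prints `deny`. Keeping the list in one variable makes the boundary easy to review alongside your firewall rules.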
2

Add approval gates on irreversible actions

Full autonomy is fine for low-stakes work; it is not fine for merging a large refactor or any action that is hard to reverse. Fleet implements approval gates as required conditions on a subscription, checked by the watcher before an agent fires. Gate the release stage on verifiable GitHub state (a mergeable, checks-passing PR carrying the approved label) so that a stray approval event alone cannot trigger a merge.

agents:
  - name: release-manager
    role: release-manager
    department: engineering
    subscribes_to: pr_approved
    subscription_gate:
      pr_checks_passing: true
      pr_mergeable: true
      label_present: approved
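The gate above is a conjunction: every condition must hold before the agent fires. A minimal sketch of that logic follows, assuming the PR state has already been fetched from GitHub. The function `gate_open` and its argument order are hypothetical and only model the idea; this is not Fleet's watcher implementation.

```shell
#!/usr/bin/env bash
# Hypothetical model of a subscription gate; Fleet's watcher is not shown.
# gate_open MERGEABLE CHECKS_PASSING LABELS REQUIRED_LABEL
#   MERGEABLE, CHECKS_PASSING: "true" or "false"
#   LABELS: comma-separated labels currently on the PR
# Exits 0 only when all three conditions hold.
gate_open() {
  local mergeable="$1" checks="$2" labels="$3" required="$4"
  [ "$mergeable" = "true" ] || return 1
  [ "$checks" = "true" ] || return 1
  # Wrap in commas so "approved" does not match "not-approved".
  case ",$labels," in
    *",$required,"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

With this model, an approval event arriving while checks are failing leaves the gate closed, which is the property the configuration is meant to guarantee.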
3

Run the brain so the risk model can quarantine

Fleet's risk model is a separate logistic-regression model over operational signals — error rate, restarts, blocked tasks, silent hours, uptime, eval score, and SLA compliance. It is not the same thing as the 6-dimension quality evaluation; it exists specifically to catch an agent drifting into a dangerous operational state. When an agent's risk level reaches critical, the brain quarantines it: the session is stopped and a quarantine event is published. Quarantine fires at the critical level — it is not an arbitrary number you tune.

# Start the brain to enable risk assessment + auto-quarantine:
fleet brain start

# See current risk assessments and flagged agents:
fleet brain insights
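To make "logistic regression over operational signals" concrete, here is a toy scorer. Everything numeric in it is an assumption made up for illustration: the weights, the bias, the probability cut-offs, and the level names other than `critical` are not Fleet's. It also uses only three of the seven signals. The function name `risk_level` is hypothetical.

```shell
#!/usr/bin/env bash
# Toy logistic-regression risk scorer. NOT Fleet's model: the weights,
# bias, and thresholds below are invented for illustration.
# risk_level ERROR_RATE RESTARTS SILENT_HOURS
#   ERROR_RATE: fraction between 0 and 1
#   RESTARTS, SILENT_HOURS: non-negative numbers
risk_level() {
  awk -v err="$1" -v restarts="$2" -v silent="$3" 'BEGIN {
    # Weighted sum of signals plus a bias term (all values hypothetical).
    z = -4.0 + 6.0 * err + 0.5 * restarts + 0.1 * silent
    # The logistic function maps the sum to a probability in (0, 1).
    p = 1 / (1 + exp(-z))
    if (p >= 0.9)      level = "critical"
    else if (p >= 0.5) level = "elevated"
    else               level = "low"
    print level
  }'
}
```

The point of the sketch is the shape of the computation: several operational signals combine into one probability, and the top band maps to the critical level at which the brain quarantines an agent.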
4

Give agents least-privilege GitHub access

By default, agents act through the gh CLI under whatever identity it is authenticated as. For tighter scoping, run agents under a GitHub App with only the permissions the workflow needs (contents, pull requests, issues) rather than a broad personal token. Set `use_github_app: true` on the agent so it authenticates as the App installation. Grant the App the minimum repository scope and nothing more.

agents:
  - name: backend-dev
    role: backend-developer
    department: engineering
    reports_to: tech-lead
    use_github_app: true
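When you review the App's configuration, it helps to flag any permission beyond what the workflow needs. The sketch below is a hypothetical helper, not a Fleet or GitHub command: `extra_permissions` and `NEEDED_PERMISSIONS` are names invented here. The permission keys are assumed to follow GitHub App naming (`contents`, `pull_requests`, `issues`), with `metadata` included on the assumption that GitHub grants it read-only alongside repository permissions.

```shell
#!/usr/bin/env bash
# Hypothetical least-privilege check; not a Fleet or GitHub command.
# Assumption: the workflow needs only these GitHub App permissions.
NEEDED_PERMISSIONS="contents pull_requests issues metadata"

# extra_permissions GRANTED...
# Prints each granted permission that is not in NEEDED_PERMISSIONS,
# one per line. Prints nothing when the grant is minimal.
extra_permissions() {
  local granted needed found
  for granted in "$@"; do
    found=0
    for needed in $NEEDED_PERMISSIONS; do
      [ "$granted" = "$needed" ] && found=1
    done
    [ "$found" -eq 1 ] || echo "$granted"
  done
}
```

Running `extra_permissions contents administration` prints `administration`, a signal that the App holds more than the workflow requires.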
5

Keep the audit trail and review it

Every agent action — PR opened, review published, merge, quarantine — is recorded as a structured event in the local decision log. This is your forensic record: who did what, when, and why. Review it regularly, and especially after any quarantine, before you restart an agent. A quarantined agent is never auto-restarted; reactivation requires a human, by design.

# Full decision trail for an agent:
fleet log --agent backend-dev --since 7d

# Just the significant decisions across the fleet:
fleet log --type decision --since 30d
6

Constrain blast radius with worktrees and run-time budgets

Limit what a single agent can disrupt. Give developer agents isolated git worktrees so one agent's mistakes cannot corrupt another's working tree, and set a run-time budget (cumulative run duration in seconds) so a looping session is stopped rather than left running for hours. The budget is a time cap, not a token cap — Fleet does not meter tokens.

agents:
  - name: backend-dev
    role: backend-developer
    worktree: /path/to/repo-backend
    max_concurrent: 1
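The worktree path in the configuration has to exist before an agent can use it. A minimal sketch, assuming you create it yourself with `git worktree add`: the helper names `add_agent_worktree` and `over_budget`, and the branch naming, are hypothetical. The budget check only models the idea of a cumulative time cap in seconds; it is not how Fleet enforces it.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; Fleet's own worktree and budget handling are not shown.

# add_agent_worktree REPO_DIR WORKTREE_DIR BRANCH
# Creates an isolated working tree on a new branch, so one agent's
# uncommitted changes cannot touch another agent's checkout.
add_agent_worktree() {
  local repo="$1" worktree="$2" branch="$3"
  git -C "$repo" worktree add -b "$branch" "$worktree"
}

# over_budget USED_SECONDS BUDGET_SECONDS
# Exits 0 when cumulative run time has reached the budget.
over_budget() {
  [ "$1" -ge "$2" ]
}
```

For example, `add_agent_worktree /path/to/repo /path/to/repo-backend agent/backend-dev` would produce the directory referenced by the `worktree` setting above.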

Common pitfalls

  • Fleet is self-hosted and local-first — your source code never goes to Fleet — but it is not a zero-egress air-gap: Claude Code must reach a model backend, and a registered instance reports operational metadata and usage metering to the control plane. Point it at Amazon Bedrock or Google Vertex to keep that traffic inside your own cloud — but a model endpoint is always required, so claims of fully offline or SCIF operation are false. Plan your network policy around an outbound HTTPS path to your model backend, GitHub, and any work-item provider you connect (Linear or Jira).
  • The risk model and the 6-dimension evaluation are two different systems. The evaluation scores quality; the risk model drives quarantine. Do not assume a high quality score means low risk — read them independently in `fleet brain insights`.
  • Quarantine only works if the brain daemon is running. Without `fleet brain start`, there is no risk assessment and no auto-quarantine. An unmonitored fleet has no safety net beyond your run-time budgets.
  • Approval gates are only as strong as the verifiable state they check. A gate that merely waits for a fabric event with no GitHub-state conditions can be unblocked by a stray published event. Gate on `pr_mergeable` and `pr_checks_passing`, not just on an approval event existing.
  • Authenticating agents under a broad personal access token gives every agent your full account scope. Prefer a GitHub App (`use_github_app: true`) scoped to the minimum permissions, so a compromised or rogue agent cannot reach repositories the workflow never needed.

When Fleet is the right tool

Fleet is the right choice when keeping source code and orchestration on infrastructure you control is a hard requirement and you still want autonomous agents. It gives you a local-first binary whose source of truth stays on your own infrastructure, so your source code never goes to Fleet, plus approval gates and an auto-quarantine safety net. (A registered instance reports operational metadata and usage metering to the control-plane dashboard, never your source.) It is not for you if your security model forbids any outbound connection at all: Claude Code's need for a model backend rules out a true zero-egress air gap. With Amazon Bedrock or Google Vertex, however, you can keep both your code and the model traffic inside your own cloud account. Within that boundary, Fleet keeps orchestration state local, gates irreversible actions on verifiable GitHub state, and quarantines agents that reach critical risk.

Frequently asked questions

Does my source code leave my infrastructure?

Only to the model backend you choose and to GitHub. Claude Code sends code to the model backend, which is configurable: use the Anthropic API directly, or point it at Amazon Bedrock / Google Vertex to keep model traffic inside your own cloud account. The `gh` CLI talks to GitHub. If you connect Linear or Jira, Fleet exchanges issue metadata with them, not your source code. Your source code never goes to Fleet: it is a local-first binary, and a registered instance reports only operational metadata and usage metering (agent status, run counts, run time) to the control plane.

Can I run Fleet fully air-gapped?

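Not as a true zero-egress air gap: Claude Code always needs to reach a model backend. The model backend is configurable: use the Anthropic API directly, or point Claude Code at Amazon Bedrock or Google Vertex to keep model traffic inside your own cloud account. An unregistered Fleet instance adds no cloud dependency of its own beyond GitHub access and any work tracker you connect; a registered instance also reports operational metadata and usage metering to the control plane.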

What stops a misbehaving agent from doing damage?

Two layers: approval gates that require verifiable GitHub state before risky actions like merges, and a separate risk model that auto-quarantines an agent when its risk level reaches critical. A quarantined agent is stopped and never auto-restarted.

How do I limit what an agent can do on GitHub?

Set `use_github_app: true` on the agent and install a GitHub App scoped to only the permissions the workflow needs. This is least-privilege access — far tighter than a broad personal access token shared across every agent.
