Running a single AI coding agent is straightforward. Running several of them together — where a developer agent hands work to a reviewer, who hands it to a release manager, without you in every loop — requires coordination infrastructure that most teams build ad hoc or skip entirely.
Fleet is that infrastructure. It is a single Go binary you install on your own machine or server. It does not run a model itself. Instead it orchestrates the Claude Code agents you already have — assigning them roles, budgets, and subscriptions to GitHub events. By the end of this walkthrough, you will have a team of agents that picks up a GitHub issue when you add a ready label, writes the code, opens a PR, gets it reviewed, and ships it. Your job is to write the ticket and set the label.
Before you start — what you need
Fleet itself has minimal dependencies. You need tmux installed (agents run in named tmux sessions), a GitHub repository your agents will work in, and Claude Code installed for each agent runtime. Fleet communicates with GitHub through the gh CLI, so that needs to be installed and authenticated as well.
On the machine where Fleet runs, you need a valid Claude Code license for each concurrent agent session you intend to run. Fleet does not bundle any model access. It coordinates agents; the agents bring their own credentials. If you are running two developer agents plus a reviewer and a release manager, you need four sessions' worth of access.
Download the Fleet binary from fleetctl.ai. Drop it somewhere on your PATH. That is the entire installation.
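Before running anything, it can help to confirm the prerequisites are actually on your PATH. The following is a minimal sketch, not part of Fleet. The executable names are assumptions: `claude` for Claude Code and `fleet` for the Fleet binary.

```python
import shutil

# Executable names are assumptions for illustration:
# `claude` for Claude Code, `fleet` for the Fleet binary.
REQUIRED_TOOLS = ["tmux", "gh", "claude", "fleet"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH, in the order given."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All prerequisites found on PATH.")
```

This only checks that the binaries exist. It does not verify that gh is authenticated; running `gh auth status` by hand covers that.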
Step 1: Define your agents
Run fleet init in your repository. Fleet will look for an existing .fleet/config.yaml, and if it does not find one, it will create a scaffold and walk you through registering your first agents. It also installs the Fleet skills — reusable workflow files that teach agents how to handle tickets, review PRs, and ship releases — into your Claude Code skills directory.
The core config is a YAML file. Here is a minimal four-agent team:
product:
  name: my-api
  owner: your-github-username
agents:
  - name: backend-dev
    role: backend-developer
    department: engineering
    reports_to: tech-lead
    model: claude-sonnet-4-5
  - name: tech-lead
    role: tech-lead
    department: engineering
    reports_to: eng-manager
    model: claude-opus-4
  - name: qa-lead
    role: qa-lead
    department: engineering
    reports_to: eng-manager
    model: claude-sonnet-4-5
  - name: release-manager
    role: release-manager
    department: engineering
    reports_to: eng-manager
    model: claude-sonnet-4-5
A few things worth noting here. Each agent has a distinct role — backend-developer, tech-lead, qa-lead, release-manager — and Fleet uses that role to determine which skill the agent runs when it starts. The tech-lead runs /fleet-review-pr. The release manager runs /fleet-ship-pr. You do not configure this per-agent; the role drives it automatically.
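To make the role-driven dispatch concrete, here is a rough sketch of the lookup as a plain dictionary. It is an illustration, not Fleet's actual implementation. Only the two mappings named above are filled in; the skills for the other roles are not stated in this walkthrough, so they are left as None.

```python
# Role-to-skill lookup, sketched as a dictionary.
# Only the tech-lead and release-manager skills come from the walkthrough;
# None marks roles whose skill is not stated here.
ROLE_SKILLS = {
    "tech-lead": "/fleet-review-pr",
    "release-manager": "/fleet-ship-pr",
    "backend-developer": None,
    "qa-lead": None,
}


def skill_for(agent):
    """Return the skill implied by an agent's role, or None if not stated."""
    role = agent["role"]
    if role not in ROLE_SKILLS:
        raise KeyError(f"unknown role: {role}")
    return ROLE_SKILLS[role]


agents = [
    {"name": "backend-dev", "role": "backend-developer"},
    {"name": "tech-lead", "role": "tech-lead"},
    {"name": "release-manager", "role": "release-manager"},
]

for agent in agents:
    print(agent["name"], "->", skill_for(agent))
```

The point of the sketch is that the skill is a function of the role alone, which is why the config has no per-agent skill field.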
The model field lets you assign different models to different roles. Judgment-heavy work like code review benefits from a more capable model. Mechanical work like triage or status checks can use a cheaper one. Per-agent budgets — a cumulative run-time ceiling, set when you create the agent rather than in this config — give you a separate guardrail; we cover them in Step 4.
Fleet ships 120+ agent prompt templates in internal/catalog/agents/ if you want a starting point for more specialized roles (security reviewer, data engineer, documentation writer, and so on).
Step 2: Start the agents
With your config in place, start a team:
fleet agent start --team engineering
Or start everything at once:
fleet agent start --all
Each agent gets its own tmux session named after the agent. You can attach to any session with tmux attach -t backend-dev to watch what it is doing. Fleet passes each agent a set of environment variables when it starts — the agent name, role, department, the GitHub repo it belongs to, any triggering event — so agents know their context without you having to repeat it in every prompt.
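As an illustration of how a script running inside an agent session might read that context, here is a small sketch. The variable names below are hypothetical: the walkthrough says which values are passed, not what the variables are called, so check the Fleet documentation for the real names.

```python
import os

# Hypothetical variable names, chosen for illustration only.
CONTEXT_VARS = {
    "name": "FLEET_AGENT_NAME",
    "role": "FLEET_AGENT_ROLE",
    "department": "FLEET_AGENT_DEPARTMENT",
    "repo": "FLEET_REPO",
    "event": "FLEET_TRIGGER_EVENT",
}


def read_agent_context(environ=None):
    """Collect agent context from the environment; unset values become None."""
    env = os.environ if environ is None else environ
    return {field: env.get(var) for field, var in CONTEXT_VARS.items()}


if __name__ == "__main__":
    print(read_agent_context())
```

The triggering event is looked up with a default of None because an agent started by hand has no event attached.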
Check the overall state of your fleet at any time:
fleet status
This gives you a color-coded summary: which agents are running, which are idle, which are blocked waiting on something. It accepts --json if you want to pipe it into a dashboard or alerting system.
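If you do pipe the JSON into alerting, the filtering step might look like the sketch below. The schema is an assumption: it supposes a list of objects with `name` and `state` fields, which this walkthrough does not specify. Inspect the real output before relying on it.

```python
import json


def agents_in_state(status_json, state):
    """Return agent names whose state matches, given status output as a JSON string.

    Assumes a hypothetical schema: a list of {"name": ..., "state": ...} objects.
    """
    agents = json.loads(status_json)
    return [agent["name"] for agent in agents if agent.get("state") == state]


# Made-up sample in the assumed schema, not captured from a real run.
sample = json.dumps([
    {"name": "backend-dev", "state": "running"},
    {"name": "tech-lead", "state": "idle"},
    {"name": "qa-lead", "state": "blocked"},
])

print(agents_in_state(sample, "blocked"))
```

In practice the input string would come from running fleet status --json, for example through `subprocess.run` with `capture_output=True` and `text=True`.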
Step 3: Wire the autonomous chain
Individual agents running in tmux sessions are useful. The autonomous chain is what makes them a fleet.
Start the watcher daemon:
fleet watcher start --supervised
The --supervised flag keeps the watcher managed by your shell's supervisor rather than running fully detached, which is recommended until you have verified the chain works end to end. The watcher runs three loops: a label poller that checks GitHub every two minutes for new label changes, a subscription processor that matches new fabric events to agent subscriptions every ten seconds, and a scheduler that fires agents on any cron schedules you have defined.
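To get a feel for the two fixed cadences, here is a toy model of which loops fire at a given second. It is a simplification for intuition, not how the watcher is implemented. The cron scheduler is left out because its timing depends on the schedules you define.

```python
# Intervals from the walkthrough: label polling every two minutes,
# subscription processing every ten seconds.
LOOP_INTERVALS = {
    "label_poller": 120,
    "subscription_processor": 10,
}


def due_loops(elapsed_seconds, intervals=LOOP_INTERVALS):
    """Return the names of loops that fire at this second, sorted by name."""
    return sorted(
        name for name, every in intervals.items() if elapsed_seconds % every == 0
    )


def runs_per_hour(intervals=LOOP_INTERVALS):
    """Return how many times each loop fires in one hour."""
    return {name: 3600 // every for name, every in intervals.items()}


print(due_loops(10))
print(due_loops(120))
print(runs_per_hour())
```

One practical consequence: a label change can wait up to two minutes before the poller sees it, while the resulting fabric event is matched within about ten seconds.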
Here is the reactive chain in practice. You write a GitHub issue describing a feature or bug fix. When you are ready for an agent to pick it up, you add a ready label.
The watcher sees the label change, publishes a ticket_ready event to Fleet's internal event bus (called fabric), and the subscription processor matches that event to your backend-dev agent's subscription. The developer agent starts in its tmux session, reads the ticket, creates a branch, writes code, runs tests, opens a PR, and adds a needs-review label.
The watcher picks up that label, publishes pr_needs_review, and the tech-lead and qa-lead agents start. They review the PR; the skill handles checking out the branch, running the review, and posting comments. If they approve, they publish a pr_approved event and add an approved label.
The release manager's subscription matches, so it starts, verifies the merge gate (approved label plus pr_approved fabric event from a reviewer), and merges the PR. The ticket is shipped.
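The chain above can be sketched as two lookup tables and a gate check. This is a simplified model for reasoning about the flow, not Fleet's code. The label, event, and agent names come from this walkthrough; the event dictionary shape (`type`, `source`) is an assumption.

```python
# Labels the watcher turns into fabric events.
LABEL_TO_EVENT = {
    "ready": "ticket_ready",
    "needs-review": "pr_needs_review",
}

# Which agents subscribe to which event.
SUBSCRIPTIONS = {
    "ticket_ready": ["backend-dev"],
    "pr_needs_review": ["tech-lead", "qa-lead"],
    "pr_approved": ["release-manager"],
}

REVIEWERS = {"tech-lead", "qa-lead"}


def agents_for_label(label):
    """Return the agents started when a label is added (empty if none)."""
    event = LABEL_TO_EVENT.get(label)
    return SUBSCRIPTIONS.get(event, [])


def merge_gate_open(labels, events):
    """Require both the approved label and a pr_approved event from a reviewer.

    The event shape ({"type": ..., "source": ...}) is assumed for illustration.
    """
    has_label = "approved" in labels
    has_event = any(
        e["type"] == "pr_approved" and e["source"] in REVIEWERS for e in events
    )
    return has_label and has_event


print(agents_for_label("ready"))
print(merge_gate_open({"approved"}, [{"type": "pr_approved", "source": "tech-lead"}]))
```

Requiring both signals means a label added by hand, without a matching reviewer event, is not enough to open the gate in this model.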
You did not touch any of the handoffs. You wrote the ticket, set the label, and the chain handled the rest.
Step 4: Add guardrails
Before you scale past a handful of agents or hand over tickets with real production impact, add guardrails. Three matter most.
Approval gates. For any pipeline stage where you want a human checkpoint, Fleet supports explicit approval gates. The agent pauses at the gate and waits for a signal before proceeding. Use these for anything that touches production infrastructure or has security implications.
Per-agent budgets. Each agent can carry a budget: a cumulative run-time ceiling, in seconds, set when you create the agent. Fleet tracks total run duration per agent and refuses to launch a new session once the agent is over budget (it returns an over-budget error rather than starting). Set these conservatively at first. If an agent keeps hitting its ceiling, the likelier cause is underspecified tickets rather than a budget that is too tight.
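The budget rule amounts to simple bookkeeping, sketched below. This is an illustration of the idea rather than Fleet's implementation. The exception name is hypothetical, and treating an agent that has used exactly its budget as over budget is an assumption.

```python
class OverBudgetError(Exception):
    """Hypothetical error raised when an agent has no run time left."""


def remaining_budget(run_durations_seconds, budget_seconds):
    """Return seconds left, or raise if cumulative run time has reached the ceiling.

    Assumption: reaching the ceiling exactly counts as over budget.
    """
    used = sum(run_durations_seconds)
    if used >= budget_seconds:
        raise OverBudgetError(f"used {used}s of {budget_seconds}s")
    return budget_seconds - used


# Three past sessions against a one-hour ceiling.
print(remaining_budget([600, 900, 1200], 3600))
```

Because the ceiling is cumulative, a long-lived agent eventually hits it even when every individual session is short, so plan to review budgets periodically.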
Risk scoring and quarantine. Fleet runs a six-dimension evaluation on each agent's output (task completion, reliability, quality, efficiency, collaboration, cost efficiency) and a separate logistic regression risk model on eight features derived from the agent's behavior. If the risk model rates an agent critical, Fleet quarantines it: it stops the session and flags it for your review before it runs again. This catches agents that have started exhibiting erratic behavior, such as looping on the same action, making unexpectedly large changes, or hitting APIs at high frequency, before they do real damage.
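For readers unfamiliar with logistic regression, here is what a risk model of that general shape computes. Everything specific in this sketch is made up for illustration: the eight feature names, the weights, the bias, and the 0.9 critical threshold are assumptions, not Fleet's actual values.

```python
import math

# Hypothetical features, weights, bias, and threshold, for illustration only.
FEATURES = [
    "repeated_actions", "diff_size", "api_call_rate", "error_rate",
    "retry_count", "session_length", "files_touched", "tool_failures",
]
WEIGHTS = [1.2, 0.8, 0.9, 1.1, 0.5, 0.3, 0.4, 0.7]
BIAS = -3.0
CRITICAL_THRESHOLD = 0.9


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def risk_score(features):
    """Map eight feature values to a probability-like risk score in (0, 1)."""
    if len(features) != len(WEIGHTS):
        raise ValueError(f"expected {len(WEIGHTS)} features, got {len(features)}")
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, features))
    return sigmoid(z)


def should_quarantine(features):
    """Quarantine when the score reaches the (assumed) critical threshold."""
    return risk_score(features) >= CRITICAL_THRESHOLD


print(should_quarantine([0.0] * 8))
print(should_quarantine([2.0] * 8))
```

The model is a weighted sum squashed into the range 0 to 1, so a single extreme feature can push an agent over the threshold only if its weight is large enough.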
These are not optional features you add when you have time. Add them before you let the watcher run unattended overnight. The cost of a runaway agent session is real, and the guardrails exist precisely because autonomous systems fail in ways that are hard to predict from the happy path.
Going further
Once the basic chain is running reliably, there are several directions to go. You can define additional agent roles — a dedicated security reviewer that triggers on PRs touching authentication or payment code, a documentation writer that triggers when a PR is shipped, a dependency triage agent that runs on a schedule. The 120+ prompt templates give you a starting point for most common roles without writing prompts from scratch.
Org-level agents (defined in ~/.fleet/org.yaml rather than a repo's .fleet/config.yaml) can see events from all repositories, which is useful for roles like a CTO or CPO agent that needs cross-repo visibility.
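The visibility difference between the two scopes can be modeled in a few lines. This is a sketch of the rule as described here, not Fleet's implementation, and the `scope` and `repo` field names are hypothetical.

```python
def visible_events(agent, events):
    """Return the events an agent can see.

    Field names ("scope", "repo") are hypothetical. Org-level agents see
    everything; repo-level agents see only events from their own repository.
    """
    if agent["scope"] == "org":
        return list(events)
    return [e for e in events if e["repo"] == agent["repo"]]


events = [
    {"type": "ticket_ready", "repo": "my-api"},
    {"type": "pr_approved", "repo": "my-frontend"},
]

cto = {"name": "cto", "scope": "org", "repo": None}
backend_dev = {"name": "backend-dev", "scope": "repo", "repo": "my-api"}

print(len(visible_events(cto, events)))
print(len(visible_events(backend_dev, events)))
```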
Fleet's audit trail — accessible via fleet log — gives you a unified timeline of every decision event, fabric message, and agent action across your entire fleet. When something goes wrong, this is where you start.
Full documentation is at fleetctl.ai/docs. Fleet is free for a single agent slot; the Team plan is $49 per slot per month for larger rosters. Start with one agent, verify the chain, then scale.