
AI Agents for Code Review

Code review is one of the highest-leverage activities on an engineering team, and one of the most consistently delayed. Reviewers context-switch out of deep work to evaluate a PR, often hours or days after the author submitted it. By the time feedback arrives, the author has mentally moved on, and switching back is expensive.

At the same time, review quality is uneven. A tired reviewer at the end of a sprint catches fewer issues than the same person on a fresh Monday. Automated linters catch syntax and style problems but miss logic errors, missing test coverage, inconsistent error handling, and violations of your team's specific architectural conventions.

How it works with an agent fleet

Fleet runs a dedicated reviewer agent with a role-specific prompt. When a PR is opened and the needs-review label is applied, the watcher daemon fires the agent automatically. The agent checks out the branch, runs your test suite, and publishes a structured review to fabric.

# .fleet/config.yaml
agents:
  - name: tech-lead
    role: tech-lead
    model: claude-opus-4-7
    subscribes_to: pr_needs_review

The agent's prompt lives at .fleet/prompts/tech-lead.md. Fleet resolves it by convention from the agent name, so there is no prompt: field in config. The reviewer prompt instructs the agent to check for missing error handling, test coverage gaps, and adherence to your documented patterns. The agent then publishes a pr_approved or pr_changes_requested fabric event, which the release-manager agent uses as a merge gate signal.
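The naming convention and the two review events can be illustrated with a small sketch. This is not Fleet's implementation: the helper names `prompt_path` and `review_event` and the idea of a list of "blocking findings" are assumptions made for illustration. Only the `.fleet/prompts/<agent-name>.md` layout and the event names come from the text above.

```python
from pathlib import PurePosixPath

# Event names are taken from the text; everything else is illustrative.
REVIEW_EVENTS = ("pr_approved", "pr_changes_requested")


def prompt_path(agent_name: str, root: str = ".fleet/prompts") -> PurePosixPath:
    """Map an agent name to its prompt file by naming convention (hypothetical helper)."""
    if not agent_name or "/" in agent_name:
        raise ValueError(f"invalid agent name: {agent_name!r}")
    return PurePosixPath(root) / f"{agent_name}.md"


def review_event(blocking_findings: list[str]) -> str:
    """Choose the event to publish: any blocking finding requests changes."""
    return "pr_changes_requested" if blocking_findings else "pr_approved"


if __name__ == "__main__":
    print(prompt_path("tech-lead"))
    print(review_event(["missing error handling in retry loop"]))
```

Because the path is derived from the agent name, renaming an agent in config would also mean renaming its prompt file.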

The fleet pattern

The standard pattern has four steps: the developer agent opens a PR and adds the needs-review label; the watcher dispatches the tech-lead agent; the tech-lead reviews the PR and publishes a fabric decision event; and the release-manager gates the merge on that event. The human engineering lead stays in the loop by watching fleet log --type decision and can override or add comments at any point.
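The merge gate at the end of this pattern can be sketched as a function over decision events. The `FabricEvent` shape and the `can_merge` helper are hypothetical, and the sketch assumes events arrive in chronological order. Fleet's actual event schema and gating logic may differ.

```python
from dataclasses import dataclass

DECISION_TYPES = ("pr_approved", "pr_changes_requested")


@dataclass(frozen=True)
class FabricEvent:
    # Hypothetical shape, for illustration only.
    event_id: str
    pr: int
    type: str
    agent: str


def can_merge(events: list[FabricEvent], pr: int, reviewer: str = "tech-lead") -> bool:
    """Allow a merge only if the reviewer's latest decision on this PR is an approval.

    Assumes `events` is ordered oldest to newest.
    """
    decisions = [
        e for e in events
        if e.pr == pr and e.agent == reviewer and e.type in DECISION_TYPES
    ]
    return bool(decisions) and decisions[-1].type == "pr_approved"


if __name__ == "__main__":
    log = [
        FabricEvent("evt-1", 42, "pr_changes_requested", "tech-lead"),
        FabricEvent("evt-2", 42, "pr_approved", "tech-lead"),
    ]
    print(can_merge(log, 42))
```

Using the most recent decision means a later pr_changes_requested event revokes an earlier approval, and a PR with no decision at all is never mergeable.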

Guardrails that matter here

  • A per-agent run-time (duration) budget prevents runaway review sessions on large diffs
  • A separate risk model over operational signals (error rate, restarts, blocked tasks, uptime, eval score, SLA) drives auto-quarantine when an agent reaches critical risk
  • Full audit trail: every review decision is logged with timestamp, agent, and fabric event ID
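Two of these guardrails, the duration budget and the audit record, are simple enough to sketch. The class and function names below are hypothetical and do not reflect Fleet's internals. The record fields mirror the ones listed above (timestamp, agent, fabric event ID), with a `decision` field added as an assumption.

```python
import time
from datetime import datetime, timezone
from typing import Callable


class BudgetExceeded(RuntimeError):
    """Raised when a review runs past its duration budget."""


class DurationBudget:
    """Tracks elapsed time for one agent run (illustrative only)."""

    def __init__(self, limit_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit_seconds = limit_seconds
        self._clock = clock
        self._started = clock()

    def check(self) -> None:
        """Call between review steps; raises once the budget is spent."""
        elapsed = self._clock() - self._started
        if elapsed > self.limit_seconds:
            raise BudgetExceeded(
                f"review ran {elapsed:.0f}s, budget is {self.limit_seconds:.0f}s"
            )


def audit_record(agent: str, event_id: str, decision: str) -> dict:
    """Build one audit-trail entry for a review decision."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent": agent,
        "event_id": event_id,
        "decision": decision,
    }
```

The clock is injectable so the budget can be exercised in tests without waiting for real time to pass.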

Who this is for

Engineering teams where PR review is a consistent bottleneck, or where review quality varies enough that some PRs ship with known issues. Works best when you have documented coding standards that can be encoded in the reviewer prompt.

Frequently asked questions

Does the agent approve PRs without a human seeing them?

No. The agent publishes a fabric event that the release-manager uses as a signal, but you control whether that signal alone is sufficient to merge. Most teams configure a human approval gate for anything touching core infrastructure.
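As a rough illustration of such a gate, the check below flags a PR for human approval when any changed file matches a protected path pattern. The patterns and the `needs_human_approval` helper are assumptions for this sketch, not Fleet configuration. A team would substitute its own core-infrastructure paths.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns; replace with your own core-infrastructure paths.
PROTECTED_PATTERNS = ("infra/*", "deploy/*", ".fleet/*")


def needs_human_approval(changed_files: list[str],
                         patterns: tuple[str, ...] = PROTECTED_PATTERNS) -> bool:
    """Return True if any changed file falls under a protected pattern.

    fnmatchcase's `*` also matches `/`, so "infra/*" covers nested files.
    """
    return any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in patterns
    )


if __name__ == "__main__":
    print(needs_human_approval(["infra/terraform/main.tf", "README.md"]))
    print(needs_human_approval(["src/app.py"]))
```

In this arrangement the agent's approval is enough for ordinary changes, while anything under a protected path still waits for a person.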

Can the reviewer agent enforce our specific architecture rules?

Yes. The reviewer prompt is a plain markdown file you write and commit. You describe your conventions explicitly — naming patterns, error handling approach, required test structure — and the agent applies them on every review.

Run your first agent fleet

One binary. Five minutes. See every agent, coordinate every handoff, and keep a full audit trail of what your fleet did.