Use case

AI Agents for Pull Request Review

Pull request queues grow faster than reviewer bandwidth. On an active team, a developer might open three PRs in the time it takes a reviewer to work through one. The result is a stale queue, blocked merges, and context loss on both sides when review finally happens days later.

The deeper problem is that PR review bundles two distinct jobs: mechanical verification (does this compile, do tests pass, are there obvious bugs) and judgment (is this the right approach, does this fit the architecture). Humans are needed for judgment. The mechanical layer can be automated, so it does not have to wait in the same queue.

How it works with an agent fleet

Fleet separates the mechanical layer from the judgment layer. A qa-engineer agent runs automated checks on PR creation. A tech-lead agent handles architectural judgment. Both subscribe to the same pr_needs_review event and run in parallel.

# Start the watcher — it handles label-to-agent routing automatically
fleet watcher start

# Check which agents are active
fleet status

The watcher daemon polls GitHub labels every two minutes. When the needs-review label appears on a PR, the watcher matches the resulting event to subscribed agents and starts them in tmux sessions. Each agent runs Claude Code with a role-specific prompt.
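The routing step can be pictured with a small sketch. This is not Fleet's implementation: the label-to-event table, the Agent class, and the release-manager's subscriptions are assumptions made for illustration. Only the needs-review label, the pr_needs_review event, and the agent names come from the text above.

```python
from dataclasses import dataclass, field

# Assumed mapping from a GitHub label to an event name (hypothetical table).
LABEL_TO_EVENT = {"needs-review": "pr_needs_review"}


@dataclass
class Agent:
    # Hypothetical stand-in for an agent definition.
    name: str
    subscriptions: set[str] = field(default_factory=set)


def agents_to_start(labels, agents):
    """Return the names of agents subscribed to any event implied by the labels."""
    events = {LABEL_TO_EVENT[label] for label in labels if label in LABEL_TO_EVENT}
    return [agent.name for agent in agents if agent.subscriptions & events]


agents = [
    Agent("qa-engineer", {"pr_needs_review"}),
    Agent("tech-lead", {"pr_needs_review"}),
    # Assumed subscriptions for the release-manager.
    Agent("release-manager", {"pr_approved", "pr_changes_requested"}),
]

# Both reviewers match the same event, so both would be started.
print(agents_to_start(["needs-review", "bug"], agents))
```

Because both reviewers match the same event, neither waits on the other, which is what lets them run in parallel.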

The fleet pattern

Multi-reviewer chain: a qa-engineer handles test verification and a tech-lead handles architectural review, both triggered by pr_needs_review. Each publishes a structured fabric event (pr_approved or pr_changes_requested). The release-manager reads those events as merge gate conditions. Humans review the fleet log to spot patterns across PRs.

Guardrails that matter here

  • Merge gate keys on fabric review events plus the approved label: the release-manager merges only when a pr_approved event exists and no pr_changes_requested event comes after it
  • Per-agent model selection: qa-engineer on a faster model for speed, tech-lead on a more capable model for judgment
  • A separate risk model flags agents at critical risk for auto-quarantine; their prompts get a human review
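The merge gate in the first guardrail can be sketched as a pure function over review events and labels. The ReviewEvent fields and the can_merge name are hypothetical, and "no later change request" is read here as "no pr_changes_requested newer than the most recent pr_approved". Fleet's actual rule may differ, for example by tracking each reviewer separately.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewEvent:
    # Field names are assumptions, not Fleet's event schema.
    kind: str         # "pr_approved" or "pr_changes_requested"
    agent: str        # which reviewer published the event
    timestamp: float  # seconds since the epoch


def can_merge(events, labels):
    """Allow a merge only with the approved label, at least one approval,
    and no change request newer than the latest approval."""
    if "approved" not in labels:
        return False
    approvals = [e.timestamp for e in events if e.kind == "pr_approved"]
    if not approvals:
        return False
    latest_approval = max(approvals)
    return not any(
        e.kind == "pr_changes_requested" and e.timestamp > latest_approval
        for e in events
    )


events = [
    ReviewEvent("pr_approved", "qa-engineer", 1.0),
    ReviewEvent("pr_changes_requested", "tech-lead", 2.0),
]
print(can_merge(events, {"approved"}))  # the change request is newer, so blocked
```

Under this reading, a later approval from any reviewer reopens the gate. A stricter variant would require each reviewer's own latest event to be an approval.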

Who this is for

Teams with more than two active developers where PR review is creating a merge queue. Particularly useful when developers work across time zones and synchronous review scheduling is difficult.

Frequently asked questions

What happens if the agent misses something a human would catch?

The agent adds its review as a structured fabric event, but the full GitHub PR is still visible to your team. Humans review the actual PR and can add comments or request changes through normal GitHub flow. The agent is an additional pass, not a replacement.

How does the agent know our PR conventions?

You write the reviewer prompt. It can reference your CONTRIBUTING.md, describe your branching strategy, list required checks, and specify what warrants a change request versus a comment. The Fleet handbook is also compiled into every agent at launch, covering universal Git and communication rules.
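One way to picture how a shared handbook and a role-specific prompt end up in the same agent is plain text composition. The build_agent_prompt function and the section titles below are hypothetical. How Fleet actually compiles the handbook into an agent is not described here.

```python
def build_agent_prompt(handbook: str, role_prompt: str, conventions: str = "") -> str:
    """Join the shared handbook, the role prompt, and optional repository
    conventions into one prompt, each under its own heading."""
    sections = [("Handbook", handbook), ("Role", role_prompt)]
    if conventions.strip():
        sections.append(("Repository conventions", conventions))
    return "\n\n".join(f"## {title}\n{body.strip()}" for title, body in sections)


prompt = build_agent_prompt(
    handbook="Keep commits small. Communicate through PR comments.",
    role_prompt="Request changes for failing tests; comment on style issues.",
    conventions="Follow CONTRIBUTING.md.",
)
print(prompt)
```

Keeping the universal rules and the team-specific rules in separate sections means the reviewer prompt can change without touching the shared part.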

Run your first agent fleet

One binary. Five minutes. See every agent, coordinate every handoff, and keep a full audit trail of what your fleet did.