
How to Reduce the Code Review Bottleneck with AI Agents

When AI coding agents generate pull requests faster than humans can review them, code review becomes the constraint on the whole pipeline. PRs pile up, the developer sits idle waiting for feedback, and the cycle time that automation was supposed to shorten gets worse, not better. The fix is not to skip review — it is to move the routine, mechanical part of review off the human's plate.

Fleet does this with reviewer agents that auto-start when a PR enters review, run a structured review skill, and publish their decision to an event bus called Fabric. A release-manager agent then consults a deterministic merge gate before merging. This guide covers the business case, the setup, and an honest accounting of what still requires a human.

Before you start

  • Fleet installed and initialized (`fleet init`), with a watcher running in your repository
  • A reviewer-role agent created from a template: `tech-lead`, `qa-lead`, or `pr-reviewer`
  • Fleet skills installed: `fleet skills install`
  • GitHub CLI (`gh`) authenticated with access to pull requests

1. Name the bottleneck before automating it

Measure where time actually goes. If your developer agents open PRs that then wait hours for a human to start reviewing, the constraint is review latency, not review quality. Latency is the part an agent can absorb — first-pass structural checks on every PR, applied consistently, within seconds. Judgment-heavy review (architecture, business logic, security trade-offs) stays with humans. Automate the latency, keep the judgment.

# A quick way to see where PRs sit:
fleet log --type decision --since 7d
# Look for long gaps between pr_needs_review and pr_approved
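
To put a number on those gaps, the events can be paired up per PR. The following is a minimal sketch, not Fleet's own tooling: the guide does not show the output format of `fleet log`, so the record shape used here (`kind`, `pr`, and an ISO 8601 `timestamp`) is an assumption, and the sample records are made up for illustration.

```python
from datetime import datetime, timedelta


def review_latency(events):
    """Map each PR to the time between its first pr_needs_review and the
    first pr_approved that follows it.

    `events` is a list of dicts with assumed keys: kind, pr, timestamp.
    """
    requested = {}
    latency = {}
    ordered = sorted(events, key=lambda e: datetime.fromisoformat(e["timestamp"]))
    for event in ordered:
        pr = event["pr"]
        when = datetime.fromisoformat(event["timestamp"])
        if event["kind"] == "pr_needs_review":
            # Keep only the first request per PR.
            requested.setdefault(pr, when)
        elif event["kind"] == "pr_approved" and pr in requested and pr not in latency:
            latency[pr] = when - requested[pr]
    return latency


# Illustrative records only; PR 58 is still waiting, so it has no latency yet.
sample = [
    {"kind": "pr_needs_review", "pr": 57, "timestamp": "2025-01-06T09:00:00"},
    {"kind": "pr_needs_review", "pr": 58, "timestamp": "2025-01-06T09:05:00"},
    {"kind": "pr_approved", "pr": 57, "timestamp": "2025-01-06T13:30:00"},
]

for pr, gap in review_latency(sample).items():
    print(pr, gap)
```

PRs that appear in the input but not in the result are the ones still waiting for a decision, which is usually the more interesting list.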

2. Create a reviewer agent

Create a reviewer agent from a template and give it a reviewer role. The role — tech-lead, qa-lead, or pr-reviewer — is what tells Fleet to inject the /fleet-review-pr skill directive into the agent's handbook. Subscribe it to pr_needs_review so it fires when a PR is ready for review.

fleet agent create --name tech-lead --vendor claude-code --template tech-lead

# In .fleet/config.yaml:
agents:
  - name: tech-lead
    role: tech-lead
    department: engineering
    reports_to: cto
    model: claude-opus-4-5
    subscribes_to: pr_needs_review
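
Because only the three reviewer roles inject the review skill directive, a quick sanity check on the config can catch an agent that subscribes to `pr_needs_review` with a role that does nothing. This is a rough sketch that assumes the YAML has already been loaded into a Python dict by a loader of your choice; the function name is hypothetical, and whether `subscribes_to` accepts a list as well as a single string is an assumption the code tolerates rather than confirms.

```python
REVIEWER_ROLES = {"tech-lead", "qa-lead", "pr-reviewer"}


def misconfigured_reviewers(config):
    """Return names of agents that subscribe to pr_needs_review but whose
    role is not one of the reviewer roles named in this guide."""
    bad = []
    for agent in config.get("agents", []):
        subs = agent.get("subscribes_to", [])
        if isinstance(subs, str):
            # Accept both a single event name and a list of names.
            subs = [subs]
        if "pr_needs_review" in subs and agent.get("role") not in REVIEWER_ROLES:
            bad.append(agent["name"])
    return bad
```

An empty result means every subscribed agent carries a role that injects the review skill directive.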

3. Install the review skill

The /fleet-review-pr skill is compiled into the Fleet binary and must be synced to your ~/.claude/skills/fleet/ directory before a reviewer agent can run it. Run fleet skills install now, and again after every Fleet upgrade, so the installed copies match the skills compiled into the new binary.

fleet skills install

# Confirm the review skill is present:
fleet skills list

4. Let a PR trigger the reviewer automatically

When a developer opens a PR and adds the needs-review label, the watcher publishes a pr_needs_review fabric event. The subscription processor matches it to your reviewer, which auto-starts, runs /fleet-review-pr, checks out the branch, and publishes pr_approved or pr_changes_requested back to Fabric — optionally posting GitHub comments for actionable findings.

# Trigger via label:
gh pr edit 57 --add-label needs-review

# Watch the reviewer's decision land:
fleet log --type decision --agent tech-lead
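
As a mental model of that flow, the label-to-event-to-agent routing can be sketched in a few lines. This is a conceptual illustration only, not Fleet's implementation: the function names and the event and agent dict shapes are hypothetical.

```python
def event_for_labels(pr, labels):
    """Return a pr_needs_review event when the needs-review label is
    present on the PR, otherwise None."""
    if "needs-review" in labels:
        return {"kind": "pr_needs_review", "pr": pr}
    return None


def agents_to_start(event, agents):
    """Return the names of agents whose subscriptions include the event kind."""
    matched = []
    for agent in agents:
        subs = agent.get("subscribes_to", [])
        if isinstance(subs, str):
            subs = [subs]
        if event["kind"] in subs:
            matched.append(agent["name"])
    return matched
```

The point of the model is that nothing starts a reviewer directly: the label produces an event, and the subscription decides who reacts to it.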

5. Understand why Fabric is the source of truth

Fleet agents share one GitHub identity, so gh pr review --approve fails with "authors cannot approve their own PR" for fleet-authored PRs. The review skill is built for this: it publishes the pr_approved fabric event regardless of whether the formal GitHub approval command succeeds. Downstream, the merge gate trusts the fabric event — GitHub's review decision is one signal among several, not the only one.

# The reviewer skill always publishes the fabric event:
fleet fabric publish --kind pr_approved --sender tech-lead \
  --summary "Approved PR #57" --payload '{"pr": 57}'
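
The "publish regardless" behavior can be pictured as follows. This is a hedged sketch of the pattern, not the review skill's actual code: `approve_pr` and its injectable `run` parameter are hypothetical, while the `gh pr review` and `fleet fabric publish` invocations are the ones shown in this guide.

```python
import json
import subprocess


def approve_pr(pr, sender, run=subprocess.run):
    """Attempt the formal GitHub approval, then publish pr_approved to
    Fabric whether or not that attempt succeeded.

    Returns True if the GitHub approval itself went through.
    """
    # Expected to fail on fleet-authored PRs because of the shared identity,
    # so a non-zero exit code is tolerated here.
    gh = run(["gh", "pr", "review", str(pr), "--approve"], check=False)
    github_approved = gh.returncode == 0

    # The fabric event is the part that must not be skipped.
    run(
        [
            "fleet", "fabric", "publish",
            "--kind", "pr_approved",
            "--sender", sender,
            "--summary", f"Approved PR #{pr}",
            "--payload", json.dumps({"pr": pr}),
        ],
        check=True,
    )
    return github_approved
```

Passing `run` as a parameter keeps the function testable without touching GitHub or Fabric; in real use the default `subprocess.run` applies.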

6. Gate the merge with `fleet release check`

The release-manager runs fleet release check <pr> (via its /fleet-ship-pr skill) before merging. The command emits JSON describing whether the PR is mergeable and why. It passes if EITHER GitHub's reviewDecision is APPROVED, OR the PR carries the approved label plus a pr_approved fabric event from a reviewer with no later pr_changes_requested. For fleet-authored PRs the second path is the one that decides in practice, because the shared GitHub identity blocks the native approval.

fleet release check 57

# JSON output, for example:
# {
#   "pr": 57,
#   "mergeable": true,
#   "reason": "approved",
#   "detail": "approved label + pr_approved fabric event, no later changes-requested",
#   "approved_by": "tech-lead",
#   "events_considered": 3
# }
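
The pass rule is easier to reason about written out as a predicate. The sketch below restates the rule as described in this guide; it is not the source of `fleet release check`. The function name and the event dict shape are assumptions, and it simplifies by not checking that the approving sender holds a reviewer role.

```python
def gate_passes(review_decision, labels, events):
    """Return True if the PR may merge under the rule described above.

    review_decision: GitHub's reviewDecision value, or None.
    labels: label names on the PR.
    events: fabric events for this PR, oldest first, each with a "kind" key.
    """
    # Path 1: GitHub's own approval is sufficient.
    if review_decision == "APPROVED":
        return True

    # Path 2: the label AND a fabric approval that has not been superseded.
    if "approved" not in labels:
        return False
    last_approved = None
    for index, event in enumerate(events):
        if event["kind"] == "pr_approved":
            last_approved = index
    if last_approved is None:
        return False
    later = events[last_approved + 1:]
    return not any(e["kind"] == "pr_changes_requested" for e in later)
```

Checking only the most recent `pr_approved` is enough: if any approval has no later changes-requested event, the latest one does too.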

7. Scale reviewers to the volume

If one reviewer cannot keep up with PR volume, add more reviewer agents — for example a qa-lead focused on test coverage alongside a tech-lead focused on structure. Each subscribes to the same pr_needs_review event. Use max_concurrent to bound how many reviewer sessions start at once so a burst of PRs does not spin up an unbounded number of sessions.

agents:
  - name: tech-lead
    role: tech-lead
    subscribes_to: pr_needs_review
    max_concurrent: 2
  - name: qa-lead
    role: qa-lead
    subscribes_to: pr_needs_review
    max_concurrent: 1

Common pitfalls

  • The reviewer agent is a first-pass structural check, not a replacement for human review on security-sensitive or architecture-changing PRs. Keep a human in the loop for the decisions that carry real risk.
  • If skills are not installed or are stale, the reviewer starts but does not know how to run `/fleet-review-pr` and exits doing nothing. Run `fleet skills install --dry-run` first whenever a reviewer seems inert.
  • An "approved" label alone is not an approval. The merge gate requires the label AND a `pr_approved` fabric event from a reviewer with no later `pr_changes_requested`. Do not configure anything that sets the label without a real review behind it.
  • `fleet release check` outputs JSON, not a one-word verdict. Parse the `mergeable` and `reason` fields rather than grepping for a fixed string — the `detail`, `approved_by`, and `events_considered` fields explain the decision.
  • A reviewer that exits before publishing its decision event breaks the chain silently. The release-manager will wait indefinitely. Check `fleet log --type decision` when a stage stalls.
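
For the JSON pitfall above, a caller might read the verdict along these lines. This is a minimal sketch: `parse_release_check` is a hypothetical helper, and the sample string simply repeats the example output shown earlier in this guide.

```python
import json


def parse_release_check(output):
    """Return (mergeable, reason) from the JSON text printed by
    `fleet release check`. Anything other than a literal true counts as
    not mergeable."""
    result = json.loads(output)
    return result.get("mergeable") is True, result.get("reason", "")


sample_output = """
{
  "pr": 57,
  "mergeable": true,
  "reason": "approved",
  "detail": "approved label + pr_approved fabric event, no later changes-requested",
  "approved_by": "tech-lead",
  "events_considered": 3
}
"""

mergeable, reason = parse_release_check(sample_output)
```

Reading the boolean field avoids false positives from text matching, since the word "approved" can also appear in `detail` for a PR that is not mergeable.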

When Fleet is the right tool

Autonomous review pays off when Claude Code developer agents are producing PRs faster than your humans can give them a first look, and the wait is the bottleneck. It does not replace review where the value is deep domain judgment. Fleet absorbs the mechanical first pass and the merge bookkeeping, freeing humans for the calls only they can make. If your reviews are slow because they involve genuine architectural debate, automating the mechanical layer will not move that constraint; it will just stop trivial PRs from queuing behind the hard ones.

Frequently asked questions

Why does Fleet trust a fabric event over GitHub's own approval?

Fleet agents share one GitHub identity, so `gh pr review --approve` fails with "authors cannot approve their own PR" on fleet-authored PRs. The reviewer always publishes a `pr_approved` fabric event, and the merge gate treats that as the source of truth, with GitHub's review decision as one additional signal.

What does the merge gate actually require to pass?

`fleet release check <pr>` passes if EITHER GitHub's reviewDecision is APPROVED, OR the PR has the `approved` label plus a `pr_approved` fabric event from a reviewer and no later `pr_changes_requested`. It returns JSON with `mergeable`, `reason`, `detail`, `approved_by`, and `events_considered`.

Which roles can act as autonomous reviewers?

Use a reviewer role: `tech-lead`, `qa-lead`, or `pr-reviewer`. These roles inject the `/fleet-review-pr` skill directive into the agent's handbook. Generic strings like `reviewer` inject nothing.

What still needs a human after I set this up?

Judgment-heavy review: security-sensitive changes, architectural decisions, and business-logic validation that requires domain knowledge. The agents handle the structural first pass and the merge bookkeeping; humans make the high-stakes calls.
