How to Choose the Right Model for Each AI Agent

Assigning the same model to every agent wastes money on routine tasks and underperforms on complex ones. A release manager running a scripted merge sequence does not need Opus. A developer agent writing a novel authentication system probably does. Model selection per agent is one of the highest-leverage cost and quality levers you have.

Fleet supports per-agent model assignment via a single field in config. This guide covers how to think about model selection and shows the configuration pattern.

Before you start

  • Fleet initialized with at least two agents in `.fleet/config.yaml`
  • Access to the Anthropic model lineup you want to use (check your tier's access)
  • Some baseline data on what each agent actually does in your workflow

1. Understand which tasks are judgment-heavy vs procedural

The key question is: does this task require novel reasoning, or is it following a well-defined procedure? Developer agents writing complex features need strong reasoning. Reviewer agents doing structural checks, QA agents running tests, and release managers merging PRs are more procedural. Assign Opus to the former and Sonnet to the latter.

# Rough model assignment heuristics:
# - Writing complex algorithms, architecture decisions -> Opus
# - Structural code review, test execution -> Sonnet
# - Scripted release workflows -> Sonnet
# - Research, analysis, ambiguous problem solving -> Opus
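
The heuristics above can be sketched as a small lookup. This is an illustration only, not part of Fleet: the task categories, the `suggest_model` helper, and the tier labels are all hypothetical names chosen for the example.

```python
# Hypothetical helper: maps a coarse task category to a model tier.
# The category names and the fallback choice are assumptions for illustration.
JUDGMENT_HEAVY = {"algorithm-design", "architecture", "research", "ambiguous-problem"}
PROCEDURAL = {"structural-review", "test-execution", "scripted-release"}


def suggest_model(task_kind: str) -> str:
    """Return "opus" for judgment-heavy work and "sonnet" for procedural work."""
    if task_kind in JUDGMENT_HEAVY:
        return "opus"
    if task_kind in PROCEDURAL:
        return "sonnet"
    # Unknown categories fall back to the cheaper tier; revisit if reviews fail.
    return "sonnet"


if __name__ == "__main__":
    for kind in ("architecture", "test-execution", "data-migration"):
        print(f"{kind}: {suggest_model(kind)}")
```

Defaulting unknown work to the cheaper tier is a design choice that suits teams who track review outcomes; a team without that feedback loop might prefer the opposite default.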

2. Set the model field per agent in config

Add a `model` field to each agent definition in `.fleet/config.yaml`. Fleet passes this as the `--model` flag when launching the Claude Code session. If you omit the field, Fleet uses the Claude CLI's default model.

agents:
  - name: backend-dev
    role: backend-developer
    model: claude-opus-4-5
    department: engineering
    reports_to: tech-lead
  - name: frontend-dev
    role: frontend-developer
    model: claude-sonnet-4-5
    department: engineering
    reports_to: tech-lead
  - name: tech-lead
    role: tech-lead
    model: claude-opus-4-5
    department: engineering
  - name: qa-agent
    role: qa-engineer
    model: claude-sonnet-4-5
    department: engineering
  - name: release-manager
    role: release-manager
    model: claude-sonnet-4-5
    department: engineering
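
To make the fallback rule concrete, here is a rough sketch of how the `model` field could translate into a launch command. Only the `--model` flag and the fall-back-to-default behavior come from the text above. The `build_launch_command` helper and the exact command shape are assumptions, and Fleet's real launch command may include other arguments.

```python
from typing import Optional


def build_launch_command(agent: dict) -> list[str]:
    """Assumed shape of the command run for one agent (illustrative only)."""
    command = ["claude"]
    model: Optional[str] = agent.get("model")
    if model:
        # The per-agent model is forwarded as the --model flag.
        command += ["--model", model]
    # With no model field, no flag is passed and the CLI default applies.
    return command


if __name__ == "__main__":
    print(build_launch_command({"name": "qa-agent", "model": "claude-sonnet-4-5"}))
    print(build_launch_command({"name": "docs-agent"}))  # hypothetical agent, no model set
```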

3. Match the model to the task scope

A lower-cost model is often the better default: Sonnet can handle many iterations cheaply, leaving Opus for the agents whose output quality actually depends on it. Assign the model that fits each agent's typical task, then let Fleet's run-time budget (cumulative run duration in seconds) bound how long any single session keeps working.

agents:
  - name: backend-dev
    role: backend-developer
    model: claude-opus-4-5  # complex logic earns Opus
  - name: qa-agent
    role: qa-engineer
    model: claude-sonnet-4-5  # test runs are cheap on Sonnet

4. Monitor cost by agent over time

After running agents for a week, check `fleet agent budget` for run-time utilization and review `fleet log` to gauge output quality. Agents that consistently need multiple review rounds may warrant a more capable model; agents that breeze through on Sonnet are well-provisioned.

fleet agent budget
fleet log --agent backend-dev --since 7d
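
One way to turn "consistently need multiple review rounds" into a number is to track review rounds per PR for each agent and flag those above a threshold. The sketch below assumes you collect those counts yourself. It does not parse `fleet log` output, whose format is not covered here, and the helper name, sample figures, and threshold of 2.0 are hypothetical.

```python
from statistics import mean


def agents_needing_upgrade(
    review_rounds: dict[str, list[int]], threshold: float = 2.0
) -> list[str]:
    """Return agents whose mean review rounds per PR exceed the threshold."""
    return sorted(
        name
        for name, rounds in review_rounds.items()
        if rounds and mean(rounds) > threshold
    )


if __name__ == "__main__":
    # Made-up sample data: review rounds for each merged PR, per agent.
    sample = {
        "backend-dev": [1, 2, 1],
        "qa-agent": [3, 4, 2],
    }
    print(agents_needing_upgrade(sample))  # prints ['qa-agent']
```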

Common pitfalls

  • A cost-optimized model assignment that degrades quality costs more in the long run through rework and failed reviews. Measure quality outcomes (PR acceptance rate, review round trips) alongside cost.
  • Model names change over time. Anthropic releases new models and deprecates old ones. Hardcoding a specific model version in config means you will not automatically benefit from improvements. Review your model assignments quarterly.
  • Some tasks that appear procedural have edge cases that require real reasoning. If a Sonnet-assigned agent consistently produces work that requires multiple review rounds, consider upgrading its model before concluding the task spec is the problem.
  • Token count alone is not a complete cost model. Opus charges more per token than Sonnet, and input and output tokens are priced separately. The relevant number is (tokens used) x (price per token), summed over input and output. Use the Anthropic pricing page to calculate actual expected costs per agent per run.
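
The last pitfall comes down to simple arithmetic, sketched below. The `run_cost` helper is hypothetical, and the prices in the example are placeholders rather than real Anthropic prices; substitute the per-million-token figures from the current pricing page.

```python
def run_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> float:
    """Cost of one run in dollars, given prices per million tokens."""
    total = (
        input_tokens * input_price_per_mtok
        + output_tokens * output_price_per_mtok
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Placeholder prices (NOT real pricing): $1 per MTok in, $5 per MTok out.
    cost = run_cost(200_000, 20_000, input_price_per_mtok=1.0, output_price_per_mtok=5.0)
    print(f"${cost:.2f} per run")  # prints $0.30 per run
```

Running the same token counts through each model's prices shows whether an agent's cost is driven by volume or by the per-token rate.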

When Fleet is the right tool

Per-agent model selection in Fleet is worth the configuration effort once you have more than two or three agents running regularly. Below that scale, the simplicity of one model for everything may outweigh the optimization gains. Above that scale, the per-token price gap between Opus and Sonnet (as much as 5x, depending on the model generation) adds up quickly across a fleet of eight agents running hundreds of tasks per month.
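
As a rough illustration of how that adds up, the sketch below compares a mixed fleet against one that runs every agent on the more expensive model. It assumes every agent consumes the same number of tokens per month, which is rarely exactly true, and it takes the price ratio as an input instead of hardcoding real prices. The `relative_fleet_cost` helper is hypothetical.

```python
def relative_fleet_cost(
    total_agents: int, agents_on_cheaper: int, price_ratio: float
) -> float:
    """Fleet cost as a fraction of the all-expensive-model baseline.

    price_ratio is (expensive price) / (cheaper price). Assumes equal
    token usage per agent, which is a simplification.
    """
    if total_agents <= 0 or not 0 <= agents_on_cheaper <= total_agents:
        raise ValueError("agent counts are inconsistent")
    if price_ratio <= 0:
        raise ValueError("price_ratio must be positive")
    agents_on_expensive = total_agents - agents_on_cheaper
    return (agents_on_expensive + agents_on_cheaper / price_ratio) / total_agents


if __name__ == "__main__":
    # Eight agents, five moved to the cheaper model, using a 5x ratio as input.
    fraction = relative_fleet_cost(8, 5, price_ratio=5.0)
    print(f"{fraction:.0%} of the all-Opus cost")  # prints 50% of the all-Opus cost
```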

Frequently asked questions

Can I change an agent's model without restarting it?

The model is set at session launch time, so a running agent keeps its current model until it starts a new session. To apply the change, stop the agent with `fleet agent stop <name>` and then restart it.

What happens if I specify a model the Claude CLI does not recognize?

The Claude CLI will return an error when the session starts, and the agent will fail to launch. Fleet logs the error in the agent's tmux session. Double-check model names against the current Anthropic model list.
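
A pre-flight check can catch typos before an agent fails to launch. The sketch below is not a Fleet feature: `unknown_models` is a hypothetical helper, the set of known names is one you would maintain yourself from the current Anthropic model list, and the agents are shown as plain dictionaries because loading `.fleet/config.yaml` would need a YAML parser outside the standard library.

```python
def unknown_models(agents: list[dict], known_models: set[str]) -> dict[str, str]:
    """Map agent name to model for agents whose model is not in known_models.

    Agents without a model field are skipped, since they use the CLI default.
    """
    return {
        agent["name"]: agent["model"]
        for agent in agents
        if agent.get("model") and agent["model"] not in known_models
    }


if __name__ == "__main__":
    # Names taken from the config in this guide; keep this set up to date yourself.
    known = {"claude-opus-4-5", "claude-sonnet-4-5"}
    agents = [
        {"name": "backend-dev", "model": "claude-opus-4-5"},
        {"name": "frontend-dev", "model": "claude-sonet-4-5"},  # deliberate typo
        {"name": "release-manager"},  # no model field
    ]
    print(unknown_models(agents, known))  # prints {'frontend-dev': 'claude-sonet-4-5'}
```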
