Does Fleet cap the tokens an agent can use?

No. Fleet's budget is cumulative run duration in seconds, not tokens. Token usage is not metered or enforced by Fleet. For a dollar ceiling, set an account-level spending limit in your Anthropic dashboard.

Does Fleet automatically restart an agent that was stopped?

No. Fleet stops the session and records the reason. You decide whether to restart it. If the watcher is running, it will not relaunch a stopped agent unless a new triggering event arrives.

How to Control AI Agent Run Time with Per-Agent Budgets

Name: Fleet
Author: Fleet

AI coding agents can run for a surprisingly long time in a single session, especially when they iterate on tests, read large files, or get stuck in a retry loop. A run-time budget is the most direct way to cap how long any one agent keeps working before Fleet steps in.

Fleet tracks each agent's cumulative run duration in seconds and enforces the limit you define. The budget is a wall-clock run-time budget — Fleet does not meter or cap tokens. An agent that exhausts its run-time budget is stopped, not just warned. This guide covers how to set and monitor agent run-time budgets.

Before you start

Fleet installed and initialized
Agents defined in `.fleet/config.yaml`
Claude Code installed and authenticated

Track run time with fleet agent budget

Fleet records cumulative run duration in seconds per agent and compares it against the agent's configured run-time budget. Use fleet agent budget to see each agent's utilization. This reads from Fleet's local tracking database. Note that the budget is run-time only — Fleet does not track or cap token consumption.

fleet agent budget

Choose models deliberately per agent

Not every agent needs Opus. Reviewer agents doing structural checks and QA agents running test suites can use Sonnet at lower cost and latency. Reserve Opus for agents that make architectural decisions or write complex algorithms. See the model-per-agent guide for a full breakdown.

# Deliberate model assignment:
# backend-developer (complex logic): claude-opus-4-5
# tech-lead (structural review): claude-sonnet-4-5
# qa-engineer (test execution): claude-sonnet-4-5
# release-manager (scripted workflow): claude-sonnet-4-5

Cap parallelism with max_concurrent

Run time is per agent, but total machine load is the sum of all running agents. Use max_concurrent on each agent to bound how many sessions of that agent the subscription matcher will start at once. A strictly serial developer (the default of 1) and a never-throttled monitor are the two ends of the spectrum.

agents:
  - name: backend-dev
    role: backend-developer
    model: claude-opus-4-5
    max_concurrent: 1
  - name: tech-lead
    role: tech-lead
    model: claude-sonnet-4-5
    max_concurrent: 2

Review the audit trail for runaway sessions

If an agent runs longer than expected, use fleet log to see what it was doing. The decision log shows every significant action, which helps identify where a session burned time (infinite loops, large file reads, repeated retries).

fleet log --agent backend-dev --since 4h

Stop a long-running agent manually

If you need to stop an agent immediately, use fleet agent stop. Fleet terminates the tmux session cleanly.

fleet agent stop backend-dev
# Stop all running agents if things are out of control:
fleet agent stop --all

Common pitfalls

Fleet's run-time tracking covers Fleet-managed sessions only. A Claude Code session you started manually outside Fleet is not tracked.
Fleet measures wall-clock run time, not active compute time. An agent waiting for a long test suite to run still accrues run-time seconds.
Fleet does not meter tokens. If you need a hard spend ceiling, set an account-level spending limit in your Anthropic dashboard — Fleet's budget governs run time, not dollars.
Stopping an agent mid-task can leave branches in a broken state. Check the repository after a manual stop and clean up partial commits or open PRs.

When Fleet is the right tool

Run-time budgets in Fleet are useful the moment you have agents running unattended, because a stuck session can otherwise spin for hours. Pair the run-time budget with an Anthropic account-level spending limit when you also want a hard dollar ceiling — Fleet governs time, your Anthropic account governs spend.