
Agent Token Budget

An agent token budget is a preconfigured limit on the total number of tokens an AI agent may consume in a single session or over a given period, used to control cost and prevent runaway execution.

Token usage is the primary cost driver for AI agents. A developer agent completing a complex feature might use 500,000 tokens in a session; if it gets stuck in a loop or takes on an unexpectedly large task, that number can grow by an order of magnitude when no budget constrains it. Because API usage is typically billed per token, that growth shows up directly as an unexpected bill.
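To make the cost impact concrete, here is a rough sketch of the arithmetic. The per-million-token price is a placeholder assumption, not a quote from any provider, and the function name is illustrative; substitute your own rate.

```python
def session_cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Estimate a session's cost from its total token count and a blended price."""
    return tokens / 1_000_000 * usd_per_million_tokens


# Hypothetical blended rate. Real pricing varies by model and usually
# differs between input and output tokens.
ASSUMED_USD_PER_MILLION = 10.0

normal = session_cost_usd(500_000, ASSUMED_USD_PER_MILLION)
runaway = session_cost_usd(5_000_000, ASSUMED_USD_PER_MILLION)  # 10x the tokens
print(f"normal session: ${normal:.2f}, runaway session: ${runaway:.2f}")
```

Under that assumed rate, a tenfold increase in tokens is a tenfold increase in cost, which is why a cap on tokens acts as a cap on spend.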

Budgets also serve a safety function beyond cost control. An agent that exceeds its token budget is either working on something much larger than expected (possibly scope creep) or stuck in a loop. Either way, it is a signal worth investigating. A budget limit forces that investigation to happen before costs become significant.

Budgets can be set at multiple levels: per-session (stop when this task exceeds N tokens), per-day (cap this agent's daily consumption), or per-project (aggregate budget across all agents in a repository). The right level depends on how predictable the agent's workload is and how closely cost is being tracked.
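The three levels can be sketched as a single tracker that checks every scope before accepting new usage. This is a minimal illustration, not any particular tool's implementation: the class, field and method names are all hypothetical, and a real per-project budget would need a store shared across agents rather than one in-memory object.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


class BudgetExceeded(Exception):
    """Raised when recording usage would push a scope past its limit."""


@dataclass
class TokenBudget:
    session_limit: int
    daily_limit: int
    project_limit: int
    session_used: int = 0
    project_used: int = 0
    daily_used: dict = field(default_factory=dict)  # date -> tokens used that day

    def start_session(self) -> None:
        """Reset only the per-session counter; daily and project totals persist."""
        self.session_used = 0

    def record(self, tokens: int, today: Optional[date] = None) -> None:
        """Add usage, or raise BudgetExceeded without changing any counter."""
        today = today or date.today()
        checks = [
            ("session", self.session_used + tokens, self.session_limit),
            ("day", self.daily_used.get(today, 0) + tokens, self.daily_limit),
            ("project", self.project_used + tokens, self.project_limit),
        ]
        for scope, would_be, limit in checks:
            if would_be > limit:
                raise BudgetExceeded(f"{scope} budget exceeded: {would_be} > {limit}")
        self.session_used += tokens
        self.daily_used[today] = self.daily_used.get(today, 0) + tokens
        self.project_used += tokens
```

Checking all scopes before updating any counter keeps the totals consistent when a request is rejected, so the caller can decide what to do next without first undoing partial bookkeeping.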

How this relates to Fleet

Fleet enforces a per-agent run-duration budget (budget_seconds), not a token budget: it caps cumulative wall-clock execution time rather than token count. Each agent entry can specify a model and a time budget. The fleet agent budget command shows utilization against the configured seconds limit. An agent that exceeds the limit is stopped, and the event is logged to the audit trail. Fleet does not meter tokens directly.
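For readers unfamiliar with the idea of a cumulative wall-clock budget, the following sketch shows the general technique. It is not Fleet's implementation or configuration format; only the name budget_seconds comes from the description above, and the class and its methods are hypothetical.

```python
import time


class RunTimeBudget:
    """Accumulates wall-clock run time across runs and compares it to a cap."""

    def __init__(self, budget_seconds: float, clock=time.monotonic):
        self.budget_seconds = budget_seconds
        self._clock = clock          # injectable so the class can be tested
        self.used_seconds = 0.0      # total from completed runs
        self._started_at = None      # set while a run is in progress

    def start(self) -> None:
        self._started_at = self._clock()

    def stop(self) -> None:
        if self._started_at is not None:
            self.used_seconds += self._clock() - self._started_at
            self._started_at = None

    def elapsed(self) -> float:
        """Completed run time plus the time spent in the current run, if any."""
        running = 0.0
        if self._started_at is not None:
            running = self._clock() - self._started_at
        return self.used_seconds + running

    def exceeded(self) -> bool:
        return self.elapsed() > self.budget_seconds
```

A monotonic clock is used so that system clock adjustments cannot shorten or lengthen the measured run time.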

Frequently asked questions

How do I set a reasonable token budget for a coding agent?

Measure first. Run the agent on a representative set of tasks without a budget and observe the distribution of token usage. Set the budget at roughly the 90th percentile of that distribution — high enough that legitimate tasks complete, low enough that outliers are caught. Adjust over time as the task mix changes.
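As a sketch of the measurement step, the function below picks a budget from observed per-task token counts using the nearest-rank percentile method. The sample numbers are invented for illustration, and the function name is hypothetical.

```python
import math


def percentile_budget(samples: list[int], pct: float = 90.0) -> int:
    """Return the nearest-rank percentile of observed token counts."""
    if not samples:
        raise ValueError("need at least one observed session")
    ordered = sorted(samples)
    # Nearest-rank: the smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Illustrative token counts from ten unbudgeted runs, including one outlier.
observed = [120_000, 180_000, 210_000, 250_000, 300_000,
            340_000, 410_000, 480_000, 520_000, 1_900_000]
print(percentile_budget(observed))
```

With these ten samples the 90th percentile lands on the ninth-largest value, so the single runaway session does not inflate the budget.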

What happens when an agent hits its token budget mid-task?

Best practice is to stop the agent cleanly, commit any in-progress work to a branch, and log the budget-exceeded event to the audit trail. The human operator or an orchestration layer can then decide whether to restart the agent with a one-time higher budget, split the task, or take over manually. Discarding uncommitted work wastes the tokens already spent producing it.
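The stop-save-log sequence can be sketched as a small control loop. Everything here is a stand-in: the step costs simulate an agent loop, and save_work and audit are hypothetical callbacks that a real system would replace with a commit to a branch and a write to its audit trail.

```python
from typing import Callable, Iterable


def run_with_budget(
    step_costs: Iterable[int],
    budget: int,
    save_work: Callable[[], None],
    audit: Callable[[dict], None],
) -> str:
    """Consume agent steps until done or until the token budget is exceeded."""
    used = 0
    for cost in step_costs:
        used += cost  # a step's cost is only known after it has run
        if used > budget:
            save_work()  # preserve in-progress work before stopping
            audit({"event": "budget_exceeded", "used": used, "budget": budget})
            return "stopped"
    return "completed"


events = []
saved = []
status = run_with_budget([300, 400, 500], 1000, lambda: saved.append(True), events.append)
print(status, events)
```

Returning a status instead of raising leaves the decision about what happens next (a higher one-time budget, splitting the task, or a manual takeover) to the caller.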
