Agent Token Budget

Name: Fleet
Author: Fleet

Token usage is the primary cost driver for AI agents. A developer agent completing a complex feature might use 500,000 tokens in a session; if it gets stuck in a loop or tackles an unexpectedly large task, that number can grow by an order of magnitude when no budget constrains it. Because API usage is billed per token, that growth shows up directly as an unexpected charge.

Budgets also serve a safety function beyond cost control. An agent that exceeds its token budget is either working on something much larger than expected (possibly scope creep) or stuck in a loop. Either way, it is a signal worth investigating. A budget limit forces that investigation to happen before costs become significant.

Budgets can be set at multiple levels: per-session (stop when this task exceeds N tokens), per-day (cap this agent's daily consumption), or per-project (aggregate budget across all agents in a repository). The right level depends on how predictable the agent's workload is and how closely cost is being tracked.
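The three levels can be checked together on every charge. The following is a minimal sketch, assuming token counts are reported after each model call; the names TokenBudget and BudgetStack are hypothetical and do not refer to any existing library or to Fleet.

```python
from dataclasses import dataclass


@dataclass
class TokenBudget:
    """A token limit plus a running usage counter (hypothetical helper)."""
    limit: int
    used: int = 0


@dataclass
class BudgetStack:
    """Tracks one agent's usage against session, daily, and project limits."""
    session: TokenBudget
    daily: TokenBudget
    project: TokenBudget

    def charge(self, tokens: int) -> list[str]:
        """Record usage at every level and return the names of exceeded levels."""
        levels = {
            "session": self.session,
            "daily": self.daily,
            "project": self.project,
        }
        exceeded = []
        for name, budget in levels.items():
            budget.used += tokens
            if budget.used > budget.limit:
                exceeded.append(name)
        return exceeded


# Illustrative limits only; real values should come from measured usage.
stack = BudgetStack(
    session=TokenBudget(limit=500_000),
    daily=TokenBudget(limit=2_000_000),
    project=TokenBudget(limit=10_000_000),
)
first = stack.charge(400_000)   # within every limit
second = stack.charge(200_000)  # pushes the session total to 600,000
```

In this sketch the second charge trips only the session limit, which is the narrowest of the three. A caller would typically stop the current task on a session breach and stop the agent entirely on a daily or project breach.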

How this relates to Fleet

Fleet enforces a per-agent run-duration budget (budget_seconds), not a token budget — it caps cumulative wall-clock execution time rather than token count. Each agent entry can specify a model and a time budget. The fleet agent budget command shows utilization against the configured seconds limit; an agent that exceeds it is stopped and the event is logged to the audit trail. Fleet does not meter tokens directly.
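Fleet's internals are not described here, so the following is only a generic sketch of how a cumulative wall-clock budget could be tracked across runs. The class DurationBudget and its methods are hypothetical and are not Fleet's API; only the name budget_seconds is taken from the text above.

```python
import time


class DurationBudget:
    """Accumulates wall-clock run time against a fixed limit (hypothetical)."""

    def __init__(self, budget_seconds, clock=time.monotonic):
        self.budget_seconds = budget_seconds
        self.elapsed = 0.0
        self._clock = clock        # injectable so the class can be tested
        self._started_at = None

    def start(self):
        """Mark the beginning of a run."""
        self._started_at = self._clock()

    def stop(self):
        """Mark the end of a run and add its duration to the total."""
        if self._started_at is not None:
            self.elapsed += self._clock() - self._started_at
            self._started_at = None

    def utilization(self):
        """Fraction of the budget consumed so far."""
        return self.elapsed / self.budget_seconds

    def exceeded(self):
        return self.elapsed > self.budget_seconds
```

A monotonic clock is used so that system clock adjustments do not distort the measured duration.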

Frequently asked questions

How do I set a reasonable token budget for a coding agent?

Measure first. Run the agent on a representative set of tasks without a budget and observe the distribution of token usage. Set the budget at roughly the 90th percentile of that distribution — high enough that legitimate tasks complete, low enough that outliers are caught. Adjust over time as the task mix changes.
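As a rough sketch of that measurement step, the budget can be derived from recorded per-session token counts with a nearest-rank percentile. The function name percentile_budget and the sample figures are made up for illustration.

```python
import math


def percentile_budget(samples, pct=90):
    """Return the nearest-rank percentile of observed token counts."""
    if not samples:
        raise ValueError("need at least one observed session")
    ordered = sorted(samples)
    # Nearest-rank method: smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical token counts from ten unbudgeted sessions.
observed = [
    120_000, 180_000, 210_000, 250_000, 300_000,
    320_000, 410_000, 480_000, 520_000, 2_400_000,
]
budget = percentile_budget(observed)  # 520_000 for this sample
```

With these figures the nine ordinary sessions fit within the budget, while the 2,400,000-token outlier would be stopped for investigation.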

What happens when an agent hits its token budget mid-task?

Best practice is to stop the agent cleanly, commit any in-progress work to a branch, and log the budget-exceeded event to the audit trail. The human operator or an orchestration layer can then decide whether to restart the agent with a one-time higher budget, split the task, or take over manually. Abandoning work without committing it wastes the tokens already spent producing it.
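The stop-save-log sequence can be sketched as a small control loop. This is an illustration under stated assumptions: each step is modeled as a callable that returns the tokens it consumed, and save_work stands in for whatever commits in-progress work to a branch. The names run_with_budget, RunResult, and save_work are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RunResult:
    completed: bool
    tokens_used: int
    audit_log: list = field(default_factory=list)


def run_with_budget(steps, budget, save_work):
    """Run steps in order; on a budget breach, save work, log, and stop."""
    used = 0
    audit_log = []
    for index, step in enumerate(steps):
        used += step()  # each step reports the tokens it consumed
        if used > budget:
            save_work()  # e.g. commit in-progress changes to a branch
            audit_log.append({
                "event": "budget_exceeded",
                "step": index,
                "tokens_used": used,
                "budget": budget,
            })
            return RunResult(False, used, audit_log)
    return RunResult(True, used, audit_log)


# Three hypothetical steps of 300 tokens each against a 500-token budget.
saved = []
result = run_with_budget(
    steps=[lambda: 300, lambda: 300, lambda: 300],
    budget=500,
    save_work=lambda: saved.append("wip"),
)
```

Because usage is only known after a step finishes, this sketch detects the breach one step late; a stricter variant would estimate a step's cost before running it.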
