
Agent Compute Unit

An agent compute unit (ACU) is a normalized measure of the computational resources consumed by an AI agent during a task, combining token usage, tool call volume, and wall-clock execution time into a single comparable metric.

Comparing agent efficiency across different tasks is difficult when raw token counts are the only metric. A task built from many short tool calls (file reads, test executions) typically uses fewer tokens than a task that requires long reasoning traces, yet the wall-clock time and operational complexity of the two may be similar. ACUs address this by combining the dimensions into one weighted composite metric.
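As a rough illustration of such a composite, the sketch below sums each dimension's share of one ACU. The weights and the RunUsage type are hypothetical values chosen for the example, not any provider's actual formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunUsage:
    tokens: int          # total input + output tokens
    tool_calls: int      # number of tool invocations
    wall_seconds: float  # wall-clock execution time


# Hypothetical weights: how much of each raw quantity counts as one ACU.
TOKENS_PER_ACU = 500_000
TOOL_CALLS_PER_ACU = 200
SECONDS_PER_ACU = 900


def acu(usage: RunUsage) -> float:
    """Add up each dimension's fraction of one ACU."""
    return (
        usage.tokens / TOKENS_PER_ACU
        + usage.tool_calls / TOOL_CALLS_PER_ACU
        + usage.wall_seconds / SECONDS_PER_ACU
    )


# A tool-heavy run and a reasoning-heavy run with very different token counts.
tool_heavy = RunUsage(tokens=100_000, tool_calls=120, wall_seconds=540)
reasoning_heavy = RunUsage(tokens=400_000, tool_calls=10, wall_seconds=495)

print(round(acu(tool_heavy), 2), round(acu(reasoning_heavy), 2))
```

With these assumed weights, both runs come out at roughly the same ACU count even though one uses four times the tokens of the other, which is the comparison raw token counts cannot make.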

The term originates with Devin (Cognition AI), whose ACU is a normalized billing unit, roughly 15 minutes of active autonomous work, that combines VM time, model inference, and network use. Beyond Devin, the specific definition of an ACU varies by provider and tooling. Some systems define it as tokens multiplied by a complexity factor; others use a cost-normalized unit tied to billing rates. The important property is consistency: the same task should produce approximately the same ACU count across runs, so unexpected variance in ACU consumption is a signal worth investigating.
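One way to check the consistency property is to measure how much ACU counts spread across repeated runs of the same task. This is a minimal sketch using the coefficient of variation; the 0.25 threshold and the function names are assumptions for illustration.

```python
from statistics import mean, stdev
from typing import Sequence


def acu_variation(samples: Sequence[float]) -> float:
    """Coefficient of variation: sample standard deviation divided by the mean."""
    if len(samples) < 2:
        raise ValueError("need at least two runs to measure variation")
    avg = mean(samples)
    if avg == 0:
        raise ValueError("mean ACU is zero; variation is undefined")
    return stdev(samples) / avg


def needs_investigation(samples: Sequence[float], threshold: float = 0.25) -> bool:
    """Flag a task whose ACU counts vary more than the (assumed) threshold."""
    return acu_variation(samples) > threshold


stable_runs = [1.40, 1.35, 1.45, 1.42]   # same task, similar ACU each run
erratic_runs = [1.40, 0.90, 2.60, 1.10]  # same task, widely varying ACU

print(needs_investigation(stable_runs), needs_investigation(erratic_runs))
```

A relative measure is used so that one threshold can apply to both small and large tasks; an absolute spread would penalize expensive tasks.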

For fleet operators, ACUs are useful for capacity planning (how many tasks can run in parallel given a compute budget), cost allocation (which projects or teams are consuming the most compute), and efficiency benchmarking (is agent B completing similar tasks with fewer ACUs than agent A).
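The first two uses reduce to simple arithmetic once runs are measured in ACUs. The sketch below assumes a budget expressed in ACUs per hour and run records shaped as (team, ACUs) pairs; both shapes are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def max_parallel_tasks(budget_acu_per_hour: float, acu_per_task_hour: float) -> int:
    """Capacity planning: how many tasks fit in the budget at a given burn rate."""
    if acu_per_task_hour <= 0:
        raise ValueError("per-task consumption must be positive")
    return math.floor(budget_acu_per_hour / acu_per_task_hour)


def acu_by_team(runs: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Cost allocation: total ACUs consumed per team."""
    totals: Dict[str, float] = defaultdict(float)
    for team, acus in runs:
        totals[team] += acus
    return dict(totals)


runs = [("payments", 1.5), ("search", 2.0), ("payments", 0.5)]

print(max_parallel_tasks(budget_acu_per_hour=40, acu_per_task_hour=3.5))
print(acu_by_team(runs))
```

Efficiency benchmarking follows the same pattern: group runs by agent instead of by team and compare the per-task averages for similar task types.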

How this relates to Fleet

Fleet tracks per-agent run time and reports token usage, which together approximate an ACU measure. The fleet agent budget command surfaces current utilization against configured limits. Fleet's per-agent model configuration lets operators assign cheaper models to cost-sensitive roles and more capable models to complexity-intensive ones, lowering the ACUs spent per unit of output quality.

Frequently asked questions

How do I reduce agent compute unit consumption without reducing output quality?

Three approaches are most effective: using a smaller model for tasks that do not require maximum capability (code review of simple changes, documentation updates), reducing context size by providing only the most relevant file sections rather than entire files, and breaking large tasks into smaller subtasks that each require less context. Prompt efficiency (removing redundant instructions) also helps, because redundant instructions are paid for in tokens on every call.
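Providing only the relevant file sections can be as simple as keeping a few lines around each match for a keyword. The helper below is a hypothetical sketch of that idea, not a Fleet feature; real systems may select context by symbol, diff hunk, or embedding search instead.

```python
def relevant_sections(text: str, keyword: str, context: int = 2) -> str:
    """Keep only lines containing `keyword`, plus `context` lines on each side.

    Each kept line is prefixed with its 1-based line number so the agent can
    still refer to positions in the original file.
    """
    lines = text.splitlines()
    keep = set()
    for i, line in enumerate(lines):
        if keyword in line:
            start = max(0, i - context)
            end = min(len(lines), i + context + 1)
            keep.update(range(start, end))
    return "\n".join(f"{i + 1}: {lines[i]}" for i in sorted(keep))


# A 20-line stand-in for a source file.
source = "\n".join(f"line {n}" for n in range(1, 21))

excerpt = relevant_sections(source, "line 10", context=1)
print(excerpt)
```

Here the agent receives 3 lines instead of 20, which shrinks the input tokens for the call roughly in proportion.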

Should I track ACUs per task or per agent over time?

Both are useful. Per-task ACUs let you identify which task types are expensive and whether a cheaper model can handle them. Per-agent ACUs over time reveal trends: a model update, a prompt change, or a shift in task mix often shows up as a step change in per-agent ACU consumption before it appears in quality metrics.
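A step change in per-agent consumption can be spotted by comparing the most recent window of daily totals with the window before it. This is a minimal sketch; the 7-day window and 20% threshold are assumed defaults, and the function name is illustrative.

```python
from statistics import mean
from typing import Optional, Sequence


def step_change(
    daily_acu: Sequence[float], window: int = 7, threshold: float = 0.2
) -> Optional[float]:
    """Return the relative change between the last two windows if it is large.

    Returns None when there is too little history, the earlier window averages
    zero, or the change stays under the threshold.
    """
    if len(daily_acu) < 2 * window:
        return None
    before = mean(daily_acu[-2 * window:-window])
    after = mean(daily_acu[-window:])
    if before == 0:
        return None
    change = (after - before) / before
    return change if abs(change) >= threshold else None


# Seven days at 10 ACUs per day, then seven days at 13 ACUs per day.
shifted = [10.0] * 7 + [13.0] * 7
flat = [10.0] * 14

print(step_change(shifted), step_change(flat))
```

Comparing window averages rather than single days keeps one unusually expensive run from triggering a false alarm.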
