Autonomous Coding Agent

Name: Fleet
Author: Fleet

Modern autonomous coding agents combine a large language model with tool access: file read/write, terminal execution, browser access, and version control operations. Given a task like "add pagination to the user list endpoint," a capable agent will read the relevant source files, write the implementation, run the test suite, fix any failures, and open a pull request — all without step-by-step human guidance.

Autonomy exists on a spectrum. A lightly autonomous agent drafts code and waits for human approval at each step. A fully autonomous agent loops until the task is complete or it hits a configured budget or time limit. Most production deployments sit somewhere in the middle: autonomous execution within a bounded scope, with human review at key gates like pull request approval.
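The "loop until complete or until a limit is hit" behavior can be sketched in a few lines. This is a minimal illustration, not any particular product's implementation: `run_bounded`, `LoopResult`, and the `step` callback are hypothetical names, and a real agent step would call a model and its tools rather than return a canned boolean.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class LoopResult:
    completed: bool
    steps: int
    reason: str


def run_bounded(
    step: Callable[[], bool],
    max_steps: int,
    max_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> LoopResult:
    """Call step() until it reports success or a step/time limit is reached."""
    start = clock()
    for n in range(1, max_steps + 1):
        # Check the time limit before starting another step.
        if clock() - start > max_seconds:
            return LoopResult(False, n - 1, "time limit")
        if step():
            return LoopResult(True, n, "task complete")
    return LoopResult(False, max_steps, "step budget")


# Stand-in for an agent step: fails twice (say, failing tests), then succeeds.
attempts = iter([False, False, True])
result = run_bounded(lambda: next(attempts), max_steps=10, max_seconds=60.0)
print(result)
```

A lightly autonomous agent corresponds to `max_steps=1` with a human deciding whether to call it again; raising the limits moves the agent toward the fully autonomous end of the spectrum.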

The main risks of autonomous coding agents are scope creep (the agent keeps changing things beyond the original task), silent failures (tests pass but the implementation is wrong), and runaway cost (many tool calls add up to a large token bill). Governance controls such as bounded scope, human review gates, and budgets address all three.

How this relates to Fleet

Fleet treats autonomous coding agents as workers in a role-based team. Rather than running a single agent autonomously on an entire codebase, Fleet constrains each agent to a defined role with a specific scope, run-time budget (cumulative seconds), and risk model. When an agent's risk reaches critical, Fleet quarantines it before it can cause further damage; when it exceeds its run-time budget, Fleet stops it.
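The rules in this paragraph can be pictured as a small decision function. The sketch below is illustrative only and is not Fleet's actual API: `Role`, `AgentStatus`, `Risk`, `in_scope`, and `decide` are hypothetical names, the risk levels other than "critical" are assumed, and checking quarantine before the budget is an arbitrary ordering chosen for the example.

```python
from dataclasses import dataclass
from enum import IntEnum


class Risk(IntEnum):
    # Only "critical" is named in the text; the lower levels are assumptions.
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3


@dataclass(frozen=True)
class Role:
    name: str
    allowed_paths: tuple[str, ...]  # scope: path prefixes the agent may touch
    runtime_budget_s: float         # cumulative seconds the agent may run


@dataclass
class AgentStatus:
    role: Role
    seconds_used: float = 0.0
    risk: Risk = Risk.LOW


def in_scope(role: Role, path: str) -> bool:
    """Return True if path falls under one of the role's allowed prefixes."""
    return any(path.startswith(prefix) for prefix in role.allowed_paths)


def decide(status: AgentStatus) -> str:
    """Map an agent's current status to 'quarantine', 'stop', or 'continue'."""
    if status.risk >= Risk.CRITICAL:
        return "quarantine"
    if status.seconds_used > status.role.runtime_budget_s:
        return "stop"
    return "continue"


backend = Role("backend", allowed_paths=("api/",), runtime_budget_s=600.0)
print(decide(AgentStatus(backend, seconds_used=120.0)))
print(decide(AgentStatus(backend, seconds_used=601.0)))
print(decide(AgentStatus(backend, risk=Risk.CRITICAL)))
```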

Frequently asked questions

Which AI models are used in autonomous coding agents?

Claude (Anthropic), GPT-4 and Codex (OpenAI), and Gemini (Google) are the most common model backends. The model is one component; the tooling harness — how the agent reads files, runs commands, and interacts with Git — matters as much as the model for real-world coding tasks.

How do you prevent an autonomous coding agent from breaking production?

The standard controls are to confine agent work to branches (never main), require pull request review before merge, enforce a test-pass gate, set a token budget to cap cost and cut off runaway loops, and monitor the agent's risk score based on what it is changing. Approval gates at merge time provide a final human checkpoint before code reaches production.
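As a rough sketch, these controls can be combined into a single pre-merge check that lists every reason a change should be held back. Everything here is assumed for illustration: `ChangeRequest`, `merge_blockers`, the 0.0 to 1.0 risk scale, and the 0.7 threshold are hypothetical, and in practice branch protection and review rules are usually enforced by the Git hosting platform rather than by custom code.

```python
from dataclasses import dataclass

# Branches the agent must never commit to directly.
PROTECTED_BRANCHES = {"main"}


@dataclass
class ChangeRequest:
    branch: str
    tests_passed: bool
    human_approved: bool
    tokens_used: int
    token_budget: int
    risk_score: float  # assumed scale: 0.0 (safe) to 1.0 (dangerous)


def merge_blockers(cr: ChangeRequest, max_risk: float = 0.7) -> list[str]:
    """Return the reasons this change may not merge; an empty list means go."""
    blockers = []
    if cr.branch in PROTECTED_BRANCHES:
        blockers.append("work was done directly on a protected branch")
    if not cr.tests_passed:
        blockers.append("test suite did not pass")
    if not cr.human_approved:
        blockers.append("pull request lacks human approval")
    if cr.tokens_used > cr.token_budget:
        blockers.append("token budget exceeded")
    if cr.risk_score > max_risk:
        blockers.append("risk score above threshold")
    return blockers


change = ChangeRequest(
    branch="agent/pagination",
    tests_passed=True,
    human_approved=False,
    tokens_used=40_000,
    token_budget=100_000,
    risk_score=0.2,
)
print(merge_blockers(change))
```

Returning every blocker rather than failing on the first gives the reviewer the full picture in one pass.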
