
ML Engineer AI Agent (Template)

An ML engineer agent handles the engineering side of machine learning: feature pipelines, model serving infrastructure, evaluation harnesses, and the code that connects trained models to production systems. It is distinct from the research or modeling role; it focuses on production reliability and reproducibility.

ML engineering is highly stack-specific. Whether you track experiments with MLflow, Weights & Biases, or a custom tracker, and whether you serve with TorchServe, Triton, or a FastAPI wrapper, the agent needs to know your stack to produce usable code. The role-specific prompt is where that context lives.

What this agent owns

  • Build and maintain feature pipelines that feed model training and inference
  • Write model serving code, endpoint definitions, and input validation logic
  • Implement evaluation harnesses that run on each model version before promotion
  • Write and maintain monitoring code that detects drift or performance degradation and surfaces it through the project alerting pipeline
  • Manage experiment tracking and model versioning in the project's chosen registry
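
To make the "evaluation harness before promotion" responsibility concrete, here is a minimal sketch of a promotion gate in Python. It assumes higher-is-better metrics held in plain dictionaries; the names `promotion_gate`, `GateResult`, and `max_regression` are hypothetical and not part of Fleet or any registry API. A real harness would read these metrics from your own tracker.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class GateResult:
    """Outcome of comparing a candidate model against the current baseline."""
    promote: bool
    reasons: List[str]


def promotion_gate(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    max_regression: float = 0.01,
) -> GateResult:
    """Block promotion if any baseline metric is missing or regresses too far.

    Assumption: every metric is higher-is-better (e.g. AUC, accuracy).
    """
    reasons: List[str] = []
    for name, base_value in baseline.items():
        if name not in candidate:
            reasons.append(f"missing metric: {name}")
            continue
        # Allow a small tolerance so noise does not block every release.
        if candidate[name] < base_value - max_regression:
            reasons.append(
                f"{name} regressed: {candidate[name]:.4f} < {base_value:.4f}"
            )
    return GateResult(promote=not reasons, reasons=reasons)
```

Calling `promotion_gate({"auc": 0.90}, {"auc": 0.85})` returns a result with `promote` set to `False` and one reason naming the regressed metric, which a pipeline step could log before exiting non-zero.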

Recommended model: Claude Sonnet

ML engineering is largely structured pipeline and serving code (feature pipelines, drift checks, canary deployments), which Sonnet handles accurately at lower cost. Use Opus selectively for judgment calls on model-evaluation methodology.

Example tasks

  • Write a feature pipeline that computes rolling aggregates for a recommendation model
  • Add a data drift check to a model evaluation job using population stability index
  • Implement a model canary deployment that routes 10% of traffic to the new version
  • Build an evaluation harness that compares two model versions on a held-out dataset
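
As an illustration of the drift-check task above, the following is a rough sketch of a population stability index (PSI) computation using only the Python standard library. It assumes equal-width bins derived from the reference sample and a small `eps` floor to avoid division by zero; the function name and defaults are hypothetical, and the output of the agent would depend on your stack and prompt.

```python
import math
from typing import List, Sequence


def population_stability_index(
    expected: Sequence[float],
    actual: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI between a reference sample (expected) and a new sample (actual).

    Assumption: equal-width bins spanning the range of the reference sample.
    Values in `actual` outside that range are clamped into the edge bins.
    """
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(expected), max(expected)
    if hi == lo:
        raise ValueError("reference sample has zero range")
    width = (hi - lo) / bins

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each proportion at eps so the log ratio stays finite.
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

A commonly cited rule of thumb treats PSI below 0.1 as stable and above 0.25 as significant drift, but those cutoffs are conventions rather than guarantees, so an evaluation job should take the threshold as a configurable parameter.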
# create an agent from this template, then start it
$ fleet agent create --name ml-engineer --vendor claude-code --template <template-name>
$ fleet agent start ml-engineer

Find the exact template name with fleet template list.

Run this agent in your fleet

One binary. Five minutes. See every agent, coordinate every handoff, and keep a full audit trail of what your fleet did.