Use case

AI Workflows for QA Test Plans

Test plans are where QA time goes to die: hours transcribing a feature spec into cases, permutations, and edge conditions — clerical work that must be done carefully and is therefore done slowly. When the sprint is tight, the plan gets thinner exactly when the feature is riskiest.

And plans written by one person have one person's blind spots. The reviewer who would catch the missing negative-path cases rarely exists; QA reviews everyone else's work and nobody reviews QA's.

How it works with an agent fleet

A Fleet workflow fans out over your feature specs — one draft per spec — then runs a coverage review against each and gates the result on the QA lead's approval.

genflows:
  - name: test-plans
    steps:
      - name: plan
        prompt: "Write a test plan for this spec: happy paths, negative paths, edge conditions, and data permutations."
        corpus: ["specs/**/*.md"]
        for_each: "specs/**/*.md"
        kind: report
        out: plan.md
      - name: coverage
        prompt: "Review the plan against the spec. Flag untested requirements and missing negative paths."
        depends_on: [plan]
        kind: review
        out: gaps.md
      - name: qa-lead-ok
        depends_on: [plan, coverage]
        kind: approval
        out: decision.md

The for_each glob maps the draft step over every matching spec file — ten specs produce ten plans, each with its own artifact. The coverage review is the missing reviewer-for-QA: a separate pass whose only job is finding what the plan didn't cover.
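
To make the fan-out concrete, here is a minimal Python sketch of mapping a glob to one output per matching spec. It is an illustration only, not Fleet's implementation: the `fan_out` helper and the `out/<spec path>/plan.md` layout are assumptions for the example, since the page does not say where per-spec artifacts are written.

```python
from pathlib import Path, PurePosixPath
import tempfile


def fan_out(root: Path, pattern: str, out_name: str) -> dict[str, str]:
    """Map every file matching `pattern` under `root` to its own output path.

    Hypothetical helper: the out/<spec path without suffix>/<out_name>
    layout is an assumption for illustration, not a documented Fleet layout.
    """
    jobs = {}
    for spec in sorted(root.glob(pattern)):
        if not spec.is_file():
            continue
        rel = PurePosixPath(spec.relative_to(root).as_posix())
        jobs[str(rel)] = str(PurePosixPath("out") / rel.with_suffix("") / out_name)
    return jobs


def demo() -> dict[str, str]:
    """Build a throwaway spec tree and fan out over its Markdown files."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name in ("specs/login.md", "specs/billing/refunds.md", "specs/notes.txt"):
            path = root / name
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_text("# spec\n")
        return fan_out(root, "specs/**/*.md", "plan.md")


if __name__ == "__main__":
    for spec, plan in demo().items():
        print(f"{spec} -> {plan}")
```

Two Markdown specs yield two jobs, and the `.txt` file is ignored because it does not match the glob. That is the same one-input, one-artifact relationship the paragraph describes.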

The fleet pattern

Fan-out over specs → per-spec plan → coverage review → QA lead approval. The clerical transcription is automated; the judgment (is this coverage adequate for this risk?) stays with the lead, at the gate.
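
The ordering implied by the `depends_on` fields can be sketched with Python's standard `graphlib` module. This only illustrates how the three steps resolve into a sequence; the page does not describe how Fleet schedules steps internally, and the `STEPS` dictionary simply restates the workflow definition above.

```python
from graphlib import TopologicalSorter

# Step name -> steps it depends on, restated from the workflow definition.
STEPS = {
    "plan": [],
    "coverage": ["plan"],
    "qa-lead-ok": ["plan", "coverage"],
}


def run_order(steps: dict[str, list[str]]) -> list[str]:
    """Return an order in which every step runs after its dependencies."""
    return list(TopologicalSorter(steps).static_order())


if __name__ == "__main__":
    print(run_order(STEPS))  # ['plan', 'coverage', 'qa-lead-ok']
```

Because the approval step depends on both the plan and the coverage review, it can only come last, which is what keeps the lead's judgment at the gate.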

Guardrails that matter here

  • The coverage review reads the original spec, not just the plan — 'requirement 4.2 has no test case' is exactly the flag it exists to raise
  • Approval records which plan revision the lead accepted; when a bug ships, the plan that missed it is on file, reviewable, and improvable
  • Incremental rebuild: unchanged specs are skipped on re-run, so refreshing plans after one spec changes costs only that spec's work
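
One common way to get the incremental behavior in the last guardrail is to fingerprint each input and redo work only when the fingerprint changes. The sketch below assumes SHA-256 content hashing and an in-memory `cache` dictionary. Both are illustrative choices, and the page does not say how Fleet detects unchanged specs.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Content hash used to decide whether a spec changed (assumed scheme)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def stale_specs(specs: dict[str, str], cache: dict[str, str]) -> list[str]:
    """Return the spec paths whose content differs from the cached
    fingerprint, and record the new fingerprints in `cache`."""
    stale = []
    for path, text in sorted(specs.items()):
        digest = fingerprint(text)
        if cache.get(path) != digest:
            stale.append(path)
            cache[path] = digest
    return stale


cache: dict[str, str] = {}
specs = {"specs/login.md": "v1", "specs/refunds.md": "v1"}

first = stale_specs(specs, cache)    # first run: every spec is new
specs["specs/login.md"] = "v2"       # edit a single spec
second = stale_specs(specs, cache)   # only the edited spec is stale
third = stale_specs(specs, cache)    # nothing changed, nothing to redo
```

A production version would probably also fold the prompt and step configuration into the fingerprint, so that changing the instructions invalidates existing plans. That is a design suggestion, not a statement about Fleet.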

Who this is for

QA leads who own test planning across a team, especially where specs already live as files in a repository and plan quality varies with deadline pressure.

Frequently asked questions

Does it execute the tests too?

This workflow produces the plan — the thinking document. Test execution is a different Fleet capability (QA agents in the dev chain run suites against PRs). Teams use both: the workflow for planning rigor, agents for execution.

What if our specs are in a tracker, not in files?

The corpus is file-based — globs over a repository. Teams export or mirror specs into the repo (many already do for versioning). A label-triggered run can also be seeded from a ticket directly.
