Metrics

Measurement unit: from “usage” to “task”

A customer using the product 5 times a day but failing every time is more at risk than one using it once a week with all successes — MAU, DAU, session length cannot express this difference. Agent product measurement must shift from “usage” to the task: did the agent accomplish what it was asked to do, how well, and how much business value it produced.

The structural reason: every task execution consumes real token and infrastructure cost (see economics/cost-model). More usage means higher marginal cost. A product with high MAU but low task completion rate is substantively worse than one with slightly lower MAU but higher task quality — a health alarm that traditional usage metrics cannot raise.

Three-layer framework

From coarsest to finest:

Volume layer

How much work the agent does.

Task initiation count — tasks initiated per user per month; reflects demand penetration
Task completion count — tasks that finished successfully; ratio to initiation gives completion rate

Quality layer

How well the agent does.

Completion rate — proportion of initiated tasks that ended successfully; direct measure of agent reasoning quality
HITL intervention rate — proportion of tasks requiring human fallback; lower means closer to full autonomy
Error recovery rate — proportion of failed tasks rescued by retry or fallback; agent resilience measure
First-pass accuracy — proportion of tasks correct on first execution

Value layer

What the agent contributes to the business.

Hours saved — human time displaced per task multiplied by task volume
Per-task value — human labor cost saved minus agent cost (the “saving per task” field from economics/controls-and-roi)
Customer ROI — customer-side total saving divided by total investment

Volume is the base, quality is the health signal, value closes the ROI loop. Missing any layer distorts decisions — for instance, tracking only volume without completion rate steers the team toward driving raw task count regardless of failure rate.

North-star selection

No single north-star fits all agent products — the choice depends heavily on business model. See north-star.

Unit economics specifics

Traditional SaaS LTV / CAC formulas do not fit agent products — non-zero marginal cost means gross margin must be amortized at the task level, and users at different usage intensity have vastly different margin structures. See unit-economics.

Cross-section connections

Value layer data depends on the cost decomposition in economics/cost-model
Volume and quality layer collection aligns with operations/dashboards
North-star design couples directly with the activation path in growth