Runtime Lifecycle

The full timeline of a single agent.stream() call — 12 callback firing points across three layers, two-layer same-named callback comparison, stopWhen / timeout default chains

At a glance

A single agent.stream() call involves 3 API layers, 12 callbacks, 4 message chains, and 2 pairs of same-named callbacks. This page pins each one to the timeline.

| Key number | Value |
| --- | --- |
| Total callbacks | 12 (deduplicated) |
| Same-named callback pairs | 2 (onStepFinish × 2 / onFinish × 2) |
| stopWhen default (L1) | stepCountIs(20) |
| stopWhen default (L2 streamText) | stepCountIs(1) |
| Timeout tiers | 3 (totalMs / stepMs / chunkMs) |
| Pinned SDK version | ai@6.0.134 |

Three-layer capability matrix

Build the map before reading the timeline — for every parameter/callback, know which layer it belongs to:

| | L1: new ToolLoopAgent({...}) | L2: agent.stream({...}) | L3: result.toUIMessageStream({...}) |
| --- | --- | --- | --- |
| Role | Static config (define agent) | Single invocation (trigger one run) | Downstream consumer (transform result to UI stream) |
| Lifecycle | Constructed once, reused | Called once per invocation | Called once per invocation |
| Structural params | id, model, instructions, tools, experimental_context, providerOptions | messages / prompt, abortSignal, timeout | originalMessages, generateMessageId, sendReasoning, sendSources, sendStart, sendFinish |
| Behavior hooks | stopWhen, prepareStep, prepareCall | experimental_transform | — |
| Callbacks (firing order) | experimental_onStart → prepareStep → experimental_onStepStart → experimental_onToolCallStart → experimental_onToolCallFinish → onStepFinish → onFinish | Same as L1 (merged with L1 same-named callbacks; settings fires first) | messageMetadata → onStepFinish → onFinish → onError |

L1 and L2 same-named callbacks are merged: if both layers set onStepFinish, L1’s fires first, then L2’s (source: dist/index.js:8224-8232). L3’s same-named callbacks are entirely independent — they don’t merge with L1/L2 and have different payloads.
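The merge semantics can be sketched in a few lines. This is a simplified stand-in for the SDK's internal wrapper, not its actual code:

```typescript
// Simplified stand-in for the SDK's merge (not its actual code): when both
// layers define onStepFinish, the wrapper runs the L1 (settings) callback
// before the L2 (call options) callback.
type StepCallback = (stepResult: unknown) => void | Promise<void>;

function mergeStepCallbacks(
  fromSettings?: StepCallback, // L1: ToolLoopAgent settings
  fromOptions?: StepCallback,  // L2: agent.stream() options
): StepCallback {
  return async (stepResult) => {
    await fromSettings?.(stepResult); // L1 fires first
    await fromOptions?.(stepResult);  // then L2
  };
}

// Observe the firing order:
const order: string[] = [];
const merged = mergeStepCallbacks(
  () => { order.push("L1"); },
  () => { order.push("L2"); },
);
await merged({});
console.log(order); // ["L1", "L2"]
```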

L3’s other entry point: this page’s L3 column focuses on the result.toUIMessageStream() transform path (passively consuming the agent’s result). L3 has another entry point — createUIMessageStream({ execute }) — the execute-driven path, which can actively push custom events and merge multiple streams alongside the agent’s. Both share the same underlying handleUIMessageStreamFinish (index.js:8100 / :8397), so onStepFinish / onFinish / onError fire at the same moments described on this page; but messageMetadata exists only on toUIMessageStream. The execute-driven path is covered in detail in UI Stream Orchestration.

Full timeline diagram

A single N-step agent.stream() call, time flows downward:

```mermaid
sequenceDiagram
  actor Caller
  participant L1 as ToolLoopAgent<br/>(settings)
  participant L2 as streamText<br/>(loop)
  participant L3 as toUIMessageStream<br/>(pipe)
  Caller->>L1: new ToolLoopAgent({ ... })
  Note over L1: Stores settings only<br/>No callbacks fired
  Caller->>L1: agent.stream({ messages, ... })
  L1->>L1: prepareCall(baseCallArgs)
  Note over L1: Upstream hook, called once<br/>per invocation (rare)
  L1->>L2: streamText(mergedArgs)
  L2-->>Caller: experimental_onStart()
  Note over L2: Once, globally
  loop N steps
    L2->>L2: stepInputMessages =<br/>[...initialMessages, ...responseMessages]
    L2-->>Caller: prepareStep({ messages, steps, stepNumber, model })
    Note over L2: May return { messages, system,<br/>model, toolChoice, activeTools,<br/>experimental_context }
    L2-->>Caller: experimental_onStepStart()
    L2->>L2: Model stream begins<br/>"start-step" chunk
    L2->>L3: "start-step"
    L2->>L3: text-delta / reasoning-delta / ...
    loop per tool call
      L2-->>Caller: experimental_onToolCallStart()
      L2->>L2: tool.execute(input,<br/>{ abortSignal, experimental_context,<br/>messages, toolCallId })
      L2-->>Caller: experimental_onToolCallFinish()
    end
    L2->>L2: "finish-step" chunk
    L2->>L3: "finish-step"
    L2-->>Caller: onStepFinish(stepResult) [L1/L2]
    Note over L2: payload: { stepNumber, content,<br/>toolCalls, toolResults,<br/>finishReason, usage, response }
    L3->>L3: each chunk through transform
    L3-->>Caller: messageMetadata({ part })
    Note over L3: Fires per part — hot path<br/>No I/O here
    L3->>L3: "finish-step" chunk passes
    L3-->>Caller: onStepFinish [L3]
    Note over L3: payload: { responseMessage,<br/>messages, isContinuation }
    L2->>L2: isStopConditionMet?<br/>break or continue
  end
  L2->>L2: "finish" chunk<br/>aggregate totalUsage
  L2->>L3: "finish"
  L2-->>Caller: onFinish({ ... }) [L1/L2]
  Note over L2: payload: { finishReason,<br/>totalUsage, steps, content,<br/>response, request, warnings }
  L2->>L3: fullStream closes
  L3->>L3: flush()
  L3-->>Caller: onFinish({ ... }) [L3]
  Note over L3: payload: { responseMessage,<br/>messages, isContinuation,<br/>isAborted, finishReason }
  opt on any error
    L3-->>Caller: onError(error)
  end
```

Three critical observations:

  1. L1/L2 callbacks and L3 callbacks run concurrently — L2 pushes chunks to fullStream while L3’s pipe transforms them. So L1/L2 onStepFinish(n) and L3 onStepFinish(n) happen nearly simultaneously, but as independent event-loop tasks.
  2. L3 onFinish always fires later than L1/L2 onFinish — L3 is a downstream transform, it must wait for fullStream close + consumer drain before flushing. For “after-run” work: use L1/L2 onFinish for engine-side cleanup (token tallying, sandbox close), L3 onFinish for UI-side persistence (saving the assistant message).
  3. messageMetadata runs per chunk — including every text-delta and tool-input-delta. A long multi-tool response can easily emit 1000+ chunks; any synchronous I/O here directly stalls the stream.
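Given observation 3, a hot-path-safe messageMetadata stays synchronous and returns early for delta chunks. A minimal sketch, assuming a simplified part shape (real parts carry more fields):

```typescript
// Hot-path-safe sketch: synchronous, no I/O, and metadata only on the rare
// control parts; every text-delta / tool-input-delta falls through as
// undefined. The Part shape here is a simplified assumption.
type Part = { type: string };

function messageMetadata({ part }: { part: Part }) {
  if (part.type === "finish") {
    // Cheap synchronous values only: no DB reads, no fetches.
    return { finishedAt: Date.now() };
  }
  return undefined; // deltas pass through untouched
}
```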

Callback firing reference

In firing order. L1 = ToolLoopAgent settings, L2 = streamText (direct pass-through from agent.stream), L3 = toUIMessageStream.

| # | Callback | Layer | When | Payload | Use for |
| --- | --- | --- | --- | --- | --- |
| 1 | prepareCall(baseCallArgs) | L1 | Before each agent.stream() starts, after params merge | Full call args; returns overrides | Dynamic model/tools/stopWhen rewrite |
| 2 | experimental_onStart() | L1+L2 | After streamText starts, before first step | None | Init logging/timing |
| 3 | prepareStep({...}) | L1+L2 | Before each step's model call | { messages, steps, stepNumber, model } | Compaction, reminder injection, activeTools filtering, model switching |
| 4 | experimental_onStepStart() | L1+L2 | Before each step's model stream (after prepareStep) | None | Per-step timing marker |
| 5 | experimental_onToolCallStart() | L1+L2 | Before each tool.execute | { toolCall } | Permission audit, pre-retry logic |
| 6 | experimental_onToolCallFinish() | L1+L2 | After each tool.execute | { toolCall, toolResult } | Observability, cache writeback |
| 7 | onStepFinish(stepResult) | L1+L2 | Each step end, after finish-step chunk emitted | StepResult: full step detail | Token tallying, step-level persistence |
| 8 | messageMetadata({ part }) | L3 | Each chunk passing through UI transform | { part } (current chunk) | Attach metadata to UI control chunks |
| 9 | onStepFinish | L3 | Each finish-step chunk passing through UI transform | { responseMessage, messages, isContinuation } | UI-side step-level persistence |
| 10 | onFinish({...}) | L1+L2 | After all steps done, after finish chunk emitted | { finishReason, totalUsage, steps, ... } | Engine-side settlement, cleanup |
| 11 | onFinish({...}) | L3 | After UI stream drain / cancel | { responseMessage, messages, isContinuation, isAborted, finishReason } | UI-side message persistence |
| 12 | onError(error) | L3 | UI transform error / error chunk / onStepFinish throw | Error or string | SSE error serialization; the returned string is written into the error chunk's errorText field sent to the client |

onFinish throws do NOT route here: callOnFinish (index.js:5927-5943) is a bare await with no try/catch. An onFinish throw propagates out through TransformStream’s flush(), rejecting the consumer iterator — it does not invoke onError. Any production onFinish must wrap its own try/catch/finally inside the callback. The full error-capture tiering is covered in UI Stream Orchestration — Error capture.
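A defensive pattern that follows from this is to wrap the callback body yourself, so a persistence failure is logged instead of rejecting the consumer iterator. A sketch; the safeOnFinish helper is hypothetical, not part of the SDK:

```typescript
// Hypothetical safeOnFinish helper (not part of the SDK): wraps an onFinish
// body so a throw is logged instead of escaping through flush().
type FinishCallback<T> = (payload: T) => void | Promise<void>;

function safeOnFinish<T>(body: FinishCallback<T>): FinishCallback<T> {
  return async (payload) => {
    try {
      await body(payload);
    } catch (err) {
      // Route to your own logger; never let the error escape the callback.
      console.error("onFinish failed:", err);
    }
  };
}

// Usage: onFinish: safeOnFinish(async (payload) => { /* persist */ })
```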

Same-named callbacks — the biggest trap

onStepFinish: L1/L2 vs L3

| | L1/L2 (streamText) | L3 (UI stream) |
| --- | --- | --- |
| When | Step loop ends, after finish-step emit | finish-step chunk through UI transform |
| Payload | StepResult: { stepNumber, content, text, toolCalls, toolResults, finishReason, usage, response, request, ... } | { responseMessage: UIMessage, messages: UIMessage[], isContinuation } |
| What you see | Engine view: raw step output (tool call objects, usage breakdown) | Consumer view: accumulated UI message (assistant message structure) |
| Use for | Token tallying (billing), step-level logging, driving compaction / context trimming | Incremental UI message persistence, prefetching |

onFinish: L1/L2 vs L3

| | L1/L2 (streamText) | L3 (UI stream) |
| --- | --- | --- |
| When | After finish chunk emit, before fullStream close | After fullStream close + UI transform flush |
| Payload | { finishReason, totalUsage, steps, content, text, reasoningText, toolCalls, toolResults, response, request, warnings, providerMetadata } | { responseMessage, messages, isContinuation, isAborted, finishReason } |
| Ordering | Earlier (upstream) | Later (downstream drain) |
| Use for | Engine-level one-shot settlement: write total usage to DB, close sandbox, commit compaction checkpoint | UI-level one-shot settlement: persist final assistant message, notify client of completion |

L1/L2 onFinish closure trap: the L1/L2 onFinish payload carries the entire steps array — every step’s full StepResult (content, toolCalls, toolResults, request, response, all of it). If your callback closure captures this payload and pins it on a long-lived reference (e.g. storing it on an outer session object), the entire large object graph from this invocation is held and never GC’d. Long chat conversations amplify this — 20 steps of cumulative StepResult easily reaches hundreds of MB.

Practical guidance:

  • Short settlement logic (token counting, step logs) can live in L1/L2 onFinish — closure releases right after the callback returns
  • Long settlement logic (persistence, background jobs, checkpoint writes) should prefer L3 onFinish; its payload is the folded responseMessage + messages, orders of magnitude smaller
  • Or: use L1/L2 onStepFinish to incrementally collect only what you need into a small variable (numbers / strings / ids only, never the StepResult itself), then settle on that small variable in L3 onFinish
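The third bullet's incremental-collection pattern can be sketched as follows; the StepUsage shape is a simplified stand-in for the SDK's usage object:

```typescript
// Incremental collection: L1/L2 onStepFinish copies only small scalars out
// of each StepResult, so the large step objects stay garbage-collectable;
// L3 onFinish then settles on the small accumulator.
type StepUsage = { inputTokens: number; outputTokens: number };

const stepUsages: StepUsage[] = [];

// L1/L2 onStepFinish: keep numbers only, never the StepResult itself.
function collectStepUsage(stepResult: { usage: StepUsage }): void {
  stepUsages.push({
    inputTokens: stepResult.usage.inputTokens,
    outputTokens: stepResult.usage.outputTokens,
  });
}

// L3 onFinish: settle on the small accumulator.
function settle() {
  const totalTokens = stepUsages.reduce(
    (sum, u) => sum + u.inputTokens + u.outputTokens,
    0,
  );
  return { steps: stepUsages.length, totalTokens };
}
```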

stopWhen default chain — the second trap

Same parameter name, different defaults at L1 and L2:

```
L1 new ToolLoopAgent({ stopWhen? })  default: stepCountIs(20)   ← dist/index.js:8210
L2 streamText({ stopWhen? })         default: stepCountIs(1)    ← dist/index.js:6452
```

ToolLoopAgent.stream() forwards L1’s stopWhen (default 20) to streamText, so the normal path caps the agent at 20 steps.

But if you call streamText(...) directly without setting stopWhen, your agent runs for exactly one step — one tool call and it halts. Classic beginner trap.

Built-in stop conditions (combinable: stopWhen: [stepCountIs(N), hasToolCall('complete')]):

| Factory | Semantics |
| --- | --- |
| stepCountIs(N) | Stop after reaching step N |
| hasToolCall(name) | Stop after a tool call with the given name |

Typical production combo: [stepCountIs(N), hasToolCall('complete')] — the number is a hard ceiling (prevents the agent from spinning in a loop), hasToolCall('complete') is the “task finished” signal (lets the agent declare its own end). Pick N based on task complexity: simple Q&A 10-20, general assistant 30-50, deep research / multi-step editing 50-100.
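The combination has OR semantics: the loop stops as soon as any condition matches the finished step. A sketch with simplified stand-ins for the SDK's stepCountIs / hasToolCall factories (the real ones receive richer step data):

```typescript
// Simplified stand-ins for the stop-condition factories; an array of
// conditions behaves as OR: any match ends the loop.
type Step = { stepNumber: number; toolCalls: { toolName: string }[] };
type StopCondition = (step: Step) => boolean;

const stepCountIs = (n: number): StopCondition =>
  (step) => step.stepNumber >= n; // hard ceiling

const hasToolCall = (name: string): StopCondition =>
  (step) => step.toolCalls.some((call) => call.toolName === name);

const stopWhen = [stepCountIs(50), hasToolCall("complete")];
const shouldStop = (step: Step) => stopWhen.some((cond) => cond(step));

shouldStop({ stepNumber: 3, toolCalls: [{ toolName: "complete" }] }); // true
shouldStop({ stepNumber: 3, toolCalls: [{ toolName: "search" }] });   // false
```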

Three-tier timeout — the third trap

timeout is an object with three granularities:

```ts
agent.stream({
  timeout: {
    totalMs: 600_000,   // Entire invocation: 10 min
    stepMs: 300_000,    // Single step: 5 min (model + tools combined)
    chunkMs: 120_000,   // Gap between two chunks: 2 min
  },
});
```

All three are independent. Any one tripping aborts via AbortSignal.timeout() (dist/index.js:6483-6495).

| Dimension | Watches for | Typical scenario |
| --- | --- | --- |
| totalMs | Absolute call duration | Long-task total ceiling |
| stepMs | Single step from prepareStep to finish-step | Model response stalls |
| chunkMs | Gap between two adjacent chunks | Mid-stream hang (TCP half-open, slow provider thinking phase) |

Tutorials usually only mention totalMs — but chunkMs is the lifesaver in production. A model stream can emit two lines then hang (TCP half-open from the provider, a thinking phase taking too long, etc.); totalMs is nowhere near, but chunkMs terminates it immediately. A solid default in production is chunkMs: 120_000 (2 minutes) — enough to cover long reasoning-model thinking gaps, but not so long that a real disconnect quietly hangs forever.
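The chunkMs check reduces to "did any inter-chunk gap exceed the budget". A pure-logic sketch over recorded arrival times (the SDK itself aborts live via AbortSignal.timeout(), not by post-hoc inspection):

```typescript
// Pure-logic view of the chunkMs check: given the arrival times of chunks,
// find the first inter-chunk gap that exceeds the budget.
function firstChunkGapViolation(
  chunkArrivalsMs: number[],
  chunkMs: number,
): number | null {
  for (let i = 1; i < chunkArrivalsMs.length; i++) {
    if (chunkArrivalsMs[i] - chunkArrivalsMs[i - 1] > chunkMs) {
      return i; // index of the chunk that arrived too late
    }
  }
  return null; // every gap was within budget
}

// Three chunks arrive quickly, then the stream hangs for three minutes:
firstChunkGapViolation([0, 500, 1_000, 181_000], 120_000); // 3
```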

prepareCall — the little-known upstream hook

Besides prepareStep (per-step), L1 also has prepareCall (per-invocation):

```ts
new ToolLoopAgent({
  model, instructions, tools,
  prepareCall: async (baseCallArgs) => {
    // baseCallArgs = all merged args (settings + method options)
    // Return overrides — or return nothing to use baseCallArgs as-is
    return {
      ...baseCallArgs,
      tools: dynamicallyDecideTools(baseCallArgs.messages),
      stopWhen: stepCountIs(deriveStepLimit(user)),
    };
  },
});
```

When to use:

  • Dynamically swap tool sets (user tier / A-B test) without rebuilding the agent instance
  • Pick a model based on the call’s input
  • Set stopWhen per call

prepareCall vs prepareStep:

| | prepareCall | prepareStep |
| --- | --- | --- |
| Layer | L1 | L1 or L2 |
| Invocation count | Once per agent.stream() | Once per step (N times per invocation) |
| Can override | All call args (tools, stopWhen, instructions, messages, model, …) | Per-step args (messages, system, model, toolChoice, activeTools) |
| Use for | Static config dynamization | Runtime context adaptation (compaction, reminders, dynamic tool sets) |

Picking between them: most projects only need prepareStep — runtime compaction, per-step reminder injection, swapping tool sets based on conversation length, these are all per-step scenarios. prepareCall makes more sense for deployments where the same agent instance is reused across requests (the agent is constructed once at module top level, and each HTTP request rewrites call args based on user tier / AB experiments). If your agent is reconstructed per request (the orchestrator layer already does config dynamization), prepareCall is redundant.
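As an illustration of the per-step scenarios above, a prepareStep-style compaction hook might trim history like this; the message shape and the compactMessages helper are hypothetical simplifications:

```typescript
// Hypothetical compaction helper for a prepareStep hook: once the history
// grows, keep the first message (the original task) plus the most recent
// tail. The Msg shape is a simplified stand-in for the SDK's message type.
type Msg = { role: "user" | "assistant" | "tool"; content: string };

function compactMessages(messages: Msg[], keepTail = 20): Msg[] {
  if (messages.length <= keepTail + 1) return messages;
  return [messages[0], ...messages.slice(-keepTail)];
}

// Hypothetical wiring inside the agent settings:
// prepareStep: ({ messages }) => ({ messages: compactMessages(messages) })
```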

experimental_* callbacks — four underrated hooks

These hooks carry the official experimental_ prefix, meaning their signatures may change; in practice they are stable in ai@^6 and are core building blocks for observable agents:

| Hook | Use for | Typical landing scenarios |
| --- | --- | --- |
| experimental_onStart | Call-level “begin” marker | Emit a UI init event, start the whole-call timer, log run start |
| experimental_onStepStart | Step-level “begin” marker | Reset step counter, clear per-step buffer, start per-step timer |
| experimental_onToolCallStart | Tool-level “begin” marker | Audit log, permission check, blocking validation |
| experimental_onToolCallFinish | Tool-level “done” marker | Result caching, metric reporting, tool-level error routing |

Difference from onStepFinish: onStepFinish is the aggregation callback at step end (full StepResult); experimental_onStepStart is the transition point at step start (no payload, pure signal). For “inter-step cleanup” use the former; for “step initialization” use the latter.
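The split can be illustrated with a per-step timer: experimental_onStepStart marks the start (pure signal, no payload), and onStepFinish closes the measurement. A minimal sketch:

```typescript
// Per-step timer: experimental_onStepStart marks the step's start (no
// payload, pure signal); onStepFinish closes the measurement at step end.
const stepStartedAt: number[] = [];
const stepDurationsMs: number[] = [];

const hooks = {
  experimental_onStepStart: () => {
    stepStartedAt.push(Date.now());
  },
  onStepFinish: () => {
    const startedAt = stepStartedAt.pop();
    if (startedAt !== undefined) {
      stepDurationsMs.push(Date.now() - startedAt);
    }
  },
};
```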

Further reading

Related SDK chapters

SDK source anchors (ai@6.0.134)

  • dist/index.js:6441-6532 — streamText entry
  • dist/index.js:6750-6810 — onStepFinish emission
  • dist/index.js:8180-8317 — ToolLoopAgent implementation
  • dist/index.js:7839-8108 — toUIMessageStream implementation

Zapvol landing reference

  • packages/backend/src/agent/agent-stream.ts — the assembly point for all-layer callbacks (L1/L2 settings, L3 toUIMessageStream params, stepUsages incremental-collection pattern)
  • packages/backend/src/agent/agent-factory.ts — how the stopConditions array is constructed
  • apps/server/src/services/task-orchestrator.ts — L3 onFinish used for assistant message persistence