Runtime Lifecycle
The full timeline of a single agent.stream() call — 12 callback firing points across three layers, a side-by-side comparison of the same-named callbacks on different layers, and the stopWhen / timeout default chains
At a glance
A single agent.stream() call involves 3 API layers, 12 callbacks, 4 message chains, and 2 pairs of same-named
callbacks. This page pins each one to the timeline.
| Key number | Value |
|---|---|
| Total callbacks | 12 (deduplicated) |
| Same-named callback pairs | 2 (onStepFinish × 2 / onFinish × 2) |
| stopWhen default (L1) | stepCountIs(20) |
| stopWhen default (L2 streamText) | stepCountIs(1) |
| Timeout tiers | 3 (totalMs / stepMs / chunkMs) |
| Pinned SDK version | ai@6.0.134 |
Three-layer capability matrix
Build the map before reading the timeline — for every parameter/callback, know which layer it belongs to:
| | L1: new ToolLoopAgent({...}) | L2: agent.stream({...}) | L3: result.toUIMessageStream({...}) |
|---|---|---|---|
| Role | Static config (define agent) | Single invocation (trigger one run) | Downstream consumer (transform result to UI stream) |
| Lifecycle | Constructed once, reused | Called once per invocation | Called once per invocation |
| Structural params | id, model, instructions, tools, experimental_context, providerOptions | messages / prompt, abortSignal, timeout | originalMessages, generateMessageId, sendReasoning, sendSources, sendStart, sendFinish |
| Behavior hooks | stopWhen, prepareStep, prepareCall | experimental_transform | — |
| Callbacks (firing order) | experimental_onStart → prepareStep → experimental_onStepStart → experimental_onToolCallStart → experimental_onToolCallFinish → onStepFinish → onFinish | Same as L1 (merged with L1 same-named callbacks, settings fires first) | messageMetadata → onStepFinish → onFinish → onError |
L1 and L2 same-named callbacks are merged: if both layers set onStepFinish, L1’s fires first, then L2’s (source:
dist/index.js:8224-8232). L3’s same-named callbacks are entirely independent — they don’t merge with L1/L2 and
have different payloads.
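The merge contract can be pinned down in a few lines. This is a sketch of the observable behavior only, not the SDK's implementation; `mergeStepCallbacks` is our own name:

```typescript
// Sketch of the L1/L2 same-named-callback merge described above: when both
// layers define onStepFinish, the L1 (settings) callback fires before the
// L2 (invocation) one. The real SDK awaits async callbacks; awaiting is
// elided here for brevity.
type StepCallback = (stepResult: unknown) => void;

function mergeStepCallbacks(l1?: StepCallback, l2?: StepCallback): StepCallback {
  return (stepResult) => {
    l1?.(stepResult); // L1: new ToolLoopAgent({ onStepFinish }) fires first
    l2?.(stepResult); // L2: agent.stream({ onStepFinish }) fires second
  };
}

// Demo: record firing order
const order: string[] = [];
mergeStepCallbacks(
  () => order.push("L1"),
  () => order.push("L2"),
)({ /* stepResult */ });
console.log(order.join(" -> ")); // "L1 -> L2"
```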
L3’s other entry point: this page’s L3 column focuses on the `result.toUIMessageStream()` transform path (passively consuming the agent’s result). L3 has another entry point — `createUIMessageStream({ execute })` — the execute-driven path, which can actively push custom events and merge multiple streams alongside the agent’s. Both share the same underlying `handleUIMessageStreamFinish` (index.js:8100/:8397), so `onStepFinish` / `onFinish` / `onError` fire at the same moments described on this page; but `messageMetadata` exists only on `toUIMessageStream`. The execute-driven path is covered in detail in UI Stream Orchestration.
Full timeline diagram
A single N-step agent.stream() call, time flows downward:
Three critical observations:

- L1/L2 callbacks and L3 callbacks run concurrently — L2 pushes chunks to `fullStream` while L3’s pipe transforms them. So L1/L2 `onStepFinish(n)` and L3 `onStepFinish(n)` happen nearly simultaneously, but as independent event-loop tasks.
- L3 `onFinish` always fires later than L1/L2 `onFinish` — L3 is a downstream transform; it must wait for fullStream close + consumer drain before flushing. For “after-run” work: use L1/L2 `onFinish` for engine-side cleanup (token tallying, sandbox close) and L3 `onFinish` for UI-side persistence (saving the assistant message).
- `messageMetadata` runs per chunk — including every `text-delta` and `tool-input-delta`. A long multi-tool response can easily emit 1000+ chunks; any synchronous I/O here directly stalls the stream.
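The per-chunk cost of messageMetadata can be kept negligible by bailing out on the high-frequency delta types first. A minimal sketch; `cheapMessageMetadata` is our own name, and the part-type strings follow the UI chunk types discussed on this page:

```typescript
// Sketch of a cheap messageMetadata callback: it runs for EVERY chunk, so
// return undefined immediately on the hot delta types and only compute
// metadata on the rare control chunks.
type UIPart = { type: string };

function cheapMessageMetadata({ part }: { part: UIPart }) {
  // Hot path: a long response can emit 1000+ deltas, so exit fast.
  if (part.type === "text-delta" || part.type === "tool-input-delta") return;
  if (part.type === "start") return { createdAt: Date.now() };
  if (part.type === "finish") return { finishedAt: Date.now() };
  // Never do synchronous I/O here: it stalls the stream chunk-by-chunk.
}

// Usage (hypothetical wiring):
// result.toUIMessageStream({ messageMetadata: cheapMessageMetadata })
console.log(cheapMessageMetadata({ part: { type: "text-delta" } })); // undefined
```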
Callback firing reference
In firing order. L1 = ToolLoopAgent settings, L2 = streamText (direct pass-through from agent.stream),
L3 = toUIMessageStream.
| # | Callback | Layer | When | Payload | Use for |
|---|---|---|---|---|---|
| 1 | prepareCall(baseCallArgs) | L1 | Before each agent.stream() starts, after params merge | Full call args, returns overrides | Dynamic model/tools/stopWhen rewrite |
| 2 | experimental_onStart() | L1+L2 | After streamText starts, before first step | None | Init logging/timing |
| 3 | prepareStep({...}) | L1+L2 | Before each step’s model call | { messages, steps, stepNumber, model } | Compaction, reminder injection, activeTools filtering, model switching |
| 4 | experimental_onStepStart() | L1+L2 | Before each step’s model stream (after prepareStep) | None | Per-step timing marker |
| 5 | experimental_onToolCallStart() | L1+L2 | Before each tool.execute | { toolCall } | Permission audit, pre-retry logic |
| 6 | experimental_onToolCallFinish() | L1+L2 | After each tool.execute | { toolCall, toolResult } | Observability, cache writeback |
| 7 | onStepFinish(stepResult) | L1+L2 | Each step end, after finish-step chunk emitted | StepResult: full step detail | Token tallying, step-level persistence |
| 8 | messageMetadata({ part }) | L3 | Each chunk passing through UI transform | { part } (current chunk) | Attach metadata to UI control chunks |
| 9 | onStepFinish | L3 | Each finish-step chunk passing through UI transform | { responseMessage, messages, isContinuation } | UI-side step-level persistence |
| 10 | onFinish({...}) | L1+L2 | After all steps done, after finish chunk emitted | { finishReason, totalUsage, steps, ... } | Engine-side settlement, cleanup |
| 11 | onFinish({...}) | L3 | After UI stream drain / cancel | { responseMessage, messages, isContinuation, isAborted, finishReason } | UI-side message persistence |
| 12 | onError(error) | L3 | UI transform error / error chunk / onStepFinish throw | Error or string | SSE error serialization; the returned string is written into the error chunk’s errorText field sent to the client |
`onFinish` throws do NOT route here: `callOnFinish` (index.js:5927-5943) is a bare `await` with no try/catch. An `onFinish` throw propagates out through TransformStream’s `flush()`, rejecting the consumer iterator — it does not invoke `onError`. Any production `onFinish` must wrap its own try/catch/finally inside the callback. The full error-capture tiering is covered in UI Stream Orchestration — Error capture, in full.
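One defensive pattern is to wrap the callback once so a throw can never reach `flush()`. A minimal sketch; `safeOnFinish` and its parameter names are our own, not SDK API:

```typescript
// Wrap an onFinish callback so its failures are contained inside the
// callback itself, instead of rejecting the stream's flush().
type OnFinish<T> = (payload: T) => void | Promise<void>;

function safeOnFinish<T>(
  inner: OnFinish<T>,
  onFailure: (err: unknown) => void = (err) => console.error("onFinish failed:", err),
): OnFinish<T> {
  return async (payload) => {
    try {
      await inner(payload);
    } catch (err) {
      onFailure(err); // swallow: never let the throw escape to flush()
    } finally {
      // release resources here (close sandbox, flush metrics, ...)
    }
  };
}

// Usage (hypothetical): agent.stream({ onFinish: safeOnFinish(async (p) => { /* settle */ }) })
```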
Same-named callbacks — the biggest trap
onStepFinish: L1/L2 vs L3
| | L1/L2 (streamText) | L3 (UI stream) |
|---|---|---|
| When | Step loop ends, after finish-step emit | finish-step chunk through UI transform |
| Payload | StepResult: { stepNumber, content, text, toolCalls, toolResults, finishReason, usage, response, request, ... } | { responseMessage: UIMessage, messages: UIMessage[], isContinuation } |
| What you see | Engine view: raw step output (tool call objects, usage breakdown) | Consumer view: accumulated UI message (assistant message structure) |
| Use for | Token tallying (billing), step-level logging, driving compaction / context trimming | Incremental UI message persistence, prefetching |
onFinish: L1/L2 vs L3
| | L1/L2 (streamText) | L3 (UI stream) |
|---|---|---|
| When | After finish chunk emit, before fullStream close | After fullStream close + UI transform flush |
| Payload | { finishReason, totalUsage, steps, content, text, reasoningText, toolCalls, toolResults, response, request, warnings, providerMetadata } | { responseMessage, messages, isContinuation, isAborted, finishReason } |
| Ordering | Earlier (upstream) | Later (downstream drain) |
| Use for | Engine-level one-shot settlement: write total usage to DB, close sandbox, commit compaction checkpoint | UI-level one-shot settlement: persist final assistant message, notify client of completion |
L1/L2 onFinish closure trap: the L1/L2 onFinish payload carries the entire steps array — every step’s full
StepResult (content, toolCalls, toolResults, request, response, all of it). If your callback closure
captures this payload and pins it on a long-lived reference (e.g. storing it on an outer session object), the entire
large object graph from this invocation is held and never GC’d. Long chat conversations amplify this — 20 steps of
cumulative StepResult easily reaches hundreds of MB.
Practical guidance:

- Short settlement logic (token counting, step logs) can live in L1/L2 `onFinish` — the closure releases right after the callback returns.
- Long settlement logic (persistence, background jobs, checkpoint writes) should prefer L3 `onFinish`; its payload is the folded `responseMessage` + `messages`, orders of magnitude smaller.
- Or: use L1/L2 `onStepFinish` to incrementally collect only what you need into a small variable (numbers / strings / ids only, never the `StepResult` itself), then settle on that small variable in L3 `onFinish`.
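The incremental-collection pattern looks like this in practice. A sketch with simplified stand-in types (the real StepResult is far larger, which is exactly the point):

```typescript
// L1/L2 onStepFinish folds each step down to a few scalars, so the large
// StepResult objects stay GC-able; L3 onFinish then settles on the tiny
// summary. Type shapes below are simplified stand-ins, not the SDK's.
type StepUsage = { inputTokens: number; outputTokens: number };
type MiniStepResult = { usage: StepUsage; toolCalls: { toolName: string }[] };

const summary = { steps: 0, inputTokens: 0, outputTokens: 0, toolNames: [] as string[] };

// Wire as L1/L2 onStepFinish: keep only numbers and short strings.
function collectStep(step: MiniStepResult) {
  summary.steps += 1;
  summary.inputTokens += step.usage.inputTokens;
  summary.outputTokens += step.usage.outputTokens;
  for (const c of step.toolCalls) summary.toolNames.push(c.toolName);
  // Deliberately do NOT store `step` itself: that would pin the whole graph.
}

// Wire as L3 onFinish: settle using only the small summary.
function settle(): string {
  return `steps=${summary.steps} tokens=${summary.inputTokens + summary.outputTokens}`;
}
```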
stopWhen default chain — the second trap
Same parameter name, different defaults at L1 and L2:
- L1 `new ToolLoopAgent({ stopWhen? })` default: `stepCountIs(20)` ← dist/index.js:8210
- L2 `streamText({ stopWhen? })` default: `stepCountIs(1)` ← dist/index.js:6452
ToolLoopAgent.stream() forwards L1’s stopWhen (default 20) to streamText, so the normal path caps the agent at
20 steps.
But if you call streamText(...) directly without setting stopWhen, your agent runs for exactly one step — one
tool call and it halts. Classic beginner trap.
Built-in stop conditions (combinable: stopWhen: [stepCountIs(N), hasToolCall('complete')]):
| Factory | Semantics |
|---|---|
| stepCountIs(N) | Stop after reaching step N |
| hasToolCall(name) | Stop after a tool call with the given name |
Typical production combo: [stepCountIs(N), hasToolCall('complete')] — the number is a hard ceiling (prevents
the agent from spinning in a loop), hasToolCall('complete') is the “task finished” signal (lets the agent declare
its own end). Pick N based on task complexity: simple Q&A 10-20, general assistant 30-50, deep research / multi-step
editing 50-100.
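The OR semantics of a stop-condition array can be sketched with simplified stand-ins for the two factories (same semantics as described above, not the SDK's code):

```typescript
// Stand-in stop-condition factories: an array stops the loop when ANY
// condition matches. StepInfo is a simplified shape for illustration.
type StepInfo = { stepNumber: number; toolCalls: { toolName: string }[] };
type StopCondition = (step: StepInfo) => boolean;

const stepCountIs = (n: number): StopCondition =>
  (step) => step.stepNumber >= n; // hard ceiling

const hasToolCall = (name: string): StopCondition =>
  (step) => step.toolCalls.some((c) => c.toolName === name); // "done" signal

function shouldStop(conditions: StopCondition[], step: StepInfo): boolean {
  return conditions.some((cond) => cond(step)); // OR semantics
}

// The typical production combo from above:
const stopWhen = [stepCountIs(20), hasToolCall("complete")];
console.log(shouldStop(stopWhen, { stepNumber: 3, toolCalls: [{ toolName: "complete" }] })); // true
```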
Three-tier timeout — the third trap
timeout is an object with three granularities:
```ts
agent.stream({
  timeout: {
    totalMs: 600_000, // Entire invocation: 10 min
    stepMs: 300_000,  // Single step: 5 min (model + tools combined)
    chunkMs: 120_000, // Gap between two chunks: 2 min
  },
});
```
All three are independent. Any one tripping aborts via AbortSignal.timeout() (dist/index.js:6483-6495).
| Dimension | Watches for | Typical scenario |
|---|---|---|
| totalMs | Absolute call duration | Long-task total ceiling |
| stepMs | Single step from prepareStep to finish-step | Model response stalls |
| chunkMs | Gap between two adjacent chunks | Mid-stream hang (TCP half-open, slow provider thinking phase) |
Tutorials usually only mention totalMs — but chunkMs is the lifesaver in production. A model stream can emit
two lines then hang (TCP half-open from the provider, a thinking phase taking too long, etc.); totalMs is nowhere
near, but chunkMs terminates it immediately. A solid default in production is chunkMs: 120_000 (2 minutes) —
enough to cover long reasoning-model thinking gaps, but not so long that a real disconnect quietly hangs forever.
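What a chunkMs watchdog amounts to can be sketched as a stream wrapper that restarts a timer on every chunk. The SDK wires this internally via AbortSignal; the sketch below uses a plain timer and is our own illustration, not the SDK's code:

```typescript
// Wrap an async chunk stream so that a gap between two chunks longer than
// chunkMs throws, even while totalMs is nowhere near tripping.
async function* withChunkTimeout<T>(
  source: AsyncIterable<T>,
  chunkMs: number,
): AsyncGenerator<T> {
  const it = source[Symbol.asyncIterator]();
  while (true) {
    let timer!: ReturnType<typeof setTimeout>;
    const timeout = new Promise<never>((_, reject) => {
      timer = setTimeout(() => reject(new Error(`chunk gap > ${chunkMs}ms`)), chunkMs);
    });
    try {
      // Race the next chunk against the gap budget.
      const result = await Promise.race([it.next(), timeout]);
      if (result.done) return;
      yield result.value; // chunk arrived in time; the timer resets next loop
    } finally {
      clearTimeout(timer);
    }
  }
}
```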
prepareCall — the little-known upstream hook
Besides prepareStep (per-step), L1 also has prepareCall (per-invocation):
```ts
new ToolLoopAgent({
  model, instructions, tools,
  prepareCall: async (baseCallArgs) => {
    // baseCallArgs = all merged args (settings + method options)
    // Return overrides — or return nothing to use baseCallArgs as-is
    return {
      ...baseCallArgs,
      tools: dynamicallyDecideTools(baseCallArgs.messages),
      stopWhen: stepCountIs(deriveStepLimit(user)),
    };
  },
});
```
When to use:
- Dynamically swap tool sets (user tier / A-B test) without rebuilding the agent instance
- Pick a model based on the call’s input
- Set stopWhen per call
prepareCall vs prepareStep:
| | prepareCall | prepareStep |
|---|---|---|
| Layer | L1 | L1 or L2 |
| Invocation count | Once per agent.stream() | Once per step (N times per invocation) |
| Can override | All call args (tools, stopWhen, instructions, messages, model, …) | Per-step args (messages, system, model, toolChoice, activeTools) |
| Use for | Static config dynamization | Runtime context adaptation (compaction, reminders, dynamic tool sets) |
Picking between them: most projects only need prepareStep — runtime compaction, per-step reminder injection,
swapping tool sets based on conversation length, these are all per-step scenarios. prepareCall makes more sense for
deployments where the same agent instance is reused across requests (the agent is constructed once at module top
level, and each HTTP request rewrites call args based on user tier / AB experiments). If your agent is reconstructed
per request (the orchestrator layer already does config dynamization), prepareCall is redundant.
experimental_* callbacks — four underrated hooks
Officially prefixed experimental_, meaning the signatures might change; but in ai@^6 they’re stable and are core
building blocks for observable agents:
| Hook | Use for | Typical landing scenarios |
|---|---|---|
| experimental_onStart | Call-level “begin” marker | Emit a UI init event, start the whole-call timer, log run start |
| experimental_onStepStart | Step-level “begin” marker | Reset step counter, clear per-step buffer, start per-step timer |
| experimental_onToolCallStart | Tool-level “begin” marker | Audit log, permission check, blocking validation |
| experimental_onToolCallFinish | Tool-level “done” marker | Result caching, metric reporting, tool-level error routing |
Difference from onStepFinish: onStepFinish is the aggregation callback at step end (full StepResult);
experimental_onStepStart is the transition point at step start (no payload, pure signal). For “inter-step
cleanup” use the former; for “step initialization” use the latter.
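A typical pairing of the two step hooks is per-step wall-clock timing. A sketch; the StepTimer class is our own, and the ToolLoopAgent wiring is shown only as a comment:

```typescript
// Pair experimental_onStepStart (pure signal, no payload) with onStepFinish
// to measure per-step wall-clock duration.
class StepTimer {
  private startedAt = 0;
  readonly durationsMs: number[] = [];

  onStepStart() {
    // wire as experimental_onStepStart
    this.startedAt = Date.now();
  }
  onStepFinish() {
    // wire as onStepFinish (the StepResult payload is ignored here)
    this.durationsMs.push(Date.now() - this.startedAt);
  }
}

// Hypothetical wiring:
// const timer = new StepTimer();
// new ToolLoopAgent({
//   model, instructions, tools,
//   experimental_onStepStart: () => timer.onStepStart(),
//   onStepFinish: () => timer.onStepFinish(),
// });
```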
Further reading
Related SDK chapters
- Message Reference Model — the initialMessages / responseMessages reference semantics referenced throughout
- prepareStep Semantics — deep dive on the most-used hook
- UI Stream Orchestration — the L3 `createUIMessageStream` execute-driven path and the full error-capture tiering
SDK source anchors (ai@6.0.134)
- `dist/index.js:6441-6532` — `streamText` entry
- `dist/index.js:6750-6810` — `onStepFinish` emission
- `dist/index.js:8180-8317` — `ToolLoopAgent` implementation
- `dist/index.js:7839-8108` — `toUIMessageStream` implementation
Zapvol landing reference
- `packages/backend/src/agent/agent-stream.ts` — the assembly point for all-layer callbacks (L1/L2 settings, L3 `toUIMessageStream` params, `stepUsages` incremental-collection pattern)
- `packages/backend/src/agent/agent-factory.ts` — how the `stopConditions` array is constructed
- `apps/server/src/services/task-orchestrator.ts` — L3 `onFinish` used for assistant message persistence