AI SDK Internals
Why Vercel AI SDK internals deserve their own chapter — three-layer API, version pinning, reading order
Why this chapter exists
Zapvol’s agent engine is built on the Vercel AI SDK. On the surface we only call three APIs: `new ToolLoopAgent(...)`, `agent.stream(...)`, and `result.toUIMessageStream(...)`. But the details that actually determine whether an agent behaves correctly are buried in these three layers’ internal state models and callback ordering.
The official docs alone don’t cut it. Four recurring questions in the community all trace back to missing this foundation:
- “I modified `messages` in `prepareStep`, why does the change persist on the next step?” (shared object references, not deep copies)
- “Why did my `onStepFinish`/`onFinish` fire twice?” (there are actually two same-named callbacks — one at the `streamText` layer, one at the UI stream layer)
- “I didn’t set `stopWhen`, why does my agent stop after one step?” (`ToolLoopAgent` and `streamText` have different defaults — the fallback chain has a trap)
- “Why does `messageMetadata` run so often in my UI layer?” (it’s invoked for every chunk)
None of these are bugs — they’re the SDK’s design semantics. Until you understand them, you’re debugging by trial and error. Once you do, each one becomes an engineering lever you can exploit.
The goal of this chapter: explain these mechanisms once, thoroughly, so you stop guessing when writing agents.
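As a taste of what “engineering lever” means, consider the third question. The stop-after-one-step behavior is just a default fallback chain. Below is a toy sketch in plain TypeScript — no SDK imports; the `stepCountIs` helper mirrors the SDK name, but the resolver function and the concrete default shown here are illustrative assumptions, not the SDK’s actual source:

```typescript
// Hypothetical sketch of a stopWhen fallback chain (illustrative only,
// not the SDK's real resolution code).
type StopCondition = (ctx: { stepCount: number }) => boolean;

// Mirrors the SDK helper of the same name: stop once n steps have run.
const stepCountIs = (n: number): StopCondition =>
  ({ stepCount }) => stepCount >= n;

function resolveStopWhen(
  callLevel?: StopCondition,  // e.g. agent.stream({ stopWhen })
  agentLevel?: StopCondition, // e.g. new ToolLoopAgent({ stopWhen })
): StopCondition {
  // First defined value wins; if neither layer sets stopWhen, the assumed
  // default ends the loop after step 1 -- the trap from question 3.
  return callLevel ?? agentLevel ?? stepCountIs(1);
}

// No stopWhen anywhere: the loop is told to stop after the first step.
const defaulted = resolveStopWhen();
console.log(defaulted({ stepCount: 1 })); // true

// Agent-level stopWhen keeps the loop alive past step 1.
const agentScoped = resolveStopWhen(undefined, stepCountIs(5));
console.log(agentScoped({ stepCount: 1 })); // false
```

Once you see the chain as data, the fix is mechanical: set `stopWhen` at whichever layer should own the policy, and the more specific layer overrides it per call.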
Version pinning
This chapter is based on `ai@6.0.134` (Zapvol’s pinned version — see `packages/backend/package.json`). Every source reference, line number, and callback signature maps to this version. A v7 migration page will be added when the upstream roadmap changes.
```
ai@^6
├── streamText / generateText                  ← low-level execution
├── ToolLoopAgent                              ← agent-style wrapper over streamText
└── createUIMessageStream / toUIMessageStream  ← downstream UI consumption
```
Three-layer API map
Every “why does it work this way” explanation comes back to this map. Locating which layer a given parameter or callback belongs to answers half of all questions.
| Layer | API | Responsibility | Lifecycle |
|---|---|---|---|
| L1 | new ToolLoopAgent({ ... }) | Static agent config: model, instructions, tools, stopWhen, prepareStep, 6 callbacks | Constructed once, reused across stream() / generate() calls |
| L2 | agent.stream({ ... }) | Per-invocation params: messages, abortSignal, timeout — and overrides of the L1 same-named callbacks | Called once per invocation; forwards to streamText |
| L3 | result.toUIMessageStream({ ... }) | Transforms fullStream into a UI message stream (SSE event sequence); has its own independent messageMetadata / onFinish / onError | Called once per invocation; downstream transform, runs concurrently with L1/L2 callbacks |
Critical: L3’s onFinish and L1/L2’s onFinish are not the same thing — different firing time, different
payload, different purpose. Confusing them directly causes token-counting drift and misordered message persistence.
Full comparison in Lifecycle.
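To make the two-callback point concrete, here is a toy sketch in plain TypeScript. Nothing here is SDK code: the function names echo the SDK’s shape, but the payload types are simplified placeholders, and the real firing order is stream-driven (covered in the Lifecycle page) rather than the sequential calls shown below.

```typescript
// Toy sketch: two distinct onFinish callbacks, one per layer, with
// different payloads. Types are illustrative placeholders, not the
// SDK's real signatures.
type ExecFinishPayload = { steps: number; totalTokens: number }; // L1/L2 side
type UIFinishPayload = { uiMessages: string[] };                 // L3 side

function fakeStreamText(opts: { onFinish: (p: ExecFinishPayload) => void }) {
  // ...the step loop would run here...
  // The execution-layer onFinish reports step count and token usage.
  opts.onFinish({ steps: 2, totalTokens: 42 });
  return {
    toUIMessageStream(ui: { onFinish: (p: UIFinishPayload) => void }) {
      // The UI-layer onFinish reports the final assembled UI messages.
      ui.onFinish({ uiMessages: ["assistant reply"] });
      return ["text-delta", "finish"]; // stand-in for the SSE event sequence
    },
  };
}

const fired: string[] = [];
const result = fakeStreamText({
  onFinish: (p) => fired.push(`exec onFinish: ${p.steps} steps, ${p.totalTokens} tokens`),
});
result.toUIMessageStream({
  onFinish: (p) => fired.push(`ui onFinish: ${p.uiMessages.length} messages`),
});
console.log(fired.length); // 2 -- two callbacks, two layers, two payloads
```

The practical rule falls out of the payload types: count tokens in the execution-layer callback, persist UI messages in the UI-layer one, and never assume one can substitute for the other.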
How to read the four pages
Read skeleton-first, then specifics. First nail down the timeline of a single agent.stream() call (three layers
× 12 callbacks); then drill down into L3’s two UI-stream paths, the message references inside the step loop, and the
deepest per-step hook. Each page builds on the skeleton from the previous:
| Order | Page | Which runtime layer |
|---|---|---|
| 1 | Runtime Lifecycle | The skeleton — 12 callbacks across the three API layers (L1 ToolLoopAgent / L2 streamText / L3 toUIMessageStream) on a single timeline; two-layer same-named callbacks; stopWhen / timeout defaults |
| 2 | UI Stream Orchestration | L3 deep-dive — Lifecycle’s L3 only covers the transform path (result.toUIMessageStream()); this page adds createUIMessageStream({ execute })’s execute-driven path, the three writer methods, the custom data-* event protocol, transient semantics, and the full error-capture table |
| 3 | Message Reference Model | Message mechanics inside the step loop — the four message chains (initialMessages / responseMessages / stepInputMessages / prepareStepResult.messages) and their reference semantics |
| 4 | prepareStep Semantics | The deepest per-step hook — overridable fields, typical patterns, the mutate-vs-push trap |
After these four, you should be able to answer the four opening questions. If not, come back and locate the answer on the map.
Not in scope
- Not an AI SDK tutorial — readers are assumed to know the basics of `streamText`, `generateText`, and `ToolLoopAgent`. For basics, see the official docs.
- Not covering UI framework bindings (`@ai-sdk/react`, `useChat`, `useCompletion`) — Zapvol uses custom React Query hooks to consume SSE, not `useChat`.
- Not covering providers (`@ai-sdk/anthropic` etc.) — provider choice is a Zapvol-layer decision; see Agent Engine.
- Not an SDK contribution guide — only discusses the internals an application-layer consumer needs to know.
Why this isn’t in the official docs
Vercel’s docs cover “how to use”; this chapter covers “why this way works”. A precise division:
| Official docs | This chapter |
|---|---|
| API signatures, parameter lists, code examples | Callback firing order, reference semantics, default fallback chains, counter-intuitive traps |
| Teaches you to write your first agent | Teaches you to debug your third failing agent |
| Organized by topic (generating text / tools / streaming) | Organized by actual execution order (three layers × timeline) |
Further reading
- Zapvol Agent Engine — how Zapvol composes AI SDK into its own agent loop
- Context Compaction — three-tier compression built on `prepareStep`
- Tool Search — dynamic `activeTools` filtering via `prepareStep`
- Vercel AI SDK docs — external reference