Message Reference Model

The four message chains and their reference relationships — why mutate is permanent pollution, push is truly ephemeral, and replacing arrays hinges on element identity

Why this page is second

The vast majority of “I modified messages but it snapped back” or “I thought the change was temporary but it permanently polluted the conversation” issues trace back to the four message chains the AI SDK maintains internally and the shallow reference relationships between them — which this page covers.

After reading, you can answer:

  • “I .push’d a message in prepareStep, will it still be there next step?”
  • “Where does stepInputMessages come from? Is it rebuilt each step?”
  • “Is it safe to modify responseMessages?”
  • “Where does prepareStepResult.messages get written back to after the step?”

The four message chains

The anchors below point to the streamText path in ai@6.0.134 (dist/index.js:7030-7630). The generateText path has identical structure but different line numbers (corresponding section at 4210-4640); the semantics quoted below apply to both paths.

1. initialMessages — the input, unchanged for the whole invocation

const initialMessages = initialPrompt.messages;   // dist:7036

Origin: the array passed to agent.stream({ messages }), stored after standardizePrompt normalization.

Lifetime: unchanged for the entire agent.stream() invocation. The SDK does not push new elements into it.

Identity: the array reference is fixed; the inner message object references are also fixed — this is the root of all the traps that follow.
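The fixed-identity property can be seen with a few lines of plain TypeScript (a sketch of the semantics, not SDK code — `initialMessages` here stands in for the SDK's internal variable):

```typescript
// Sketch of the fixed-identity property (plain TypeScript, not SDK code).
type Message = { role: string; content: string };

const input: Message[] = [{ role: 'user', content: 'hello' }];

// What the SDK stores after standardizePrompt normalization is the same
// array — it is never replaced and never pushed to during the invocation.
const initialMessages = input;

// A reference captured before streaming therefore stays valid (and live)
// for every step: the SDK spreads these exact objects into each step input.
const capturedFirst = initialMessages[0];
console.log(capturedFirst === input[0]); // true for the whole invocation
```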

2. responseMessages — accumulated during the loop, assistant / tool only

// Initialized as empty
const initialResponseMessages = [];                                           // dist:7037
// ... at step end, before the next step begins ...
responseMessages.push(...await toResponseMessages({ content, tools }));       // dist:7623

Origin: the SDK converts each step’s content to role: "assistant" + role: "tool" messages via the toResponseMessages(...) helper, then pushes them to the accumulator.

Critical constraint: toResponseMessages only produces role: "assistant" and role: "tool" messages — the SDK never pushes role: "user" into responseMessages. This is why a user message pushed from prepareStep can "auto-disappear" by the next step.
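A simplified stand-in makes the constraint concrete (illustrative only — the real toResponseMessages converts step *content* into messages rather than filtering an array, but the role constraint it enforces is the same):

```typescript
// Illustrative stand-in for the end-of-step accumulation (not the real
// toResponseMessages implementation).
type Msg = { role: 'user' | 'assistant' | 'tool'; content: string };

const responseMessages: Msg[] = [];

function endOfStep(stepOutput: Msg[]): void {
  // Only assistant/tool messages are ever produced here — a user message
  // injected via prepareStep is simply not part of the step's output.
  responseMessages.push(
    ...stepOutput.filter((m) => m.role === 'assistant' || m.role === 'tool'),
  );
}

endOfStep([
  { role: 'assistant', content: 'calling tool' },
  { role: 'tool', content: 'tool result' },
]);

// The next step rebuilds [...initialMessages, ...responseMessages], so an
// injected user message is gone: it lives in neither chain.
console.log(responseMessages.map((m) => m.role)); // ['assistant', 'tool']
```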

3. stepInputMessages — rebuilt each step, shallow spread

const stepInputMessages = [...initialMessages, ...responseMessages];   // dist:7186

Origin: at the start of each step loop iteration, the SDK rebuilds this array and passes it as the messages param to prepareStep.

Three critical properties:

  1. Rebuilt per step — the [...a, ...b] spread creates a new array, so the outer identity changes each step.
  2. Shallow references — elements (message objects) are not deep-copied; they’re still the same objects at initialMessages[i] and responseMessages[j].
  3. Mutating element fields = mutating the original chain — any mutation of stepInputMessages[k].content directly affects the object pointed to by initialMessages[k].
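All three properties can be demonstrated with the spread operator alone (plain TypeScript, no SDK involved):

```typescript
// The spread rebuild gives a new outer array whose elements are the same
// objects as in the source chains.
type Msg = { role: string; content: string };

const initialMessages: Msg[] = [{ role: 'user', content: 'hello' }];
const responseMessages: Msg[] = [];

const stepInputMessages = [...initialMessages, ...responseMessages];

console.log(stepInputMessages === initialMessages);       // false — new outer array
console.log(stepInputMessages[0] === initialMessages[0]); // true — shared element

// Property 3: mutating an element's fields bleeds into the original chain.
stepInputMessages[0].content += ' INJECTED';
console.log(initialMessages[0].content); // 'hello INJECTED' — polluted
```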

4. prepareStepResult.messages — valid for this step only

const prepareStepResult = await prepareStep?.({ ..., messages: stepInputMessages });  // dist:7187
// ...
const stepMessages = prepareStepResult?.messages ?? stepInputMessages;                // dist:7216 — used only for this step's model call

Origin: returned from your prepareStep function as { messages }.

Critical properties:

  • Used only for this step’s model call (input to convertToLanguageModelPrompt).
  • Not written back to initialMessages or responseMessages after the call.
  • Next step’s stepInputMessages rebuilds fresh from [...initialMessages, ...responseMessages] — any additions or deletions in your returned array vanish.
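A minimal two-step simulation shows this lifetime (a sketch — the variable names mirror the SDK's chains, but this is not SDK code):

```typescript
// Two-step simulation of the per-step lifetime of prepareStepResult.messages.
type Msg = { role: string; content: string };

const initialMessages: Msg[] = [{ role: 'user', content: 'hello' }];
const responseMessages: Msg[] = [];

// Step n: prepareStep returns a new array with an ephemeral instruction.
const stepInput = [...initialMessages, ...responseMessages];
const prepared = [...stepInput, { role: 'user', content: 'this step only' }];
console.log(prepared.length); // 2 — the model sees the injection this step

// The SDK uses `prepared` only for this step's model call, then drops it.
// Step n+1: the rebuild starts from the two chains again.
const nextStepInput = [...initialMessages, ...responseMessages];
console.log(nextStepInput.length); // 1 — the injection has vanished
```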

Reference relationship diagram

┌──────────────────────────────────────────────────────────────────┐
│                 Single agent.stream() invocation                 │
│                                                                  │
│  initialMessages (unchanged)     responseMessages (accumulates)  │
│  ┌─────────────────────┐         ┌───────────────────────────┐   │
│  │ User msg 1          │         │ (step1) assistant msg     │   │
│  │ User msg 2          │         │ (step1) tool result       │   │
│  │ Assistant msg (old) │         │ (step2) assistant msg     │   │
│  │ ...                 │         │ (step2) tool result       │   │
│  └──────────┬──────────┘         └─────────────┬─────────────┘   │
│             │                                  │                 │
│             │ shallow spread (shared refs)     │                 │
│             ↓                                  ↓                 │
│  ┌────────────────────────────────────────────────────┐          │
│  │  stepInputMessages (new array each step,           │          │
│  │                     elements are shared refs)      │          │
│  │  = [...initialMessages, ...responseMessages]       │          │
│  └─────────────────────────┬──────────────────────────┘          │
│                            │                                     │
│                            │  passed as the `messages` param     │
│                            ↓                                     │
│                   ┌─────────────────────┐                        │
│                   │  prepareStep({...}) │                        │
│                   └──────────┬──────────┘                        │
│                              │                                   │
│                              │  may return { messages: ... }     │
│                              ↓                                   │
│           ┌─────────────────────────────────────────┐            │
│           │  prepareStepResult.messages             │            │
│           │  (valid for this step only,             │            │
│           │   input to convertToLanguageModelPrompt;│            │
│           │   NOT written back to initialMessages   │            │
│           │   or responseMessages)                  │            │
│           └─────────────────────────────────────────┘            │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘

Core observation: both stepInputMessages and prepareStepResult.messages are “new array per step”, but where the element references come from determines whether your modifications persist.

Four modification modes — lifetime comparison

1. Mutate an existing message object's fields
   (msg.content += 'x' or msg.content.push(part))
     • Visible this step: yes
     • Still in next step's stepInputMessages: yes — permanently polluted
     • Why: shared object reference — the mutation bleeds into initialMessages[k]
     • Verdict: almost always a bug

2. Replace the entire messages array
   (return { messages: [...otherArray] })
     • Visible this step: yes
     • Still in next step's stepInputMessages: depends on element origin — elements taken from initialMessages / responseMessages are still shared; newly-constructed objects are not
     • Why: the new array itself is discarded after the step, but any mutated objects inside it remain shared
     • Verdict: context-dependent

3. Push a new message object into the returned array
   (prep.push({ role: "user", content }))
     • Visible this step: yes
     • Still in next step's stepInputMessages: no — gone next step
     • Why: the new object never entered initialMessages and won't enter responseMessages (which only accepts assistant/tool)
     • Verdict: the canonical truly-ephemeral injection

4. Delete an existing message
   (return { messages: prep.filter(...) })
     • Visible this step: yes (the model doesn't see the deleted message)
     • Still in next step's stepInputMessages: yes — the original chain is intact
     • Why: the returned array is never written back, so the next stepInputMessages rebuild still contains the message
     • Verdict: misleading — it hides for one step, it doesn't delete
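The two extreme rows — a mutation persisting versus a pushed object vanishing — can be reproduced in a few lines (plain TypeScript sketch of the rebuild semantics, not SDK code):

```typescript
// Mutation persists across the rebuild; a pushed new object does not.
type Msg = { role: string; content: string };

const initialMessages: Msg[] = [{ role: 'user', content: 'hi' }];
const responseMessages: Msg[] = [];

// Step 1 — prepareStep receives the fresh shallow spread.
const stepInput = [...initialMessages, ...responseMessages];
stepInput[0].content += ' MUTATED'; // mode 1: mutate a field (pollutes)
const returned = [...stepInput, { role: 'user', content: 'ephemeral' }]; // mode 3: push
console.log(returned.length); // 2 — both changes visible this step

// `returned` feeds only this step's model call; it is never written back.
// Step 2 — the SDK rebuilds from the chains.
const nextInput = [...initialMessages, ...responseMessages];
console.log(nextInput[0].content); // 'hi MUTATED' — the mutation leaked
console.log(nextInput.length);     // 1 — the pushed message vanished
```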

Distilled rules:

  • Want “ephemeral for this step”: return a new array or push a new object.
  • Want “permanently modify the conversation”: don’t do it in prepareStep — modify the messages input before calling agent.stream().
  • Never mutate the fields of existing message objects.

Why shallow references, not deep copies

The AI SDK’s choice is deliberate:

  • Performance: long conversations (100+ messages, huge content) can’t afford a deep clone every step.
  • Consistency: all upstream callers (engine layer, business layer) see the same message objects the SDK sees internally — convenient for persistence and observability.
  • Clear contract: the docs state “don’t mutate messages” in prepareStep (officially warned, though not in a prominent place).

The cost is that developers must manually respect immutability — the SDK won’t enforce it. Break it, pay for it.

Practical rules

prepareStep: ({ messages }) => {
  // WRONG — never mutate; these would permanently pollute the chains:
  // messages[messages.length - 1].content += 'injection';
  // messages[0].content.push({ type: 'text', text: '...' });

  // OK — true ephemeral injection
  return {
    messages: [
      ...messages,
      { role: 'user', content: 'injection (this step only)' },
    ],
  };

  // OK — filter for this step (new array, no object mutation)
  return {
    messages: messages.filter(m => m.role !== 'tool'),
  };

  // OK — replace one message for this step (new object)
  return {
    messages: messages.map((m, i) =>
      i === messages.length - 1
        ? { ...m, content: reformat(m.content) }
        : m
    ),
  };
};

An often-overlooked detail: when responseMessages actually gets written

In the streamText path: the responseMessages.push(...) fires when one step ends, before the next step begins (dist:7623); onStepFinish fires earlier, via a downstream transform (event processor at dist:6776-6810). So:

  • prepareStep(n+1) sees stepInputMessages already containing the assistant/tool messages generated in step n
  • onStepFinish(n)’s payload stepResult.response.messages is [...recordedResponseMessages, ...stepMessages] — a new array, but elements are live references (dist:6794)
  • Both share the same object references — mutating these message objects inside streamText’s onStepFinish pollutes the next step’s stepInputMessages

In the generateText path: stepResult.response.messages is structuredClone(responseMessages) — a deep clone (dist:4602). Mutations do not propagate; it is safe.

Practical rule: if you don’t know which path your code runs on, assume streamText and follow “don’t mutate messages”. The cost of this conservative default is one extra shallow copy; the benefit is “the code doesn’t blow up when migrating from generateText to streamText”.
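The difference between the two payload shapes comes down to shallow spread versus structuredClone, which you can verify directly (a sketch, not SDK code; structuredClone requires Node 17+):

```typescript
// Shallow spread shares element references; structuredClone does not.
// The array stands in for the SDK's internal responseMessages chain.
const responseMessages = [{ role: 'assistant', content: 'answer' }];

const deep = structuredClone(responseMessages); // generateText-style payload
const shallow = [...responseMessages];          // streamText-style payload

shallow[0].content = 'POLLUTED';          // shared element: propagates
console.log(responseMessages[0].content); // 'POLLUTED'

deep[0].content = 'ISOLATED';             // deep copy: does not propagate
console.log(responseMessages[0].content); // still 'POLLUTED'
```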

Further reading

  • prepareStep Semantics — builds on this page’s reference model to unpack all typical prepareStep patterns and traps
  • Runtime Lifecycle — the full runtime picture of how message chains evolve over time
  • SDK source anchors (ai@6.0.134):
    • streamText path: dist/index.js:7030-7630 (step loop + prepareStep call site); 7623 (the responseMessages push that feeds back into the next step’s stepInputMessages)
    • generateText path: dist/index.js:4210-4640 (same structure, different lines)
    • Downstream transform / stepResult assembly: dist/index.js:6649-6810 (recordedResponseMessages, onStepFinish emission)