prepareStep Semantics

Full signature of the per-step hook, 7 overridable fields, 4 typical patterns, and a deep analysis of the mutate-vs-push trap

Why prepareStep is the most critical hook

Almost all context engineering at agent runtime happens inside prepareStep:

  • Compacting long contexts (trigger summary / tiered trimming once token threshold is hit)
  • Injecting system-reminders (ephemeral constraints, not in the conversation history)
  • Dynamically filtering tool visibility (activeTools filtering — show the model only a small candidate subset each step)
  • Forcing tool usage (toolChoice: "required" to prevent premature exit)
  • Dynamically switching models (cheap model for planning, strong model for execution)
  • Applying cache control breakpoints (Anthropic prompt cache)

It’s also the easiest hook to misuse: get the reference semantics wrong and you pollute the whole conversation. This page systematically covers the full API signature and common patterns, closing with a breakdown of a real mutate-vs-push trap.

Prerequisite: read Message Reference Model first. This page assumes you know the difference between stepInputMessages and prepareStepResult.messages.

Full signature

type PrepareStepCallback = (options: {
  model: LanguageModel;             // Model for this step (overridable per invocation)
  steps: StepResult[];              // Completed historical steps (doesn't include current)
  stepNumber: number;               // Current step index (0-based)
  messages: ModelMessage[];         // stepInputMessages — see Message Reference Model
  experimental_context: unknown;    // Context passed through layers (business-layer-defined shape)
}) => Promise<{
  model?: LanguageModel;            // Switch model for this step
  system?: string;                  // Override system prompt for this step
  messages?: ModelMessage[];        // Messages sent to model for this step (only)
  toolChoice?: ToolChoice;          // Tool selection strategy for this step
  activeTools?: string[];           // Tool visibility list for this step
  providerOptions?: ProviderOptions;// Per-step provider options (merged with L1/L2's providerOptions)
  experimental_context?: unknown;   // Override context for this step (rare)
} | undefined>;

Return undefined (or return nothing): the SDK uses defaults for this step (stepInputMessages, settings-level model, activeTools, toolChoice, etc.).

Return partial fields: the SDK overrides only those fields; others keep their defaults (streamText path dist/index.js:7195-7220; generateText equivalent at 4322-4340).

7 overridable fields

| Field | Overrides | Typical use | Scope of override |
| --- | --- | --- | --- |
| model | This step’s LLM | Light/heavy split, fallback | This step only |
| system | This step’s system prompt | A/B prompts, context-based persona | This step only |
| messages | Messages sent to the model | Compaction, reminder injection, filtering | This step only |
| toolChoice | Tool selection policy | "required" to force tool call, "none" to disable, { type: "tool", toolName } to force specific | This step only |
| activeTools | Visible tools for this step | Tool search pool narrowing, hiding dangerous tools by context | This step only |
| providerOptions | Provider-specific options (merged with L1/L2’s providerOptions, not replaced; dist:7219) | Dynamically shift Anthropic cache_control breakpoints, adjust reasoning / thinking budget per step, swap OpenAI seed per step | This step only |
| experimental_context | Context seen by downstream tool.execute | Per-step state passing (rare; usually set once at L1) | This step only |
| (no tools) | Tool set is fixed at L1/L2; prepareStep can filter but not add | | |

Critical constraint: prepareStep cannot add new tools — only filter from the fixed ToolSet at L1/L2 via activeTools. If you need “per-step dynamic tool set”, either register everything at L1 and filter with activeTools, or use prepareCall to swap the entire tools (see Lifecycle → prepareCall).
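That screening behaves like a simple name intersection; a minimal sketch of the constraint (the tool names and the helper are illustrative — the SDK does the equivalent filtering internally in prepareToolsAndToolChoice):

```typescript
// Sketch of the constraint: activeTools can only select names that
// already exist in the L1 ToolSet; unknown names select nothing.
const registeredTools = ["read_file", "grep", "tool_search", "write_file"]; // the fixed L1 set

function visibleTools(activeTools: string[] | undefined): string[] {
  if (!activeTools) return registeredTools;                    // no filter: full set visible
  return registeredTools.filter(name => activeTools.includes(name));
}
```

A name the model has never been registered with (say, a brand-new tool returned only from prepareStep) is silently dropped by this intersection, which is why "add a tool per step" requires prepareCall instead.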

4 typical patterns

Pattern 1: Reminder injection (ephemeral context)

You want to give the model a temporary “system hint” every step (e.g. “no emojis”, “prefer internal tools”) without polluting the conversation history:

prepareStep: async ({ messages, experimental_context }) => {
  const ctx = experimental_context as MyContext;
  if (ctx.reminders.length === 0) return undefined;

  const remindersText = ctx.reminders
    .map(r => `<system-reminder>\n${r}\n</system-reminder>`)
    .join('\n');

  // OK — append a new message; visible this step, gone next step
  return {
    messages: [
      ...messages,
      { role: 'user', content: `[system directive — not user input]\n${remindersText}` },
    ],
  };
},

Why this works: see Message Reference Model row 3 — push is the only “auto-disappears next step” pattern.

Pattern 2: Tool narrowing (big toolset → small candidate)

The “tool search pool” pattern — when an agent is wired to MCP or a dynamic tool-discovery source, the available tool set may be hundreds; but each step the model should only see a small subset relevant to the current context (lower token cost, better selection accuracy):

prepareStep: async ({ experimental_context }) => {
  const ctx = experimental_context as MyContext;
  const pool = ctx.toolDiscovery;
  if (!pool?.active) return undefined;

  return {
    activeTools: pool.getActiveToolNames(),   // e.g. ["read_file", "grep", "tool_search"]
  };
},

Why not use tools: tools is registered at L1 (the full set); activeTools is a name filter — the SDK screens by name in prepareToolsAndToolChoice (dist/index.js:4330-4334), zero cost.

Pattern 3: Force tool usage (prevent premature exit)

Scenario: the agent has a todo list, and must absolutely not produce a final answer while items are pending:

prepareStep: async ({ experimental_context }) => {
  const ctx = experimental_context as MyContext;
  const hasPendingTodos = ctx.todos?.some(t => t.status !== 'done') ?? false;

  if (hasPendingTodos) {
    return { toolChoice: 'required' };   // Force a tool call this step
  }
  return undefined;
},

When to use toolChoice alternatives:

  • 'auto' (default): model chooses freely
  • 'required': must call a tool (which tool is up to the model)
  • 'none': disable tools (pure generation)
  • { type: 'tool', toolName: 'finish' }: force calling the specified tool

Typical uses: 'required' for “pending todos block exit”, { type: 'tool', toolName: 'complete' } at a subagent’s finalize phase to force result delivery, 'none' for “final summarization step must not call tools anymore”.
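Those three policies can be driven from a single per-agent phase value; a sketch, assuming a phase field on the context (the phase names and the "complete" tool are illustrative, not SDK API):

```typescript
// Sketch: pick the toolChoice policy from an assumed per-agent phase.
// "complete" is a hypothetical finalize tool registered in the L1 ToolSet.
type Phase = "working" | "finalize" | "summarize";
type ToolChoice = "auto" | "required" | "none" | { type: "tool"; toolName: string };

function toolChoiceFor(phase: Phase): ToolChoice {
  switch (phase) {
    case "working":   return "required";                              // pending todos block exit
    case "finalize":  return { type: "tool", toolName: "complete" };  // force result delivery
    case "summarize": return "none";                                  // final summary, no more tools
  }
}

const prepareStep = async ({ experimental_context }: { experimental_context: unknown }) => {
  const ctx = experimental_context as { phase: Phase };
  return { toolChoice: toolChoiceFor(ctx.phase) };
};
```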

Pattern 4: Dynamic model switching

prepareStep: async ({ model, messages, stepNumber }) => {
  // First 3 steps use a cheap model for planning, then switch to a strong model
  if (stepNumber < 3) {
    return { model: haikuModel };
  }
  return { model: sonnetModel };
},

Caveat: switching models does NOT switch tool schemas — tool definitions live at L1. Some providers are sensitive to tool schema format (OpenAI vs Anthropic). Before switching models, ensure the tool schema is compatible on both sides.

The mutate-vs-push trap (the canonical anti-pattern)

This is the direct application of Message Reference Model to prepareStep. Below is a real production case — the original version carried the pollution bug, and was later refactored to the push pattern:

Anti-example (polluting version)

// Anti-pattern: mutating objects inside initialMessages
prepareStep: async ({ messages }) => {
  const remindersText = ctx.getRemindersText();
  const lastMessage = messages[messages.length - 1];

  if (lastMessage.role === 'user') {
    // WRONG — direct mutation permanently pollutes this message in initialMessages
    lastMessage.content += `\n${remindersText}`;
  } else {
    // OK — push a new message, gone next step
    messages.push({ role: 'user', content: remindersText });
  }
  return { messages };
},

What’s wrong:

  • lastMessage is a reference to stepInputMessages[k] — the same object as the user message in initialMessages.
  • lastMessage.content += ... creates a new string and assigns it to .content — but the field is overwritten on the original object, which still sits in initialMessages.
  • After step 1 pollutes it, step 2/3/4 each rebuild stepInputMessages = [...initialMessages, ...responseMessages] and spread in this polluted message — the reminder is always there.
  • Worse, each step re-appends the reminder to the already-polluted message, so duplicate copies accumulate step after step.
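The mechanics here are plain JavaScript reference semantics and can be reproduced without the SDK; a standalone sketch with simplified message objects:

```typescript
// Standalone reproduction of the trap: spreading an array copies the
// array, not the objects inside it, so mutation leaks across "steps".
type Msg = { role: string; content: string };
const initialMessages: Msg[] = [{ role: "user", content: "hello" }];

// Step 1 rebuilds its input array and mutates the last message:
const step1Input = [...initialMessages];
step1Input[step1Input.length - 1].content += "\n<system-reminder>";

// Step 2 rebuilds a fresh array, but it spreads in the SAME object,
// so the reminder is still there: the pollution is permanent.
const step2Input = [...initialMessages];

// Pushing instead creates a new object in a new array; the next
// rebuild from initialMessages never sees it.
const pushed = [...initialMessages, { role: "user", content: "<system-reminder>" }];
const step3Input = [...initialMessages];   // the pushed message has vanished
```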

Correct version (uniform push)

prepareStep: async ({ messages }) => {
  const remindersText = ctx.getRemindersText();
  if (!remindersText) return undefined;

  // OK — regardless of whether last is user, push a new message
  return {
    messages: [
      ...messages,
      { role: 'user', content: `[system directive — not user input]\n${remindersText}` },
    ],
  };
},

Why this works:

  • A new user message object is created each step; visible this step only.
  • Next step’s stepInputMessages rebuild is pristine [...initialMessages, ...responseMessages]; the new message vanishes.
  • No accumulation risk, no shared-reference pollution.

One-sentence verdict

Never mutate the fields of any message object you receive from prepareStep’s messages parameter.

| Operation | Verdict |
| --- | --- |
| msg.content += 'x' | Pollutes |
| msg.content.push(part) | Pollutes |
| msg.metadata = {...} | Pollutes |
| msg.role = 'system' | Pollutes |
| return { messages: [...messages, newMsg] } | Safe |
| return { messages: messages.filter(...) } | Safe (new array, no element mutation) |
| return { messages: messages.map((m, i) => i === N ? {...m, content: 'new'} : m) } | Safe (new object replaces target object) |
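The safe operations all follow one rule: build new arrays and new objects, never write through a received reference. A compact sketch, with `received` standing in for the messages parameter handed to prepareStep:

```typescript
// The three safe edit patterns side by side; originals stay untouched.
type Msg = { role: string; content: string };
const received: Msg[] = [
  { role: "user", content: "question" },
  { role: "assistant", content: "draft" },
];

const appended = [...received, { role: "user", content: "reminder" }]; // push into a new array
const filtered = received.filter(m => m.role !== "assistant");         // new array, same objects
const replaced = received.map((m, i) =>
  i === 1 ? { ...m, content: "edited" } : m,                           // new object replaces target
);

// In every case the received objects are left alone:
// received[1].content is still "draft".
```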

Step-transition timeline (focused view)

sequenceDiagram
  participant Loop as streamText loop
  participant P as prepareStep
  participant Model
  rect rgba(200, 220, 255, 0.2)
    Note over Loop: Step n
    Loop->>Loop: stepInputMessages =<br/>[...initialMessages, ...responseMessages]
    Loop->>P: prepareStep({ messages: stepInputMessages })
    Note over P: WRONG — if you mutate messages[k].content:<br/>initialMessages[k] permanently polluted
    P->>Loop: prepareStepResult.messages<br/>(new array for this step)
    Loop->>Model: convertToLanguageModelPrompt(messages)
    Note over Loop: OK — returned array is for this step only;<br/>not written back to initialMessages
    Model-->>Loop: assistant + tool<br/>(pushed to responseMessages)
  end
  rect rgba(255, 220, 200, 0.2)
    Note over Loop: Step n+1
    Loop->>Loop: stepInputMessages rebuilt<br/>= [...initialMessages, ...responseMessages]
    Note over Loop: Arrays returned from step n vanish<br/>but mutated objects persist!
    Loop->>P: prepareStep({...})
  end

Performance notes

prepareStep runs before every step’s model call and is blocking — the model call won’t start until your Promise resolves.

Common performance pitfalls:

  • Sync I/O (fs.readFileSync, sync DB queries) — directly blocks the event loop.
  • Token counting (@anthropic-ai/tokenizer or tiktoken) — the first load of the tokenizer's encoding data takes ~100ms; counting every step adds up quickly. Practical fix: cache the prior round’s token estimate and only recount the message segments that changed.
  • LLM calls (for summarization during compaction) — a single LLM compaction call takes seconds; doubles your per-step wait time. Practical fix: separate “decide whether to compact” (cheap sync by token estimate, only when threshold is hit) from “actually compact” (only then goes to LLM).

Rule of thumb: keep prepareStep’s total time (sync + async) under 200ms. Any higher and each step adds perceptible latency — a 20-step task adds 4+ seconds.

Division of labor vs prepareCall

| Scenario | prepareCall (once per call) | prepareStep (once per step) |
| --- | --- | --- |
| Pick model by user tier | Use: decide once at call start | Overkill: recomputes each step |
| Dynamically switch model by context | Can’t: unknown at call start | Use: decide each step from steps / messages |
| Inject system-reminder | Can’t: reminder is runtime state | Use: the correct scenario |
| Compact messages | Can’t: one-shot at call start, no reaction to growth | Use: compact only when token threshold is hit |
| Replace entire tool set | Use: prepareStep can’t add tools | Can’t: only filters, never adds |

Further reading

Related SDK chapters

Zapvol landing reference

  • Context Compaction — three-tier compaction built on prepareStep
  • Tool Search — dynamic activeTools filtering via prepareStep
  • packages/backend/src/agent/agent-stream.ts — assembly location for the prepareStep implementation