Agent Patterns
From augmented LLMs to autonomous agents — Anthropic's taxonomy of agentic system patterns, building blocks, and when to use each
The Augmented LLM
Every agentic system starts from the same building block: an LLM enhanced with retrieval, tools, and memory. The model actively generates its own search queries, selects appropriate tools, and determines what information to retain. This augmented LLM is the atom from which all patterns are composed.
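The triad of retrieval, tools, and memory can be sketched in a few lines. This is a minimal illustration, not a real implementation: the `llm` function is a stand-in for an actual model API call, and the class shape is an assumption for clarity.

```python
from dataclasses import dataclass, field

def llm(prompt: str) -> str:
    """Stand-in for a real model call (e.g. an LLM API request)."""
    return f"response to: {prompt}"

@dataclass
class AugmentedLLM:
    """The basic building block: a model plus retrieval, tools, and memory."""
    tools: dict = field(default_factory=dict)   # callables the model may select
    memory: list = field(default_factory=list)  # information the model retains

    def run(self, query: str) -> str:
        # Retrieval: pull context the model previously chose to keep.
        context = " | ".join(self.memory[-3:])
        answer = llm(f"{context} {query}".strip())
        self.memory.append(answer)  # memory: the model decides what to retain
        return answer
```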
Workflows vs. Agents
Anthropic draws a clear distinction between two categories of agentic systems:
| Dimension | Workflows | Agents |
|---|---|---|
| Control flow | Predefined code paths orchestrate LLM calls | LLM dynamically directs its own process and tool usage |
| Predictability | High — same input follows same path | Variable — model decides next steps based on context |
| Best for | Well-defined tasks with consistent structure | Open-ended problems with unpredictable step counts |
| Trade-off | Less flexible, more reliable | More capable, higher cost, potential compounding errors |
The key insight: don’t reach for agents when a workflow suffices. Workflows offer predictability and consistency for well-defined tasks. Agents are the better option when flexibility and model-driven decision-making are needed at scale.
Five Workflow Patterns
1. Prompt Chaining
Decompose a task into sequential steps. Each LLM call processes the output of the previous one, with optional programmatic validation gates between steps.
Structure: LLM → gate → LLM → gate → LLM
When to use: Tasks decomposable into fixed subtasks where you trade latency for higher accuracy.
Example: Generate marketing copy → validate tone → translate to target language.
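The chain-with-gates structure can be sketched as follows. The `llm` function is a deterministic stand-in for a real model call, and the `(template, gate)` pair format is an assumption chosen for brevity.

```python
def llm(prompt: str) -> str:
    """Stand-in for a real model call."""
    return f"OUT({prompt})"

def chain(input_text: str, steps: list) -> str:
    """Run sequential LLM calls with optional validation gates between steps.

    `steps` is a list of (prompt_template, gate) pairs; each gate is a
    predicate on that step's output, or None for no check.
    """
    result = input_text
    for template, gate in steps:
        result = llm(template.format(result))
        if gate is not None and not gate(result):
            # A failed gate stops the chain instead of propagating bad output.
            raise ValueError(f"gate failed on: {result!r}")
    return result
```

Each step consumes the previous step's output, trading extra latency for a checkpoint where bad intermediate results can be caught programmatically.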
2. Routing
Classify the input first, then direct it to a specialized handler. Each handler can have its own prompt, tools, and even model.
Structure: Input → classifier → specialized handler A | B | C
When to use: Complex tasks with distinct categories requiring different handling.
Example: Customer service — route general inquiries, refund requests, and technical issues to different specialized prompts.
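A routing sketch for the customer-service example might look like this. The keyword classifier stands in for a real model call, and the handler prompts are hypothetical.

```python
def llm(prompt: str) -> str:
    """Stand-in classifier; a real version would be a model call."""
    text = prompt.lower()
    if "refund" in text:
        return "refund"
    if "error" in text or "crash" in text:
        return "technical"
    return "general"

# Each handler can have its own prompt, tools, and even its own model.
HANDLERS = {
    "refund":    lambda q: f"[refund prompt] {q}",
    "technical": lambda q: f"[technical prompt] {q}",
    "general":   lambda q: f"[general prompt] {q}",
}

def route(query: str) -> str:
    category = llm(f"Classify this support request: {query}")
    # Fall back to the general handler on an unrecognized category.
    return HANDLERS.get(category, HANDLERS["general"])(query)
```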
3. Parallelization
Run multiple LLM calls simultaneously, then aggregate. Two variants:
- Sectioning — split independent subtasks across parallel calls
- Voting — run the same task multiple times for diverse perspectives
When to use: Speed through division, or higher confidence through multiple perspectives.
Example: Code review — one call checks for security vulnerabilities, another for performance, a third for style.
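Both variants can be sketched with a thread pool, since LLM calls are I/O-bound. The `llm` function is again a deterministic stand-in for a real model call.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def llm(prompt: str) -> str:
    """Stand-in for a real model call."""
    return "unsafe" if "security" in prompt else "ok"

def sectioning(code: str, aspects: list) -> dict:
    """Split independent review aspects across parallel calls."""
    with ThreadPoolExecutor() as pool:
        results = pool.map(lambda a: llm(f"Review {a} of:\n{code}"), aspects)
    return dict(zip(aspects, results))

def voting(prompt: str, n: int = 3) -> str:
    """Run the same task n times and take the majority answer."""
    with ThreadPoolExecutor() as pool:
        votes = list(pool.map(llm, [prompt] * n))
    return Counter(votes).most_common(1)[0][0]
```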
4. Orchestrator–Workers
A central LLM decomposes the task dynamically, delegates subtasks to worker LLMs, then synthesizes results. Unlike parallelization, the subtasks are not predefined — the orchestrator determines them based on the specific input.
Structure: Input → orchestrator → [worker₁, worker₂, ...workerₙ] → orchestrator → output
When to use: Complex tasks where the required subtasks can’t be predicted in advance.
Example: Multi-file code changes where the orchestrator identifies which files need modification and delegates each to a worker.
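The three phases (decompose, delegate, synthesize) can be sketched as below. The prompt prefixes and the newline-separated plan format are assumptions; the `llm` function is a stand-in that plays all three roles deterministically.

```python
def llm(prompt: str) -> str:
    """Stand-in for a real model call, covering all three roles."""
    if prompt.startswith("Plan:"):
        # A real orchestrator would emit subtasks specific to the input.
        return "update app.py\nupdate tests.py"
    if prompt.startswith("Do:"):
        return f"done: {prompt[4:]}"
    return f"summary of: {prompt}"

def orchestrate(task: str) -> str:
    # 1. Orchestrator decomposes the task into input-specific subtasks.
    subtasks = llm(f"Plan: {task}").splitlines()
    # 2. Workers handle each subtask (these could run in parallel).
    results = [llm(f"Do: {s}") for s in subtasks]
    # 3. Orchestrator synthesizes the worker output.
    return llm("; ".join(results))
```

The key difference from parallelization is that `subtasks` is produced by the model at runtime rather than fixed in code.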
5. Evaluator–Optimizer
One LLM generates; another evaluates and provides feedback. The loop continues until quality criteria are met.
Structure: Generator ⇄ Evaluator (loop until pass)
When to use: Clear evaluation criteria exist, and iterative refinement yields measurable improvement.
Example: Literary translation — the generator produces a translation, the evaluator checks for nuance and cultural accuracy, and the loop refines the draft until the evaluator passes it.
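The generator-evaluator loop can be sketched as follows. Both functions are deterministic stand-ins for model calls; the pass/fail-plus-feedback return shape and the round cap are assumptions.

```python
def generate(task: str, feedback: str) -> str:
    """Stand-in generator; a real version would be a model call."""
    return task.upper() if "formal" in feedback else task

def evaluate(draft: str) -> tuple:
    """Stand-in evaluator: returns (passed, feedback for the next round)."""
    return (draft.isupper(), "make it more formal")

def refine(task: str, max_rounds: int = 5) -> str:
    feedback = ""
    for _ in range(max_rounds):
        draft = generate(task, feedback)
        ok, feedback = evaluate(draft)
        if ok:  # loop until quality criteria are met
            return draft
    return draft  # give up after the round cap rather than loop forever
```

The round cap matters in practice: without clear, reachable evaluation criteria the loop can burn tokens indefinitely.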
The Autonomous Agent
When none of the structured workflow patterns fit — when the problem is open-ended, the number of steps is unpredictable, and no hardcoded path will work — you need an autonomous agent.
An autonomous agent is fundamentally simple:
```
while not done:
    observe environment (tool results, user feedback)
    decide next action
    execute action via tools
    evaluate result
```
The model maintains control over what to do next and when to stop. It gains ground truth from tool results and code execution at each step, using that feedback to plan subsequent actions.
Key considerations:
- Higher cost — open-ended loops consume more tokens than fixed workflows
- Compounding errors — each step’s mistakes propagate to subsequent steps
- Guardrails required — sandbox execution, permission boundaries, and human-in-the-loop checkpoints
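The loop and its guardrails can be sketched together. The `llm` policy function and the `TOOLS` registry are stand-ins; the step budget illustrates one guardrail from the list above.

```python
def llm(observation: str) -> tuple:
    """Stand-in policy: returns (action, argument). A real version is a model call."""
    if "result" in observation:
        return ("finish", observation)  # the model decides when to stop
    return ("search", "docs")

TOOLS = {"search": lambda arg: f"result for {arg}"}

def run_agent(task: str, max_steps: int = 10) -> str:
    observation = task
    for _ in range(max_steps):  # guardrail: bounded number of steps
        action, arg = llm(observation)
        if action == "finish":
            return arg
        # Ground truth from tool execution feeds the next decision.
        observation = TOOLS[action](arg)
    # Escalation point: hand off to a human instead of looping forever.
    raise RuntimeError("step budget exhausted")
```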
Tool Design: The Agent-Computer Interface
Tools are the agent’s hands. Poor tool design is one of the most common causes of agent failure — not because the model is insufficiently intelligent, but because the interface is ambiguous.
Design principles
- Self-contained descriptions — each tool’s docstring should explain exactly when and how to use it, with edge cases and input format requirements
- Minimal overlap — if a human engineer can’t definitively say which tool to use in a given situation, neither can the model
- Absolute over relative — concrete identifiers (absolute file paths) outperform relative references
- Think before commit — provide “thinking” tokens so the model can reason before making irreversible tool calls
- Format follows training — keep tool input/output formats close to what appears naturally in training data
“Put yourself in the model’s shoes. Is it obvious how to use this tool, based on the description and parameters alone, or would you need to think carefully about it?”
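A hypothetical tool definition applying these principles might look like this. The `read_file` tool, its schema shape, and the validator are all illustrative assumptions, not any particular API's format.

```python
# Hypothetical tool definition: self-contained description, edge cases,
# explicit input format, and absolute paths enforced.
READ_FILE_TOOL = {
    "name": "read_file",
    "description": (
        "Read a UTF-8 text file and return its full contents. "
        "Use this before editing any file. "
        "The path MUST be absolute (e.g. /repo/src/main.py); "
        "relative paths are rejected. Fails on binary files."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "Absolute file path."}
        },
        "required": ["path"],
    },
}

def validate_input(tool: dict, args: dict) -> bool:
    """Minimal check mirroring the 'absolute over relative' principle."""
    has_required = all(k in args for k in tool["input_schema"]["required"])
    return has_required and args.get("path", "").startswith("/")
```

Note how the description answers the quoted question above on its own: when to use the tool, what format the input takes, and how it fails.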
When NOT to Build an Agent
The most important discipline is knowing when not to build one. Escalate complexity one step at a time:
- Start with a single, optimized prompt
- Add retrieval and in-context examples
- Add tools for external interaction
- Introduce workflow patterns only when single-call approaches hit their limits
- Deploy autonomous agents only when workflows can’t handle the flexibility required
“Success in the LLM space isn’t about building the most sophisticated system. It’s about building the right system for your needs.”
Each layer of complexity adds cost, latency, and failure modes. The right harness is the simplest one that solves the problem.
Source
Building Effective Agents — Anthropic, December 2024.