Workflows vs Agents: A Practical Decision Framework

Not every AI system needs autonomous agents. Learn when to use deterministic workflows, when to deploy agents, and how to choose the right architecture for your use case—with decision frameworks, trade-off analysis, and real-world examples.

8 min read

The Most Common Mistake in AI Architecture

Teams building AI systems frequently make the same mistake: reaching for agents when a workflow would suffice.

The allure is understandable. Agents feel futuristic. They handle ambiguity. They adapt. But that flexibility comes with costs—higher latency, increased token usage, unpredictable behavior, and harder debugging.

The truth is simpler than the hype suggests:

Most production AI systems should be workflows. Agents are for when you genuinely don't know the steps in advance.

This guide provides a practical framework for choosing between workflows and agents—and shows you how to combine them effectively.

This post builds on concepts from Anthropic's excellent Building Effective Agents guide, extending them with cost analysis, migration strategies, and framework-specific guidance.


Defining the Terms

Before diving into trade-offs, let's establish clear definitions.

What is a Workflow?

A workflow is a predefined sequence of steps where the control flow is determined by your code, not the LLM.

┌─────────────────────────────────────────────────────────────────────────┐
│                           WORKFLOW                                       │
│                                                                          │
│   Your code controls what happens next. The LLM executes steps.         │
│                                                                          │
│   ┌─────────┐      ┌─────────┐      ┌─────────┐      ┌─────────┐       │
│   │ Step 1  │ ──→  │ Step 2  │ ──→  │ Step 3  │ ──→  │ Output  │       │
│   │ (LLM)   │      │ (Code)  │      │ (LLM)   │      │         │       │
│   └─────────┘      └─────────┘      └─────────┘      └─────────┘       │
│                                                                          │
│   Examples:                                                              │
│   • Extract data → Validate → Transform → Store                         │
│   • Classify intent → Route → Generate response                         │
│   • Summarize → Translate → Format                                       │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘

Key characteristics:

  • Steps are known at design time
  • Branching is explicit (if/else in your code)
  • Predictable execution path
  • Number of LLM calls is fixed for each path
  • Easy to test, debug, and monitor
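
To make the definition concrete, here is a minimal sketch of a three-step workflow. The `LLM` type, `run_workflow`, and `fake_llm` are hypothetical names introduced for illustration; a real system would pass in a client for whichever model provider it uses. The point is only that the order of steps and the branch live in ordinary code.

```python
from typing import Callable

# Hypothetical interface: prompt in, completion out.
LLM = Callable[[str], str]


def run_workflow(document: str, llm: LLM) -> str:
    """Fixed path: extract (LLM) -> validate (code) -> summarize (LLM)."""
    # Step 1 (LLM): pull out the facts we care about.
    extracted = llm(f"Extract the key facts from:\n{document}")

    # Step 2 (code): the branch is decided here, not by the model.
    if not extracted.strip():
        raise ValueError("extraction returned nothing")

    # Step 3 (LLM): summarize the validated extraction.
    return llm(f"Summarize in one sentence:\n{extracted}")


def fake_llm(prompt: str) -> str:
    """Stand-in model for local runs: upper-cases the last prompt line."""
    return prompt.splitlines()[-1].upper()
```

Whatever the input, a successful run makes exactly two model calls, which is what makes the path easy to test and to price.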

What is an Agent?

An agent is a system where the LLM decides what to do next. It operates in a loop, choosing actions until it determines the task is complete.

┌─────────────────────────────────────────────────────────────────────────┐
│                            AGENT                                         │
│                                                                          │
│   The LLM controls what happens next. Your code provides tools.         │
│                                                                          │
│                         ┌──────────────┐                                 │
│                         │              │                                 │
│                         ▼              │                                 │
│   ┌─────────┐      ┌─────────┐      ┌─────────┐                         │
│   │  Task   │ ──→  │  LLM    │ ──→  │  Tool   │                         │
│   │  Input  │      │ Decides │      │ Execute │                         │
│   └─────────┘      └─────────┘      └─────────┘                         │
│                         │              │                                 │
│                         │    ┌─────────┘                                 │
│                         ▼    │                                           │
│                    ┌─────────┐                                           │
│                    │  Done?  │ ──→ No ──→ (loop back)                   │
│                    └─────────┘                                           │
│                         │                                                │
│                        Yes                                               │
│                         ▼                                                │
│                    ┌─────────┐                                           │
│                    │ Output  │                                           │
│                    └─────────┘                                           │
│                                                                          │
│   Examples:                                                              │
│   • Research a topic (search → read → search more → synthesize)         │
│   • Debug code (read → hypothesize → test → fix → verify)               │
│   • Plan a trip (unknown number of searches, bookings, comparisons)     │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘

Key characteristics:

  • Steps are determined at runtime by the LLM
  • Dynamic execution path
  • Variable number of LLM calls (bounded by max iterations)
  • Can handle novel situations
  • Harder to test, debug, and predict costs
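
The loop described above can be sketched in a few lines. Everything named here (`TOOLS`, `Policy`, `run_agent`, `scripted_policy`) is an assumption made for illustration: the policy function stands in for the LLM call that picks the next action, and the tools are stubs.

```python
from typing import Callable

# Hypothetical tool registry: name -> function taking and returning a string.
TOOLS: dict[str, Callable[[str], str]] = {
    "search": lambda q: f"results for {q!r}",
    "read": lambda url: f"contents of {url}",
}

# Hypothetical policy signature: history in, (action, argument) out.
Policy = Callable[[list[str]], tuple[str, str]]


def run_agent(task: str, decide: Policy, max_iterations: int = 10) -> str:
    """Let the policy choose actions until it finishes or the budget runs out."""
    history = [f"task: {task}"]
    for _ in range(max_iterations):
        action, arg = decide(history)  # the model picks the next step
        if action == "finish":
            return arg
        if action not in TOOLS:
            # Feed the mistake back so the policy can recover.
            history.append(f"error: unknown tool {action!r}")
            continue
        history.append(f"{action} -> {TOOLS[action](arg)}")
    raise RuntimeError("agent hit max_iterations without finishing")


def scripted_policy(history: list[str]) -> tuple[str, str]:
    """Stand-in for an LLM: search once, then finish."""
    if len(history) == 1:
        return "search", "solar adoption"
    return "finish", f"answer based on {len(history) - 1} observation(s)"
```

The `max_iterations` cap is the only hard bound on cost here; the number of calls below that cap is up to the policy.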

The Spectrum: It's Not Binary

The real world isn't a clean choice between "workflow" and "agent." There's a spectrum of patterns with increasing autonomy:

┌─────────────────────────────────────────────────────────────────────────┐
│                    THE AUTONOMY SPECTRUM                                 │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  Less Autonomy                                              More Autonomy│
│  More Control                                               Less Control │
│                                                                          │
│  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐       │
│  │ Single  │  │  Chain  │  │ Router  │  │  State  │  │  Agent  │       │
│  │  Call   │  │         │  │         │  │ Machine │  │         │       │
│  └─────────┘  └─────────┘  └─────────┘  └─────────┘  └─────────┘       │
│       │            │            │            │            │             │
│       ▼            ▼            ▼            ▼            ▼             │
│   One prompt   Sequential   LLM picks    LLM triggers  LLM decides     │
│   one output   LLM calls    which path   transitions   everything      │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  Predictable ◄─────────────────────────────────────────► Flexible       │
│  Cheap       ◄─────────────────────────────────────────► Expensive      │
│  Fast        ◄─────────────────────────────────────────► Slow           │
│  Testable    ◄─────────────────────────────────────────► Unpredictable  │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘

Let's examine each pattern:

1. Single LLM Call

The simplest pattern. One prompt, one response.

┌─────────────────────────────────────────────────────────────────────────┐
│                        SINGLE CALL                                       │
│                                                                          │
│   Input ──────────────→ LLM ──────────────→ Output                      │
│                                                                          │
│   Use when:                                                              │
│   • Task is self-contained                                               │
│   • No external data needed                                              │
│   • Classification, simple generation, formatting                        │
│                                                                          │
│   Examples:                                                              │
│   • Sentiment analysis                                                   │
│   • Text summarization                                                   │
│   • Code explanation                                                     │
│   • Translation                                                          │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘

2. Chain (Sequential Workflow)

Multiple LLM calls in a fixed sequence. Each step's output feeds the next.

┌─────────────────────────────────────────────────────────────────────────┐
│                          CHAIN                                           │
│                                                                          │
│   ┌─────────┐     ┌─────────┐     ┌─────────┐     ┌─────────┐          │
│   │ Extract │ ──→ │Validate │ ──→ │Summarize│ ──→ │ Format  │          │
│   │  (LLM)  │     │ (Code)  │     │  (LLM)  │     │  (LLM)  │          │
│   └─────────┘     └─────────┘     └─────────┘     └─────────┘          │
│                                                                          │
│   Use when:                                                              │
│   • Steps are known and fixed                                            │
│   • Each step has a clear input/output                                   │
│   • Quality improves with decomposition                                  │
│                                                                          │
│   Examples:                                                              │
│   • Document processing pipeline                                         │
│   • Content generation with review                                       │
│   • Multi-stage analysis                                                 │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
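
As a rough sketch, a chain can be expressed as an ordered list of named steps, with each intermediate output recorded for later inspection. `Step`, `run_chain`, and the three lambda steps are hypothetical; plain string functions stand in for the LLM-backed steps.

```python
from typing import Callable

# Hypothetical step type: (name, function from text to text).
Step = tuple[str, Callable[[str], str]]


def run_chain(steps: list[Step], text: str) -> tuple[str, list[tuple[str, str]]]:
    """Run steps in order; return the final output and a per-step trace."""
    trace: list[tuple[str, str]] = []
    for name, fn in steps:
        text = fn(text)  # each step's output feeds the next
        trace.append((name, text))
    return text, trace


# Plain functions stand in for LLM calls in this sketch.
steps: list[Step] = [
    ("extract", lambda t: t.split(":", 1)[1].strip()),
    ("validate", lambda t: t if t else "<empty>"),
    ("format", lambda t: f"- {t}"),
]

final, trace = run_chain(steps, "note: ship Friday")
```

Keeping the trace is a design choice rather than a requirement: it is what lets a failure be pinned to a specific step.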

3. Router (Conditional Workflow)

An LLM classifies the input, then deterministic code routes it to the appropriate handler.

┌─────────────────────────────────────────────────────────────────────────┐
│                          ROUTER                                          │
│                                                                          │
│                      ┌─────────────┐                                     │
│                      │  Classify   │                                     │
│                      │   Intent    │                                     │
│                      │    (LLM)    │                                     │
│                      └─────────────┘                                     │
│                             │                                            │
│              ┌──────────────┼──────────────┐                            │
│              ▼              ▼              ▼                             │
│        ┌──────────┐  ┌──────────┐  ┌──────────┐                         │
│        │ Billing  │  │ Technical│  │  Sales   │                         │
│        │ Handler  │  │ Handler  │  │ Handler  │                         │
│        └──────────┘  └──────────┘  └──────────┘                         │
│                                                                          │
│   Use when:                                                              │
│   • Different input types need different handling                        │
│   • You can enumerate the categories                                     │
│   • Each path is relatively simple                                       │
│                                                                          │
│   Examples:                                                              │
│   • Customer support routing                                             │
│   • Multi-domain Q&A                                                     │
│   • Intent-based chatbots                                                │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
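
A router along these lines might look like the sketch below. `HANDLERS`, `route`, and `keyword_classifier` are hypothetical names; the keyword classifier is a stand-in for the LLM classification call, and the fallback to a default handler is an added assumption for labels outside the enumerated set.

```python
from typing import Callable

# Hypothetical handlers, one per enumerated category.
HANDLERS: dict[str, Callable[[str], str]] = {
    "billing": lambda msg: f"[billing] {msg}",
    "technical": lambda msg: f"[technical] {msg}",
    "sales": lambda msg: f"[sales] {msg}",
}


def route(message: str, classify: Callable[[str], str], default: str = "technical") -> str:
    """Classify with the model, then dispatch with ordinary code."""
    label = classify(message).strip().lower()
    # A model can return a label outside the known set; fall back explicitly.
    handler = HANDLERS.get(label, HANDLERS[default])
    return handler(message)


def keyword_classifier(message: str) -> str:
    """Stand-in for the LLM classification call."""
    text = message.lower()
    if "invoice" in text or "refund" in text:
        return "billing"
    if "pricing" in text or "demo" in text:
        return "sales"
    return "technical"
```

The model only produces a label; which code runs for that label is fixed at design time.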

4. State Machine (Complex Workflow)

The LLM can trigger transitions between states, but the states and the valid transitions are predefined.

┌─────────────────────────────────────────────────────────────────────────┐
│                       STATE MACHINE                                      │
│                                                                          │
│                    ┌─────────────────┐                                   │
│                    │     START       │                                   │
│                    └────────┬────────┘                                   │
│                             │                                            │
│                             ▼                                            │
│                    ┌─────────────────┐                                   │
│        ┌──────────│   GATHERING     │◄─────────┐                        │
│        │          │   INFORMATION   │          │                        │
│        │          └────────┬────────┘          │                        │
│        │                   │                   │                        │
│        │ need more info    │ have enough       │ clarification          │
│        │                   ▼                   │                        │
│        │          ┌─────────────────┐          │                        │
│        └─────────►│   PROCESSING    │──────────┘                        │
│                   └────────┬────────┘                                   │
│                            │                                            │
│              ┌─────────────┼─────────────┐                              │
│              ▼             │             ▼                              │
│     ┌─────────────┐        │    ┌─────────────┐                         │
│     │   SUCCESS   │        │    │   FAILURE   │                         │
│     └─────────────┘        │    └─────────────┘                         │
│                            ▼                                            │
│                   ┌─────────────────┐                                   │
│                   │ HUMAN_ESCALATION│                                   │
│                   └─────────────────┘                                   │
│                                                                          │
│   Use when:                                                              │
│   • Process has well-defined stages                                      │
│   • Transitions depend on LLM judgment                                   │
│   • You need auditability of state changes                               │
│                                                                          │
│   Examples:                                                              │
│   • Order processing bots                                                │
│   • Onboarding flows                                                     │
│   • Approval workflows                                                   │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
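
One way to enforce "predefined states, LLM-triggered transitions" is to let the model propose the next state and have code accept it only if the transition is allowed. The transition table below loosely follows the diagram above; `TRANSITIONS`, `run_state_machine`, and the escalation-on-invalid-proposal rule are assumptions for this sketch.

```python
from typing import Callable

# Allowed transitions (illustrative, loosely based on the diagram).
TRANSITIONS: dict[str, set[str]] = {
    "START": {"GATHERING"},
    "GATHERING": {"GATHERING", "PROCESSING"},
    "PROCESSING": {"GATHERING", "SUCCESS", "FAILURE", "HUMAN_ESCALATION"},
}
TERMINAL = {"SUCCESS", "FAILURE", "HUMAN_ESCALATION"}


def run_state_machine(propose: Callable[[str], str], max_steps: int = 20) -> list[str]:
    """Follow proposed transitions, escalating on invalid ones or on timeout."""
    state, path, steps = "START", ["START"], 0
    while state not in TERMINAL:
        # Once the step budget is spent, stop asking the model and escalate.
        proposed = propose(state) if steps < max_steps else "HUMAN_ESCALATION"
        # Code, not the model, has the final say on whether a move is legal.
        state = proposed if proposed in TRANSITIONS[state] else "HUMAN_ESCALATION"
        path.append(state)
        steps += 1
    return path
```

The returned path doubles as the audit trail of state changes.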

5. Agent (Full Autonomy)

The LLM decides what to do, executes tools, observes the results, and repeats until done.

┌─────────────────────────────────────────────────────────────────────────┐
│                           AGENT                                          │
│                                                                          │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │                        AGENT LOOP                                │   │
│   │                                                                  │   │
│   │                    ┌──────────────┐                              │   │
│   │                    │   OBSERVE    │◄───────────────┐             │   │
│   │                    │ (get context)│                │             │   │
│   │                    └──────┬───────┘                │             │   │
│   │                           │                        │             │   │
│   │                           ▼                        │             │   │
│   │                    ┌──────────────┐                │             │   │
│   │                    │    THINK     │                │             │   │
│   │                    │  (reason)    │                │             │   │
│   │                    └──────┬───────┘                │             │   │
│   │                           │                        │             │   │
│   │                           ▼                        │             │   │
│   │                    ┌──────────────┐         ┌──────┴───────┐     │   │
│   │                    │     ACT      │────────►│    UPDATE    │     │   │
│   │                    │ (use tools)  │         │   (memory)   │     │   │
│   │                    └──────┬───────┘         └──────────────┘     │   │
│   │                           │                                      │   │
│   │                           ▼                                      │   │
│   │                    ┌──────────────┐                              │   │
│   │                    │    DONE?     │────► Yes ────► Output        │   │
│   │                    └──────────────┘                              │   │
│   │                                                                  │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                                                                          │
│   Available Tools:                                                       │
│   ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐               │
│   │ Search │ │  Read  │ │ Write  │ │Execute │ │  API   │               │
│   └────────┘ └────────┘ └────────┘ └────────┘ └────────┘               │
│                                                                          │
│   Use when:                                                              │
│   • You don't know the steps in advance                                  │
│   • Task requires exploration and adaptation                             │
│   • Multiple tools may be needed in varying order                        │
│                                                                          │
│   Examples:                                                              │
│   • Research tasks                                                       │
│   • Debugging and coding                                                 │
│   • Open-ended problem solving                                           │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘

The Decision Framework

Here's a practical framework for choosing the right pattern:

The Core Question

Ask yourself:

"Do I know the steps needed to complete this task?"

┌─────────────────────────────────────────────────────────────────────────┐
│                      DECISION TREE                                       │
│                                                                          │
│                 Do you know the steps?                                   │
│                         │                                                │
│            ┌────────────┴────────────┐                                  │
│            ▼                         ▼                                   │
│           YES                        NO                                  │
│            │                         │                                   │
│            ▼                         ▼                                   │
│   ┌─────────────────┐      ┌─────────────────┐                          │
│   │    WORKFLOW     │      │      AGENT      │                          │
│   └────────┬────────┘      └─────────────────┘                          │
│            │                                                             │
│            ▼                                                             │
│   Are there multiple paths?                                              │
│            │                                                             │
│   ┌────────┴────────┐                                                   │
│   ▼                 ▼                                                    │
│  YES                NO                                                   │
│   │                 │                                                    │
│   ▼                 ▼                                                    │
│  ROUTER           CHAIN                                                  │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
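
The tree is small enough to write down as a function. `choose_pattern` is a hypothetical helper that encodes only the two questions shown in the diagram.

```python
def choose_pattern(steps_known: bool, multiple_paths: bool = False) -> str:
    """Return the pattern the decision tree suggests."""
    if not steps_known:
        return "agent"
    # Known steps: the only remaining question is whether the path branches.
    return "router" if multiple_paths else "chain"
```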

Detailed Decision Matrix

| Factor | Prefer Workflow | Prefer Agent |
|---|---|---|
| Steps known? | Yes, I can enumerate them | No, depends on input/context |
| Predictability needed? | High (compliance, SLAs) | Low (best effort OK) |
| Cost sensitivity | High (pay per token matters) | Low (quality over cost) |
| Latency requirements | Strict (< 2 seconds) | Flexible (30+ seconds OK) |
| Failure tolerance | Low (must succeed) | High (can retry/escalate) |
| Debugging needs | High (need to trace issues) | Low (output matters most) |
| Task complexity | Decomposable into steps | Requires exploration |
| Domain | Narrow and well-defined | Broad or open-ended |

The Workflow Suitability Score

Score your use case (1-5 for each factor, higher = more suitable for workflows):

┌─────────────────────────────────────────────────────────────────────────┐
│                  WORKFLOW SUITABILITY SCORECARD                          │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  Factor                                        Score (1-5)               │
│  ─────────────────────────────────────────────────────────               │
│                                                                          │
│  Predictability of steps                       [ ]                       │
│  (5 = always same steps, 1 = varies wildly)                              │
│                                                                          │
│  Latency sensitivity                           [ ]                       │
│  (5 = must be fast, 1 = can wait minutes)                                │
│                                                                          │
│  Cost sensitivity                              [ ]                       │
│  (5 = every token counts, 1 = cost no object)                            │
│                                                                          │
│  Compliance/audit requirements                 [ ]                       │
│  (5 = heavily regulated, 1 = internal tool)                              │
│                                                                          │
│  Debuggability needs                           [ ]                       │
│  (5 = must trace every decision, 1 = just works)                         │
│                                                                          │
│  ─────────────────────────────────────────────────────────               │
│                                                                          │
│  TOTAL: _____ / 25                                                       │
│                                                                          │
│  20-25: Strong workflow candidate                                        │
│  12-19: Consider hybrid (workflow with agent escape hatches)             │
│   5-11: Agent may be appropriate                                         │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
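
The scorecard arithmetic can be sketched as a small function. The factor keys and the name `workflow_suitability` are made up for this example; the thresholds are the ones from the scorecard.

```python
# Hypothetical keys for the five scorecard factors.
FACTORS = (
    "step_predictability",
    "latency_sensitivity",
    "cost_sensitivity",
    "compliance_requirements",
    "debuggability_needs",
)


def workflow_suitability(scores: dict[str, int]) -> tuple[int, str]:
    """Sum the five 1-5 scores and map the total to a recommendation."""
    missing = set(FACTORS) - set(scores)
    if missing:
        raise ValueError(f"missing factors: {sorted(missing)}")
    if any(not 1 <= scores[f] <= 5 for f in FACTORS):
        raise ValueError("each score must be between 1 and 5")

    total = sum(scores[f] for f in FACTORS)
    if total >= 20:
        verdict = "strong workflow candidate"
    elif total >= 12:
        verdict = "consider hybrid"
    else:
        verdict = "agent may be appropriate"
    return total, verdict
```

For instance, scores of 5, 4, 4, 5, and 3 total 21, which lands in the "strong workflow candidate" band.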

When to Use Workflows

Workflows shine in scenarios where predictability matters more than flexibility.

Ideal Workflow Use Cases

┌─────────────────────────────────────────────────────────────────────────┐
│                   WORKFLOW SWEET SPOTS                                   │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │ DOCUMENT PROCESSING                                              │    │
│  │                                                                  │    │
│  │ Extract → Validate → Classify → Transform → Store               │    │
│  │                                                                  │    │
│  │ Why workflow:                                                    │    │
│  │ • Same steps every time                                          │    │
│  │ • Must process thousands of documents                            │    │
│  │ • Errors need to be traceable                                    │    │
│  │ • Cost per document matters                                      │    │
│  └─────────────────────────────────────────────────────────────────┘    │
│                                                                          │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │ CUSTOMER SUPPORT ROUTING                                         │    │
│  │                                                                  │    │
│  │ Classify Intent → Route to Handler → Generate Response           │    │
│  │                                                                  │    │
│  │ Why workflow:                                                    │    │
│  │ • Finite number of intents (billing, technical, sales)           │    │
│  │ • Each handler is specialized and tested                         │    │
│  │ • SLAs require predictable response times                        │    │
│  │ • Need to track which handler resolved which issues              │    │
│  └─────────────────────────────────────────────────────────────────┘    │
│                                                                          │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │ CONTENT GENERATION PIPELINE                                      │    │
│  │                                                                  │    │
│  │ Research → Outline → Draft → Review → Edit → Publish            │    │
│  │                                                                  │    │
│  │ Why workflow:                                                    │    │
│  │ • Quality improves with dedicated steps                          │    │
│  │ • Human review can be inserted at known points                   │    │
│  │ • Each step can use specialized prompts/models                   │    │
│  │ • Progress is measurable                                         │    │
│  └─────────────────────────────────────────────────────────────────┘    │
│                                                                          │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │ DATA TRANSFORMATION                                              │    │
│  │                                                                  │    │
│  │ Parse → Clean → Enrich → Validate → Output                      │    │
│  │                                                                  │    │
│  │ Why workflow:                                                    │    │
│  │ • Transformations are deterministic                              │    │
│  │ • Schema validation catches errors early                         │    │
│  │ • Batch processing at scale                                      │    │
│  │ • Reproducible results                                           │    │
│  └─────────────────────────────────────────────────────────────────┘    │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘

Workflow Benefits

┌─────────────────────────────────────────────────────────────────────────┐
│                     WHY WORKFLOWS WIN                                    │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  PREDICTABLE COSTS                                                       │
│  ────────────────                                                        │
│  • Fixed number of LLM calls per execution                               │
│  • Can estimate cost per request up front                                │
│  • No runaway token consumption                                          │
│                                                                          │
│  Agent: 3-50 LLM calls (unpredictable)                                   │
│  Workflow: 4 LLM calls (always)                                          │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────   │
│                                                                          │
│  CONSISTENT LATENCY                                                      │
│  ─────────────────                                                       │
│  • Steps run in known order                                              │
│  • Can parallelize independent steps                                     │
│  • Meet SLAs reliably                                                    │
│                                                                          │
│  Agent: 2-60 seconds (variable)                                          │
│  Workflow: 3-4 seconds (consistent)                                      │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────   │
│                                                                          │
│  DEBUGGABILITY                                                           │
│  ────────────                                                            │
│  • Each step has clear input/output                                      │
│  • Failures localized to specific step                                   │
│  • Can replay individual steps                                           │
│  • Audit trail is straightforward                                        │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────   │
│                                                                          │
│  TESTABILITY                                                             │
│  ──────────                                                              │
│  • Unit test each step independently                                     │
│  • Mock LLM responses for deterministic tests                            │
│  • Integration test the pipeline                                         │
│  • Regression testing is meaningful                                      │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────   │
│                                                                          │
│  OPTIMIZABILITY                                                          │
│  ─────────────                                                           │
│  • Profile each step independently                                       │
│  • Use smaller/faster models for simple steps                            │
│  • Cache intermediate results                                            │
│  • Parallelize where possible                                            │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
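
The testability points above can be sketched as follows. This is a minimal illustration, assuming each workflow step receives the model as a plain callable; `summarize_step` and `fake_llm` are hypothetical names, not part of any framework.

```python
from typing import Callable

# Assumed convention: an "LLM" is anything that maps a prompt to a completion.
LLM = Callable[[str], str]


def summarize_step(text: str, llm: LLM) -> str:
    """One workflow step: build the prompt, call the model, clean the output."""
    prompt = f"Summarize in one sentence:\n{text}"
    return llm(prompt).strip()


def fake_llm(prompt: str) -> str:
    """Deterministic stand-in for a real model call, used only in tests."""
    assert prompt.startswith("Summarize in one sentence:")
    return "  Orders rose 12% in Q3.  "


def test_summarize_step() -> None:
    # The step is tested in isolation: no network, no randomness.
    assert summarize_step("long report text", fake_llm) == "Orders rose 12% in Q3."


test_summarize_step()
```

Because the step never constructs its own client, the same function runs unchanged against a real model in production and against the stub in a unit test.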

When to Use Agents

Agents excel when the path to a solution isn't known in advance.

Ideal Agent Use Cases

Code
┌─────────────────────────────────────────────────────────────────────────┐
│                      AGENT SWEET SPOTS                                   │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │ RESEARCH TASKS                                                   │    │
│  │                                                                  │    │
│  │ "What are the key factors affecting renewable energy adoption    │    │
│  │  in Southeast Asia?"                                             │    │
│  │                                                                  │    │
│  │ Why agent:                                                       │    │
│  │ • Don't know what sources will be relevant                       │    │
│  │ • May need to follow unexpected leads                            │    │
│  │ • Depth of research depends on what's found                      │    │
│  │ • Quality requires iterative refinement                          │    │
│  └─────────────────────────────────────────────────────────────────┘    │
│                                                                          │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │ CODING & DEBUGGING                                               │    │
│  │                                                                  │    │
│  │ "Fix the authentication bug in our login flow"                   │    │
│  │                                                                  │    │
│  │ Why agent:                                                       │    │
│  │ • Need to explore codebase to find relevant files                │    │
│  │ • Debugging requires hypothesis-test cycles                      │    │
│  │ • Solution may require changes in multiple files                 │    │
│  │ • Must verify fix actually works                                 │    │
│  └─────────────────────────────────────────────────────────────────┘    │
│                                                                          │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │ COMPLEX CUSTOMER SUPPORT                                         │    │
│  │                                                                  │    │
│  │ "My order was charged twice and I need a refund but the         │    │
│  │  original payment method expired"                                │    │
│  │                                                                  │    │
│  │ Why agent:                                                       │    │
│  │ • Multiple systems to query (orders, payments, customer)         │    │
│  │ • Resolution path depends on what's found                        │    │
│  │ • May need clarification from customer                           │    │
│  │ • Edge cases require judgment                                    │    │
│  └─────────────────────────────────────────────────────────────────┘    │
│                                                                          │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │ DATA ANALYSIS & EXPLORATION                                      │    │
│  │                                                                  │    │
│  │ "Analyze our sales data and find interesting patterns"           │    │
│  │                                                                  │    │
│  │ Why agent:                                                       │    │
│  │ • "Interesting" is subjective and discovered                     │    │
│  │ • May need to try multiple analysis approaches                   │    │
│  │ • Follow-up analysis depends on initial findings                 │    │
│  │ • Visualization choices depend on data shape                     │    │
│  └─────────────────────────────────────────────────────────────────┘    │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘

The Agent Advantage: Handling the Unknown

Code
┌─────────────────────────────────────────────────────────────────────────┐
│                    AGENT FLEXIBILITY                                     │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  Workflow for "fix bug":            Agent for "fix bug":                 │
│                                                                          │
│  ┌──────────────────────┐          ┌──────────────────────┐             │
│  │ 1. Read error log    │          │ 1. Read error log    │             │
│  │ 2. Find file         │          │ 2. Search codebase   │             │
│  │ 3. Apply fix pattern │          │    (found 3 related  │             │
│  │ 4. Test              │          │     files)           │             │
│  │ 5. Done              │          │ 3. Read main file    │             │
│  └──────────────────────┘          │ 4. Hypothesis: auth  │             │
│                                    │    token expired     │             │
│  What if bug is in                 │ 5. Check token logic │             │
│  unexpected location?              │ 6. Found real issue: │             │
│  What if error log                 │    race condition    │             │
│  is misleading?                    │ 7. Read related file │             │
│  What if fix requires              │ 8. Design fix        │             │
│  architectural change?             │ 9. Apply fix         │             │
│                                    │ 10. Test - fails     │             │
│  ❌ Workflow fails                 │ 11. Investigate more │             │
│                                    │ 12. Fix edge case    │             │
│                                    │ 13. Test - passes    │             │
│                                    │ 14. Done             │             │
│                                    └──────────────────────┘             │
│                                                                          │
│                                    ✅ Agent succeeds                     │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘

Agent Costs to Consider

Agents aren't free. Here's what you're trading for flexibility:

Code
┌─────────────────────────────────────────────────────────────────────────┐
│                      AGENT TRADE-OFFS                                    │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  TOKEN COST                                                              │
│  ──────────                                                              │
│  • Each loop iteration includes full context                             │
│  • Tool results add to context each step                                 │
│  • 10 iterations × 4K context = 40K+ tokens per request                  │
│                                                                          │
│  Typical cost comparison:                                                │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │  Pattern      │ Tokens/Request │ Cost (GPT-4) │ Cost (Claude)   │    │
│  ├───────────────┼────────────────┼──────────────┼─────────────────┤    │
│  │  Single call  │     2,000      │    $0.06     │     $0.04       │    │
│  │  3-step chain │     6,000      │    $0.18     │     $0.12       │    │
│  │  Agent (avg)  │    35,000      │    $1.05     │     $0.70       │    │
│  │  Agent (max)  │   100,000      │    $3.00     │     $2.00       │    │
│  └─────────────────────────────────────────────────────────────────┘    │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────   │
│                                                                          │
│  LATENCY                                                                 │
│  ───────                                                                 │
│  • Each LLM call adds 1-5 seconds                                        │
│  • Tool execution adds variable time                                     │
│  • Cannot parallelize dependent steps                                    │
│                                                                          │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │  Pattern      │ Min Latency │ Typical │ Max Latency              │    │
│  ├───────────────┼─────────────┼─────────┼──────────────────────────┤    │
│  │  Single call  │    0.5s     │   1.5s  │    3s                    │    │
│  │  3-step chain │    1.5s     │   4.5s  │    10s                   │    │
│  │  Agent        │    3s       │   20s   │    120s+                 │    │
│  └─────────────────────────────────────────────────────────────────┘    │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────   │
│                                                                          │
│  PREDICTABILITY                                                          │
│  ──────────────                                                          │
│  • Same input may produce different outputs                              │
│  • Execution path varies by run                                          │
│  • Harder to guarantee SLAs                                              │
│  • Testing requires statistical approaches                               │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────   │
│                                                                          │
│  DEBUGGING                                                               │
│  ─────────                                                               │
│  • Long traces to analyze                                                │
│  • Non-deterministic reproduction                                        │
│  • "Why did it do that?" often unclear                                   │
│  • Requires sophisticated observability                                  │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
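
To make the token arithmetic above concrete, here is a rough sketch. It assumes every iteration resends the whole context and that each step appends a fixed number of tokens, which is a simplification. The per-1K rates used in the example are the ones the cost table's own figures imply ($0.03 and $0.02), not current list prices, and the function names are made up for illustration.

```python
def agent_input_tokens(iterations: int, base_context: int, added_per_step: int) -> int:
    """Total input tokens when each loop iteration resends the full context."""
    total = 0
    context = base_context
    for _ in range(iterations):
        total += context            # the whole context is sent again
        context += added_per_step   # tool results and model output get appended
    return total


def cost_usd(tokens: int, usd_per_1k_tokens: float) -> float:
    """Flat-rate cost estimate; real pricing usually separates input and output."""
    return tokens / 1000 * usd_per_1k_tokens


flat = agent_input_tokens(10, 4_000, 0)       # context never grows
growing = agent_input_tokens(10, 4_000, 500)  # 500 tokens appended per step
print(flat, growing, round(cost_usd(35_000, 0.03), 2))
```

With no growth, 10 iterations over a 4K context total 40,000 input tokens, matching the bullet above. Appending 500 tokens per step raises that to 62,500, which is where the "+" in "40K+" comes from.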

Hybrid Patterns: The Best of Both Worlds

In practice, the strongest systems combine the two: workflows for the predictable parts, agents for the parts that are genuinely open-ended.

Pattern 1: Workflow with Agent Escape Hatch

Start with a workflow, but allow escalation to an agent for edge cases.

Code
┌─────────────────────────────────────────────────────────────────────────┐
│              WORKFLOW WITH AGENT ESCAPE HATCH                            │
│                                                                          │
│                      ┌─────────────────┐                                 │
│                      │   User Input    │                                 │
│                      └────────┬────────┘                                 │
│                               │                                          │
│                               ▼                                          │
│  ┌────────────────────────────────────────────────────────────────────┐ │
│  │                     WORKFLOW LAYER                                  │ │
│  │                                                                     │ │
│  │  ┌─────────┐     ┌─────────┐     ┌─────────┐                       │ │
│  │  │Classify │ ──→ │ Route   │ ──→ │ Handle  │                       │ │
│  │  └─────────┘     └────┬────┘     └────┬────┘                       │ │
│  │                       │               │                             │ │
│  │                       ▼               ▼                             │ │
│  │                  Can handle?     Confident?                         │ │
│  │                       │               │                             │ │
│  │              ┌────────┴────┐    ┌─────┴─────┐                       │ │
│  │              ▼             ▼    ▼           ▼                       │ │
│  │             Yes           No   Yes          No                      │ │
│  │              │             │    │           │                       │ │
│  │              ▼             │    ▼           │                       │ │
│  │         ┌────────┐         │  Output       │                       │ │
│  │         │Response│         │               │                       │ │
│  │         └────────┘         │               │                       │ │
│  │                            │               │                       │ │
│  └────────────────────────────┼───────────────┼───────────────────────┘ │
│                               │               │                          │
│                               └───────┬───────┘                          │
│                                       │                                  │
│                                       ▼                                  │
│  ┌────────────────────────────────────────────────────────────────────┐ │
│  │                      AGENT LAYER                                    │ │
│  │                                                                     │ │
│  │              ┌─────────────────────────────┐                        │ │
│  │              │    Full Agent Loop          │                        │ │
│  │              │    (tools, reasoning,       │                        │ │
│  │              │     multi-step resolution)  │                        │ │
│  │              └─────────────────────────────┘                        │ │
│  │                                                                     │ │
│  └────────────────────────────────────────────────────────────────────┘ │
│                                                                          │
│  Benefits:                                                               │
│  • 90% of requests handled by fast, cheap workflow                       │
│  • 10% of complex cases get full agent treatment                         │
│  • Clear escalation path                                                 │
│  • Cost optimization without sacrificing capability                      │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
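
A minimal sketch of the escape hatch, assuming the workflow reports whether it produced an answer and how confident it is. The `WorkflowResult` type, the 0.8 threshold, and the demo handlers are all hypothetical; in a real system the agent callable would wrap a full tool-using loop.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class WorkflowResult:
    answer: Optional[str]  # None means the workflow could not handle the request
    confidence: float      # 0.0 to 1.0, as reported by the workflow


def handle_request(
    user_input: str,
    workflow: Callable[[str], WorkflowResult],
    agent: Callable[[str], str],
    min_confidence: float = 0.8,  # assumed threshold; tune it on real traffic
) -> Tuple[str, str]:
    """Try the workflow first and escalate to the agent only when needed.

    Returns (answer, handler_name) so escalations can be counted."""
    result = workflow(user_input)
    if result.answer is not None and result.confidence >= min_confidence:
        return result.answer, "workflow"
    return agent(user_input), "agent"


def demo_workflow(text: str) -> WorkflowResult:
    """Stub workflow that only knows one kind of question."""
    if "return policy" in text.lower():
        return WorkflowResult("Returns are accepted within 30 days.", 0.95)
    return WorkflowResult(None, 0.0)


def demo_agent(text: str) -> str:
    """Stub standing in for a full agent loop."""
    return f"[agent] investigated: {text}"


print(handle_request("What is your return policy?", demo_workflow, demo_agent))
print(handle_request("I was charged twice", demo_workflow, demo_agent))
```

Returning the handler name alongside the answer makes it easy to measure how often requests actually escalate, which is the number that determines whether the split pays off.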

Pattern 2: Agent as Workflow Orchestrator

Use an agent to coordinate multiple specialized workflows.

Code
┌─────────────────────────────────────────────────────────────────────────┐
│              AGENT AS WORKFLOW ORCHESTRATOR                              │
│                                                                          │
│                      ┌─────────────────┐                                 │
│                      │  Complex Task   │                                 │
│                      │                 │                                 │
│                      │ "Analyze Q3     │                                 │
│                      │  performance    │                                 │
│                      │  and recommend  │                                 │
│                      │  improvements"  │                                 │
│                      └────────┬────────┘                                 │
│                               │                                          │
│                               ▼                                          │
│  ┌────────────────────────────────────────────────────────────────────┐ │
│  │                    ORCHESTRATOR AGENT                               │ │
│  │                                                                     │ │
│  │   The agent decides WHICH workflows to run and in WHAT ORDER       │ │
│  │                                                                     │ │
│  │   Current plan:                                                     │ │
│  │   1. Run financial analysis workflow ✓                              │ │
│  │   2. Run competitor analysis workflow ✓                             │ │
│  │   3. Run recommendation workflow (in progress)                      │ │
│  │                                                                     │ │
│  └─────────────────────┬───────────────────────────────────────────────┘ │
│                        │                                                 │
│           ┌────────────┼────────────┐                                   │
│           ▼            ▼            ▼                                    │
│  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐                     │
│  │  Financial   │ │  Competitor  │ │Recommendation│                     │
│  │  Analysis    │ │  Analysis    │ │  Generation  │                     │
│  │  WORKFLOW    │ │  WORKFLOW    │ │  WORKFLOW    │                     │
│  │              │ │              │ │              │                     │
│  │ Extract →    │ │ Search →     │ │ Synthesize → │                     │
│  │ Calculate →  │ │ Compare →    │ │ Prioritize → │                     │
│  │ Summarize    │ │ Summarize    │ │ Format       │                     │
│  └──────────────┘ └──────────────┘ └──────────────┘                     │
│                                                                          │
│  Benefits:                                                               │
│  • Agent flexibility for high-level planning                             │
│  • Workflow efficiency for individual tasks                              │
│  • Each workflow is tested and optimized independently                   │
│  • Can add new workflows without changing agent                          │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
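
One way to sketch this pattern is a loop in which a planner picks the next workflow by name from a registry. Everything here is an assumption made for illustration: the `Planner` signature, the registry keys, and the scripted planner, which stands in for an LLM call that would choose the next workflow from the task and the results so far.

```python
from typing import Callable, Dict, Optional

Results = Dict[str, str]
Workflow = Callable[[str, Results], str]
Planner = Callable[[str, Results], Optional[str]]


def run_orchestrator(
    task: str,
    workflows: Dict[str, Workflow],
    planner: Planner,
    max_steps: int = 5,  # keep the orchestrator itself bounded
) -> Results:
    """Ask the planner which workflow to run next until it returns None."""
    results: Results = {}
    for _ in range(max_steps):
        choice = planner(task, results)
        if choice is None:
            break
        if choice not in workflows:
            raise KeyError(f"planner chose unknown workflow: {choice}")
        results[choice] = workflows[choice](task, results)
    return results


# Stub workflows; each would be a tested, deterministic pipeline in practice.
WORKFLOWS: Dict[str, Workflow] = {
    "financial": lambda task, done: "revenue summary",
    "competitor": lambda task, done: "competitor summary",
    "recommendation": lambda task, done: f"recommendations from {sorted(done)}",
}


def scripted_planner(task: str, done: Results) -> Optional[str]:
    """Stand-in for the agent: runs the three workflows in a fixed order."""
    for name in ("financial", "competitor", "recommendation"):
        if name not in done:
            return name
    return None


print(run_orchestrator("Analyze Q3 performance", WORKFLOWS, scripted_planner))
```

Adding a workflow means adding a registry entry; the orchestration loop does not change, which is the last benefit listed in the diagram.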

Pattern 3: Workflow with Embedded Agent Steps

The overall structure stays a fixed workflow, but specific steps are delegated to an agent.

Code
┌─────────────────────────────────────────────────────────────────────────┐
│              WORKFLOW WITH EMBEDDED AGENT STEPS                          │
│                                                                          │
│  ┌─────────┐     ┌─────────┐     ┌─────────┐     ┌─────────┐           │
│  │ Parse   │ ──→ │Research │ ──→ │Validate │ ──→ │ Format  │           │
│  │ Input   │     │ (AGENT) │     │  Data   │     │ Output  │           │
│  │         │     │         │     │         │     │         │           │
│  │ [Code]  │     │ [Agent] │     │ [Code]  │     │ [LLM]   │           │
│  └─────────┘     └─────────┘     └─────────┘     └─────────┘           │
│                        │                                                 │
│                        ▼                                                 │
│               ┌─────────────────┐                                        │
│               │   Agent Loop    │                                        │
│               │                 │                                        │
│               │ • Search web    │                                        │
│               │ • Read sources  │                                        │
│               │ • Verify facts  │                                        │
│               │ • Compile notes │                                        │
│               │                 │                                        │
│               │ (bounded: max   │                                        │
│               │  5 iterations)  │                                        │
│               └─────────────────┘                                        │
│                                                                          │
│  Benefits:                                                               │
│  • Known workflow structure                                              │
│  • Agent used only where needed (research step)                          │
│  • Bounded agent (max iterations)                                        │
│  • Rest of pipeline is deterministic                                     │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
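
The pipeline in the diagram might look like this in code. It is a sketch under stated assumptions: the `search` callable is a hypothetical tool, the agent's "decision" is reduced to stopping when a search returns nothing new, and the final formatting step is plain code here even though the diagram marks it as an LLM step.

```python
from typing import Callable, List

# Hypothetical tool signature: (topic, notes so far) -> new notes.
SearchTool = Callable[[str, List[str]], List[str]]


def research_agent(topic: str, search: SearchTool, max_iterations: int = 5) -> List[str]:
    """Bounded agent step: keep searching until nothing new turns up."""
    notes: List[str] = []
    for _ in range(max_iterations):
        found = search(topic, notes)
        if not found:  # the agent decides it has enough
            break
        notes.extend(found)
    return notes


def run_pipeline(raw_input: str, search: SearchTool) -> str:
    topic = raw_input.strip().lower()              # Parse Input   [code]
    notes = research_agent(topic, search)          # Research      [agent, bounded]
    valid = [n for n in notes if n.strip()]        # Validate Data [code]
    return "\n".join(f"- {n}" for n in valid)      # Format Output (stubbed as code)


def endless_search(topic: str, notes: List[str]) -> List[str]:
    """Worst-case stub: always finds one more note, so only the bound stops it."""
    return [f"{topic} fact {len(notes) + 1}"]


print(run_pipeline("  Solar  ", endless_search))
```

Even with a search tool that never runs dry, the output has exactly five notes, because the iteration cap belongs to the workflow and not to the agent's judgment.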

Pattern 4: Tiered Complexity

Classify each request by complexity, then route it to the cheapest handler that can serve it.

Code
┌─────────────────────────────────────────────────────────────────────────┐
│                    TIERED COMPLEXITY                                     │
│                                                                          │
│                      ┌─────────────────┐                                 │
│                      │   User Query    │                                 │
│                      └────────┬────────┘                                 │
│                               │                                          │
│                               ▼                                          │
│                      ┌─────────────────┐                                 │
│                      │    Classify     │                                 │
│                      │   Complexity    │                                 │
│                      └────────┬────────┘                                 │
│                               │                                          │
│              ┌────────────────┼────────────────┐                         │
│              ▼                ▼                ▼                          │
│                                                                          │
│  ┌────────────────┐  ┌────────────────┐  ┌────────────────┐             │
│  │   TIER 1       │  │   TIER 2       │  │   TIER 3       │             │
│  │   Simple       │  │   Standard     │  │   Complex      │             │
│  │                │  │                │  │                │             │
│  │ Single LLM     │  │ 3-step         │  │ Full Agent     │             │
│  │ call           │  │ workflow       │  │ with tools     │             │
│  │                │  │                │  │                │             │
│  │ ~$0.01         │  │ ~$0.05         │  │ ~$0.50         │             │
│  │ ~1 second      │  │ ~4 seconds     │  │ ~30 seconds    │             │
│  │                │  │                │  │                │             │
│  │ "What's your   │  │ "Help me       │  │ "Debug this    │             │
│  │  return        │  │  write an      │  │  production    │             │
│  │  policy?"      │  │  email to..."  │  │  issue..."     │             │
│  └────────────────┘  └────────────────┘  └────────────────┘             │
│         │                    │                    │                      │
│         └────────────────────┼────────────────────┘                      │
│                              ▼                                           │
│                      ┌─────────────────┐                                 │
│                      │   Response      │                                 │
│                      └─────────────────┘                                 │
│                                                                          │
│  Traffic distribution (typical):                                         │
│  • Tier 1: 70% of requests (cheap and fast)                              │
│  • Tier 2: 25% of requests (moderate cost)                               │
│  • Tier 3: 5% of requests (expensive but necessary)                      │
│                                                                          │
│  Overall cost: Much lower than treating everything as Tier 3             │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
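
The "much lower" claim can be checked with a few lines of arithmetic. The shares and per-request costs below are copied from the diagram; they are illustrative figures, not measurements, and `blended_cost` is a name chosen for this sketch.

```python
from typing import Dict

# Traffic share and approximate cost per request, as given in the diagram.
TIERS: Dict[str, Dict[str, float]] = {
    "tier1": {"share": 0.70, "cost_usd": 0.01},
    "tier2": {"share": 0.25, "cost_usd": 0.05},
    "tier3": {"share": 0.05, "cost_usd": 0.50},
}


def blended_cost(tiers: Dict[str, Dict[str, float]]) -> float:
    """Expected cost per request: sum of share times cost over all tiers."""
    return sum(t["share"] * t["cost_usd"] for t in tiers.values())


tiered = blended_cost(TIERS)
all_agent = TIERS["tier3"]["cost_usd"]
print(f"tiered: ${tiered:.4f}  all-agent: ${all_agent:.2f}  ratio: {all_agent / tiered:.1f}x")
```

With these figures the blended cost is about $0.0445 per request, versus $0.50 if every request went to Tier 3, roughly an 11x difference. The result is sensitive to the Tier 3 share, so it is worth recomputing with your own traffic mix.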

Common Anti-Patterns

Anti-Pattern 1: Over-Agentification

Using an agent when a simple workflow would suffice.

Code
┌─────────────────────────────────────────────────────────────────────────┐
│                    OVER-AGENTIFICATION                                   │
│                                                                          │
│  ❌ BAD: Agent for email classification                                  │
│                                                                          │
│  Agent loop:                                                             │
│  1. Read email                                                           │
│  2. Think about category                                                 │
│  3. Maybe search for similar emails?                                     │
│  4. Think more                                                           │
│  5. Decide category                                                      │
│  6. Double-check decision                                                │
│  7. Done                                                                 │
│                                                                          │
│  Cost: ~15,000 tokens, ~8 seconds                                        │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────   │
│                                                                          │
│  ✅ GOOD: Single LLM call with structured output                         │
│                                                                          │
│  Prompt: "Classify this email into: billing, support, sales, spam"       │
│  Output: { "category": "support", "confidence": 0.94 }                   │
│                                                                          │
│  Cost: ~500 tokens, ~1 second                                            │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────   │
│                                                                          │
│  Signs you're over-agentifying:                                          │
│  • Agent always takes the same steps                                     │
│  • Agent rarely uses more than 1-2 tools                                 │
│  • Task has a clear, predictable structure                               │
│  • You could write down the steps in advance                             │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
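
The single-call alternative could look like the sketch below. It assumes the model is passed in as a plain callable that returns a JSON string; `classify_email` and the stub model are hypothetical. The validation is the part that matters: the code, not the model, decides what counts as an acceptable answer.

```python
import json
from typing import Callable, Dict, Union

CATEGORIES = {"billing", "support", "sales", "spam"}


def classify_email(email: str, llm: Callable[[str], str]) -> Dict[str, Union[str, float]]:
    """One LLM call, then strict validation of the structured output."""
    prompt = (
        "Classify this email into: billing, support, sales, spam.\n"
        'Reply with JSON only, e.g. {"category": "support", "confidence": 0.9}.\n\n'
        f"Email:\n{email}"
    )
    data = json.loads(llm(prompt))  # raises ValueError on malformed JSON
    category = data.get("category")
    if category not in CATEGORIES:
        raise ValueError(f"unexpected category: {category!r}")
    confidence = float(data["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return {"category": category, "confidence": confidence}


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model call; returns the diagram's example output."""
    return '{"category": "support", "confidence": 0.94}'


print(classify_email("My login stopped working.", stub_llm))
```

A caller can treat a `ValueError` as a signal to retry once or to fall back to a default queue, which keeps the failure mode explicit without adding a loop.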

Anti-Pattern 2: Under-Tooled Workflows

Workflows that should use tools but rely entirely on LLM knowledge.

Code
┌─────────────────────────────────────────────────────────────────────────┐
│                    UNDER-TOOLED WORKFLOWS                                │
│                                                                          │
│  ❌ BAD: Generate report from LLM knowledge only                         │
│                                                                          │
│  ┌─────────┐     ┌─────────┐     ┌─────────┐                            │
│  │ Ask LLM │ ──→ │ Ask LLM │ ──→ │ Ask LLM │                            │
│  │  stats  │     │ trends  │     │ format  │                            │
│  └─────────┘     └─────────┘     └─────────┘                            │
│                                                                          │
│  Problem: LLM knowledge is stale, may hallucinate                        │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────   │
│                                                                          │
│  ✅ GOOD: Workflow with tool calls for fresh data                        │
│                                                                          │
│  ┌─────────┐     ┌─────────┐     ┌─────────┐     ┌─────────┐           │
│  │ Query   │ ──→ │ Query   │ ──→ │ LLM     │ ──→ │ Format  │           │
│  │   DB    │     │   API   │     │ Analyze │     │ Output  │           │
│  └─────────┘     └─────────┘     └─────────┘     └─────────┘           │
│   [Tool]          [Tool]          [LLM]           [Code]                │
│                                                                          │
│  Solution: Ground LLM in real data via tools                             │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────   │
│                                                                          │
│  Signs you're under-tooling:                                             │
│  • Workflow produces outdated information                                │
│  • Hallucinations in factual claims                                      │
│  • No connection to real data sources                                    │
│  • Could be improved with retrieval or APIs                              │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
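
A grounded version of the report workflow might be sketched like this. The `sales` table, the `fetch_rate` callable, and the in-memory demo data are assumptions made so the example runs on its own; the point is only that the figures come from tools and the model is asked to interpret them, not to recall them.

```python
import sqlite3
from typing import Callable, List, Tuple


def query_sales(conn: sqlite3.Connection) -> List[Tuple[str, float]]:
    """[Tool] Fresh totals straight from the database."""
    return conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()


def build_report(
    conn: sqlite3.Connection,
    fetch_rate: Callable[[str], float],  # [Tool] hypothetical exchange-rate API
    llm: Callable[[str], str],           # [LLM] analysis step
) -> str:
    totals = query_sales(conn)
    eur_rate = fetch_rate("EUR")
    facts = "\n".join(f"{region}: {total:.2f} USD" for region, total in totals)
    prompt = (
        "Using only the figures below, describe the sales trend.\n"
        f"{facts}\nUSD to EUR rate: {eur_rate}"
    )
    analysis = llm(prompt)
    # [Code] Formatting is deterministic; the numbers never pass through the model.
    return f"Sales by region\n{facts}\n\nAnalysis: {analysis}"


def demo_connection() -> sqlite3.Connection:
    """In-memory database with made-up rows, for demonstration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("APAC", 100.0), ("APAC", 200.0), ("EMEA", 50.0)],
    )
    return conn


print(build_report(demo_connection(), lambda cur: 0.9, lambda p: "APAC leads."))
```

Note that the report prints the queried figures directly and uses the model's text only for the analysis line, so a hallucinated number cannot silently replace a real one.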

Anti-Pattern 3: Unbounded Agents

Agents without proper limits that can spiral out of control.

Code
┌─────────────────────────────────────────────────────────────────────────┐
│                      UNBOUNDED AGENTS                                    │
│                                                                          │
│  ❌ BAD: Agent with no limits                                            │
│                                                                          │
│  • No max iterations → Can loop forever                                  │
│  • No token budget → Can consume unlimited tokens                        │
│  • No timeout → Can run for hours                                        │
│  • No tool restrictions → Can call expensive APIs repeatedly             │
│                                                                          │
│  Real incident: Agent spent $200 in API calls trying to                  │
│  "thoroughly research" a simple question                                 │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────   │
│                                                                          │
│  ✅ GOOD: Agent with proper guardrails                                   │
│                                                                          │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │                     BOUNDED AGENT                                │    │
│  │                                                                  │    │
│  │  Limits:                                                         │    │
│  │  ├── max_iterations: 15                                          │    │
│  │  ├── max_tokens: 50,000                                          │    │
│  │  ├── timeout: 120 seconds                                        │    │
│  │  ├── max_tool_calls: 20                                          │    │
│  │  └── budget: $1.00 per request                                   │    │
│  │                                                                  │    │
│  │  On limit reached:                                               │    │
│  │  ├── Gracefully summarize progress                               │    │
│  │  ├── Return partial results                                      │    │
│  │  └── Log for monitoring                                          │    │
│  │                                                                  │    │
│  └─────────────────────────────────────────────────────────────────┘    │
│                                                                          │
│  Always bound: iterations, tokens, time, cost                            │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
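The guardrails in the diagram can be enforced by a thin wrapper around the agent loop. The following is a minimal sketch, not a definitive implementation: `step` is a hypothetical callable standing in for one agent iteration, and the shape of its return value (`done`, `tokens`, `tool_calls`, `cost_usd`, `note`) is an assumption made for illustration. The default limits mirror the numbers in the diagram.

```python
import logging
import time
from dataclasses import dataclass, field

log = logging.getLogger("bounded_agent")


@dataclass(frozen=True)
class Limits:
    max_iterations: int = 15
    max_tokens: int = 50_000
    timeout_s: float = 120.0
    max_tool_calls: int = 20
    budget_usd: float = 1.00


@dataclass
class RunState:
    iterations: int = 0
    tokens: int = 0
    tool_calls: int = 0
    cost_usd: float = 0.0
    notes: list = field(default_factory=list)


def exceeded(state, limits, elapsed_s):
    """Return the name of the first limit that has been hit, or None."""
    checks = [
        ("max_iterations", state.iterations >= limits.max_iterations),
        ("max_tokens", state.tokens >= limits.max_tokens),
        ("timeout", elapsed_s >= limits.timeout_s),
        ("max_tool_calls", state.tool_calls >= limits.max_tool_calls),
        ("budget", state.cost_usd >= limits.budget_usd),
    ]
    return next((name for name, hit in checks if hit), None)


def run_bounded(step, limits=Limits(), clock=time.monotonic):
    """Call `step` until it reports done or any limit is reached."""
    state = RunState()
    started_at = clock()
    while True:
        reason = exceeded(state, limits, clock() - started_at)
        if reason:
            # Graceful stop: log for monitoring and return partial results.
            log.warning("stopped by %s after %d iterations", reason, state.iterations)
            return {"status": "partial", "stopped_by": reason, "notes": state.notes}
        result = step(state)  # hypothetical: one LLM call plus any tool calls
        state.iterations += 1
        state.tokens += result["tokens"]
        state.tool_calls += result["tool_calls"]
        state.cost_usd += result["cost_usd"]
        state.notes.append(result["note"])
        if result["done"]:
            return {"status": "complete", "stopped_by": None, "notes": state.notes}
```

Checking limits before each iteration, rather than after, means a run never starts work it is not allowed to finish. The `notes` list is where a real system would keep the progress summary to return to the user.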

Anti-Pattern 4: Premature Agent Optimization

Building sophisticated agent infrastructure before validating that the use case needs agents at all.

Code
┌─────────────────────────────────────────────────────────────────────────┐
│                 PREMATURE AGENT OPTIMIZATION                             │
│                                                                          │
│  ❌ BAD: Building sophisticated agent infrastructure before              │
│          proving you need agents at all                                  │
│                                                                          │
│  Week 1: "Let's build a multi-agent system!"                             │
│  Week 4: Complex agent framework with 10 agent types                     │
│  Week 8: Realize 90% of use cases are simple classifications             │
│  Week 12: Rewrite everything as workflows                                │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────   │
│                                                                          │
│  ✅ GOOD: Iterative complexity                                           │
│                                                                          │
│  1. Start with simplest solution (single LLM call)                       │
│          │                                                               │
│          ▼                                                               │
│  2. Identify failure cases                                               │
│          │                                                               │
│          ▼                                                               │
│  3. Add complexity only where needed                                     │
│          │                                                               │
│          ├──→ Most cases: Stay simple                                    │
│          │                                                               │
│          └──→ Edge cases: Add workflow steps or agent                    │
│                                                                          │
│  Principle: Earn complexity through proven need                          │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘

Real-World Architecture Examples

Example 1: Customer Support System

Code
┌─────────────────────────────────────────────────────────────────────────┐
│              CUSTOMER SUPPORT ARCHITECTURE                               │
│                                                                          │
│                      ┌─────────────────┐                                 │
│                      │ Customer Query  │                                 │
│                      └────────┬────────┘                                 │
│                               │                                          │
│                               ▼                                          │
│  ┌─────────────────────────────────────────────────────────────────────┐│
│  │                    ROUTER (Workflow)                                 ││
│  │                                                                      ││
│  │  Intent Classification → Complexity Assessment → Route               ││
│  │                                                                      ││
│  └───────────────────────────┬─────────────────────────────────────────┘│
│                              │                                           │
│         ┌────────────────────┼────────────────────┐                     │
│         ▼                    ▼                    ▼                      │
│  ┌─────────────┐      ┌─────────────┐      ┌─────────────┐              │
│  │    FAQ      │      │   SIMPLE    │      │   COMPLEX   │              │
│  │  (Workflow) │      │  (Workflow) │      │   (Agent)   │              │
│  │             │      │             │      │             │              │
│  │ RAG lookup  │      │ Template +  │      │ Full agent  │              │
│  │ + format    │      │ DB lookup   │      │ with tools  │              │
│  │             │      │             │      │             │              │
│  │ "What's     │      │ "Check my   │      │ "My order   │              │
│  │  your       │      │  order      │      │  shipped to │              │
│  │  return     │      │  status"    │      │  wrong      │              │
│  │  policy?"   │      │             │      │  address    │              │
│  │             │      │             │      │  and I need │              │
│  │ 60% of      │      │ 30% of      │      │  it         │              │
│  │ queries     │      │ queries     │      │  redirected"│              │
│  │             │      │             │      │             │              │
│  │ ~1 second   │      │ ~3 seconds  │      │ 10% of      │              │
│  │ ~$0.01      │      │ ~$0.05      │      │ queries     │              │
│  └─────────────┘      └─────────────┘      │             │              │
│                                            │ ~30 seconds │              │
│                                            │ ~$0.50      │              │
│                                            └─────────────┘              │
│                                                                          │
│  Blended cost: ~$0.07 average (vs $0.50 if all were agents)             │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
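The blended figure is a traffic-weighted average of the three routes. Here is a short sketch of that arithmetic, using the shares and per-query costs from the diagram; the `ROUTES` table and the `blended_cost` helper are names introduced here for illustration.

```python
# Traffic share and per-query cost for each route, read off the diagram.
ROUTES = {
    "faq":     {"share": 0.60, "cost_usd": 0.01},
    "simple":  {"share": 0.30, "cost_usd": 0.05},
    "complex": {"share": 0.10, "cost_usd": 0.50},
}


def blended_cost(routes):
    """Expected cost per query: sum of share * cost over all routes."""
    total_share = sum(r["share"] for r in routes.values())
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError(f"traffic shares must sum to 1, got {total_share}")
    return sum(r["share"] * r["cost_usd"] for r in routes.values())


average = blended_cost(ROUTES)
all_agents = ROUTES["complex"]["cost_usd"]
print(f"blended: ${average:.3f} per query")                  # blended: $0.071 per query
print(f"savings vs all-agent: {all_agents / average:.1f}x")  # savings vs all-agent: 7.0x
```

Most of the blended cost still comes from the 10% of queries that reach the agent, so the routing threshold is the lever worth tuning first.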

Example 2: Document Processing Pipeline

Code
┌─────────────────────────────────────────────────────────────────────────┐
│              DOCUMENT PROCESSING PIPELINE                                │
│                                                                          │
│  This is almost entirely workflows—agents are overkill here.            │
│                                                                          │
│                      ┌─────────────────┐                                 │
│                      │    Document     │                                 │
│                      │    Upload       │                                 │
│                      └────────┬────────┘                                 │
│                               │                                          │
│                               ▼                                          │
│  ┌─────────────────────────────────────────────────────────────────────┐│
│  │                    INGESTION WORKFLOW                                ││
│  │                                                                      ││
│  │  ┌────────┐   ┌────────┐   ┌────────┐   ┌────────┐   ┌────────┐   ││
│  │  │ Parse  │ → │ Clean  │ → │ Chunk  │ → │ Embed  │ → │ Store  │   ││
│  │  │ [Code] │   │ [Code] │   │ [Code] │   │ [API]  │   │ [DB]   │   ││
│  │  └────────┘   └────────┘   └────────┘   └────────┘   └────────┘   ││
│  │                                                                      ││
│  └─────────────────────────────────────────────────────────────────────┘│
│                                                                          │
│                               ▼                                          │
│  ┌─────────────────────────────────────────────────────────────────────┐│
│  │                    EXTRACTION WORKFLOW                               ││
│  │                                                                      ││
│  │  ┌────────┐   ┌────────┐   ┌────────┐   ┌────────┐                 ││
│  │  │Identify│ → │Extract │ → │Validate│ → │ Store  │                 ││
│  │  │ Fields │   │ Values │   │ Schema │   │  Data  │                 ││
│  │  │ [LLM]  │   │ [LLM]  │   │ [Code] │   │ [DB]   │                 ││
│  │  └────────┘   └────────┘   └────────┘   └────────┘                 ││
│  │                                                                      ││
│  └─────────────────────────────────────────────────────────────────────┘│
│                                                                          │
│                               ▼                                          │
│  ┌─────────────────────────────────────────────────────────────────────┐│
│  │                    QUERY WORKFLOW                                    ││
│  │                                                                      ││
│  │  ┌────────┐   ┌────────┐   ┌────────┐   ┌────────┐                 ││
│  │  │Rewrite │ → │Retrieve│ → │ Rank   │ → │Generate│                 ││
│  │  │ Query  │   │  Docs  │   │Results │   │ Answer │                 ││
│  │  │ [LLM]  │   │ [DB]   │   │ [LLM]  │   │ [LLM]  │                 ││
│  │  └────────┘   └────────┘   └────────┘   └────────┘                 ││
│  │                                                                      ││
│  └─────────────────────────────────────────────────────────────────────┘│
│                                                                          │
│  Note: All workflows, no agents. Predictable, testable, fast.           │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
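Because the control flow is fixed, a workflow like the ingestion stage reduces to plain function composition. The sketch below is illustrative only: the chunk sizes are arbitrary, `embed` is a hypothetical stub standing in for an embedding API call, and a dict stands in for the database.

```python
import re


def parse(raw: bytes) -> str:
    """Decode the upload; a real parser would handle PDF, DOCX, and so on."""
    return raw.decode("utf-8", errors="replace")


def clean(text: str) -> str:
    """Collapse runs of whitespace into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def chunk(text: str, size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into fixed-size character windows with some overlap."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]


def embed(chunks: list[str]) -> list[list[float]]:
    """Hypothetical stub: replace with a call to an embedding API."""
    return [[float(len(c))] for c in chunks]


def ingest(raw: bytes, store: dict) -> int:
    """Parse -> Clean -> Chunk -> Embed -> Store, always in this order."""
    chunks = chunk(clean(parse(raw)))
    vectors = embed(chunks)
    for i, (text, vector) in enumerate(zip(chunks, vectors)):
        store[i] = {"text": text, "vector": vector}
    return len(chunks)
```

Each step is an ordinary function, so it can be unit-tested in isolation and a failure points at exactly one stage.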

Example 3: Coding Assistant

Code
┌─────────────────────────────────────────────────────────────────────────┐
│              CODING ASSISTANT ARCHITECTURE                               │
│                                                                          │
│  This legitimately needs agents—coding is exploratory.                  │
│                                                                          │
│                      ┌─────────────────┐                                 │
│                      │  User Request   │                                 │
│                      │                 │                                 │
│                      │ "Fix the bug    │                                 │
│                      │  in user auth"  │                                 │
│                      └────────┬────────┘                                 │
│                               │                                          │
│                               ▼                                          │
│  ┌─────────────────────────────────────────────────────────────────────┐│
│  │                    TRIAGE (Workflow)                                 ││
│  │                                                                      ││
│  │  Classify request → Check permissions → Route                        ││
│  │                                                                      ││
│  │  Simple explanation? → Direct LLM response                           ││
│  │  Code generation?    → Coding agent                                  ││
│  │  Bug fix?            → Debug agent                                   ││
│  │                                                                      ││
│  └───────────────────────────┬─────────────────────────────────────────┘│
│                              │                                           │
│                              ▼                                           │
│  ┌─────────────────────────────────────────────────────────────────────┐│
│  │                    DEBUG AGENT                                       ││
│  │                                                                      ││
│  │  ┌─────────────────────────────────────────────────────────────┐    ││
│  │  │                    AGENT LOOP                                │    ││
│  │  │                                                              │    ││
│  │  │  Tools available:                                            │    ││
│  │  │  • read_file     • search_code   • run_tests                │    ││
│  │  │  • edit_file     • search_docs   • run_linter               │    ││
│  │  │  • list_files    • git_history   • execute_code             │    ││
│  │  │                                                              │    ││
│  │  │  Agent trace:                                                │    ││
│  │  │  1. search_code("auth", "login") → 3 files found            │    ││
│  │  │  2. read_file("auth/login.py") → found suspicious code      │    ││
│  │  │  3. read_file("auth/tokens.py") → found related bug         │    ││
│  │  │  4. git_history("auth/") → recent change introduced bug     │    ││
│  │  │  5. edit_file("auth/login.py", fix) → applied fix           │    ││
│  │  │  6. run_tests("auth/") → 2 tests fail                       │    ││
│  │  │  7. read_file("tests/test_auth.py") → understand failures   │    ││
│  │  │  8. edit_file("auth/login.py", better_fix) → refined fix    │    ││
│  │  │  9. run_tests("auth/") → all pass                           │    ││
│  │  │  10. Done                                                    │    ││
│  │  │                                                              │    ││
│  │  └─────────────────────────────────────────────────────────────┘    ││
│  │                                                                      ││
│  │  This REQUIRES an agent—steps depend on what's found.               ││
│  │                                                                      ││
│  └─────────────────────────────────────────────────────────────────────┘│
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
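The triage stage in front of the agent is itself a workflow: classify, check permissions, route. A minimal sketch follows, assuming a toy keyword classifier (a real system would more likely use one cheap LLM call) and a single hypothetical `can_write` permission flag; the `Route` names are illustrative.

```python
from enum import Enum


class Route(Enum):
    DIRECT_LLM = "direct_llm"
    CODING_AGENT = "coding_agent"
    DEBUG_AGENT = "debug_agent"
    DENIED = "denied"


# Assumption: routes that edit code require write access to the repository.
NEEDS_WRITE = {Route.CODING_AGENT, Route.DEBUG_AGENT}


def classify(request: str) -> Route:
    """Toy keyword classifier; substring matching is deliberately naive."""
    text = request.lower()
    if any(word in text for word in ("fix", "bug", "error", "failing")):
        return Route.DEBUG_AGENT
    if any(word in text for word in ("write", "implement", "generate", "add")):
        return Route.CODING_AGENT
    return Route.DIRECT_LLM


def triage(request: str, can_write: bool) -> Route:
    """Classify the request, then apply the permission check before routing."""
    route = classify(request)
    if route in NEEDS_WRITE and not can_write:
        return Route.DENIED
    return route


print(triage("Fix the bug in user auth", can_write=True))  # Route.DEBUG_AGENT
```

Keeping the permission check in code, outside the agent, means the agent never gets the chance to talk itself past it.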

Cost Economics: The Numbers That Matter

Understanding the real costs helps you make informed decisions. Here's what each pattern actually costs at scale.

Cost Per Request Comparison

Code
┌─────────────────────────────────────────────────────────────────────────┐
│                    COST PER REQUEST (GPT-4o Pricing)                     │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  Pattern              Tokens/Req    Cost/Req    1M Requests/Month       │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  Single LLM Call      1,500         $0.008      $8,000                  │
│  (classification)                                                        │
│                                                                          │
│  Single LLM Call      3,000         $0.015      $15,000                 │
│  (generation)                                                            │
│                                                                          │
│  3-Step Chain         8,000         $0.040      $40,000                 │
│  (sequential)                                                            │
│                                                                          │
│  Router + Handler     5,000         $0.025      $25,000                 │
│  (conditional)                                                           │
│                                                                          │
│  State Machine        12,000        $0.060      $60,000                 │
│  (5 transitions avg)                                                     │
│                                                                          │
│  Agent (simple)       25,000        $0.125      $125,000                │
│  (5 iterations avg)                                                      │
│                                                                          │
│  Agent (complex)      60,000        $0.300      $300,000                │
│  (12 iterations avg)                                                     │
│                                                                          │
│  Multi-Agent          120,000       $0.600      $600,000                │
│  (3 agents coordinating)                                                 │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  Note: Using GPT-4o at $2.50/1M input, $10/1M output (Dec 2024)         │
│  Claude Sonnet is similar. GPT-4o-mini is ~17x cheaper.                 │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
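The per-request figures follow from token counts and per-token prices. The sketch below reproduces them under one assumption that the table does not state: a 2:1 split between input and output tokens, which works out to a blended rate of $5 per 1M tokens at the quoted GPT-4o prices. The function name and the `input_fraction` parameter are introduced here for illustration.

```python
INPUT_USD_PER_M = 2.50    # GPT-4o input price quoted in the table note
OUTPUT_USD_PER_M = 10.00  # GPT-4o output price quoted in the table note


def cost_per_request(total_tokens: int, input_fraction: float = 2 / 3) -> float:
    """Cost of one request, splitting total tokens into input and output.

    The default 2:1 input-to-output split is an assumption chosen because
    it matches the table's figures; pass your own measured ratio.
    """
    input_tokens = total_tokens * input_fraction
    output_tokens = total_tokens - input_tokens
    dollars = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return dollars / 1_000_000


for name, tokens in [
    ("3-step chain", 8_000),
    ("agent (simple)", 25_000),
    ("agent (complex)", 60_000),
]:
    per_request = cost_per_request(tokens)
    monthly = per_request * 1_000_000  # at 1M requests per month
    print(f"{name:16s} ${per_request:.3f}/req  ${monthly:,.0f}/month")
```

Agents re-send their growing context on every iteration, so their real input fraction tends to be higher than 2/3. Measuring it and passing it as `input_fraction` gives a more honest estimate.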

The Hidden Costs

Code
┌─────────────────────────────────────────────────────────────────────────┐
│                      HIDDEN COST FACTORS                                 │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  CONTEXT ACCUMULATION (Agents)                                           │
│  ─────────────────────────────                                           │
│  Each iteration includes ALL previous context:                           │
│                                                                          │
│  Iteration 1:  System + Task              = 2,000 tokens                │
│  Iteration 2:  + Response + Tool Result   = 4,500 tokens                │
│  Iteration 3:  + Response + Tool Result   = 7,000 tokens                │
│  Iteration 4:  + Response + Tool Result   = 9,500 tokens                │
│  ...                                                                     │
│  Iteration 10: + All accumulated          = 25,000 tokens               │
│                                                                          │
│  Total tokens for 10 iterations: ~132,500 (not 25,000!)                 │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  RETRY AND ERROR COSTS                                                   │
│  ─────────────────────                                                   │
│                                                                          │
│  Workflows: Failed step retries only that step                           │
│  Agents: Failed iteration still consumed full context tokens             │
│                                                                          │
│  If 10% of agent runs need 2 extra iterations due to errors:            │
│  Additional cost = 10% × 2 iterations × accumulated context             │
│                  = ~4-9% cost increase                                   │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  TOOL EXECUTION COSTS                                                    │
│  ────────────────────                                                    │
│                                                                          │
│  • External API calls (search, databases) add per-call costs            │
│  • Embedding calls for RAG add ~$0.0001 per query                       │
│  • Compute for code execution, image processing                          │
│  • Agents typically make 3-5x more tool calls than workflows            │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
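The context accumulation effect is easy to check by summing the prompt sizes. This sketch assumes the pattern in the first four rows holds throughout: a 2,000-token starting prompt that grows by a constant 2,500 tokens per iteration. Under that assumption, iteration 10 sends about 24,500 tokens and the run as a whole processes about 132,500. The function names are introduced here for illustration.

```python
def context_sizes(iterations: int, base: int = 2_000, growth: int = 2_500) -> list[int]:
    """Prompt size sent on each iteration, assuming constant growth."""
    return [base + growth * i for i in range(iterations)]


def total_tokens(iterations: int, base: int = 2_000, growth: int = 2_500) -> int:
    """Tokens processed over the whole run: every prompt is re-sent in full."""
    return sum(context_sizes(iterations, base, growth))


sizes = context_sizes(10)
print(sizes[:4], "...", sizes[-1])  # [2000, 4500, 7000, 9500] ... 24500
print(total_tokens(10))             # 132500
print(total_tokens(5))              # 35000
```

The total grows roughly with the square of the iteration count, which is why doubling the iterations from 5 to 10 nearly quadruples the tokens.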

Cost Optimization Strategies

Code
┌─────────────────────────────────────────────────────────────────────────┐
│                    COST OPTIMIZATION TACTICS                             │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  1. MODEL TIERING                                                        │
│     Use cheap models for simple steps, expensive for complex             │
│                                                                          │
│     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐            │
│     │   Classify  │ ──→ │   Process   │ ──→ │  Generate   │            │
│     │  GPT-4o-mini│     │  GPT-4o-mini│     │   GPT-4o    │            │
│     │   $0.001    │     │   $0.002    │     │   $0.010    │            │
│     └─────────────┘     └─────────────┘     └─────────────┘            │
│                                                                          │
│     vs. GPT-4o for all steps: $0.013 vs $0.040 (3x cheaper)             │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  2. CACHING                                                              │
│     Cache identical/similar requests                                     │
│                                                                          │
│     • Exact match cache: 100% token savings on hits                     │
│     • Semantic cache: Reuse similar query results                        │
│     • Tool result cache: Don't re-search same queries                   │
│                                                                          │
│     Typical cache hit rates:                                             │
│     • Customer support: 30-50%                                           │
│     • Search queries: 20-40%                                             │
│     • Code assistance: 10-20%                                            │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  3. CONTEXT PRUNING                                                      │
│     Summarize or drop old context in agents                              │
│                                                                          │
│     Instead of keeping all tool results:                                 │
│     • Summarize after every 3 iterations                                 │
│     • Keep only last N tool results in full                              │
│     • Extract key facts, discard raw output                              │
│                                                                          │
│     Savings: 40-60% on long agent runs                                   │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  4. EARLY TERMINATION                                                    │
│     Detect when agent is done or stuck                                   │
│                                                                          │
│     • Confidence scoring after each step                                 │
│     • Loop detection (same action twice = stuck)                         │
│     • "Good enough" thresholds                                           │
│                                                                          │
│     Average iterations: 12 → 7 with early termination                   │
│     Savings: ~40% on agent costs                                         │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
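Two of these tactics, context pruning and loop detection, fit in a few lines. The sketch below assumes messages are dicts with `role` and `content` keys and that tool results carry the role `"tool"`; both are illustrative conventions, not any particular framework's API. Truncation stands in for a real summarization step.

```python
def prune_context(messages: list[dict], keep_last: int = 3, max_chars: int = 80) -> list[dict]:
    """Keep the last `keep_last` tool results in full; truncate older ones.

    Returns a new list and leaves the input messages untouched.
    """
    tool_positions = [i for i, m in enumerate(messages) if m["role"] == "tool"]
    keep_full = set(tool_positions[-keep_last:]) if keep_last > 0 else set()
    pruned = []
    for i, message in enumerate(messages):
        content = message["content"]
        is_old_tool = message["role"] == "tool" and i not in keep_full
        if is_old_tool and len(content) > max_chars:
            # Stand-in for summarization: keep only the head of the output.
            message = {**message, "content": content[:max_chars] + " [truncated]"}
        pruned.append(message)
    return pruned


def is_stuck(actions: list[tuple]) -> bool:
    """Loop detection: the agent repeated its previous action exactly."""
    return len(actions) >= 2 and actions[-1] == actions[-2]


history = [("search_code", "auth"), ("search_code", "auth")]
print(is_stuck(history))  # True
```

Pruning before each LLM call caps how fast the prompt grows, while `is_stuck` gives the loop a cheap signal to stop or change strategy.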

Latency Analysis: What Users Actually Experience

Cost matters for your budget. Latency matters for your users.

Latency Breakdown by Pattern

Code
┌─────────────────────────────────────────────────────────────────────────┐
│                    LATENCY BY PATTERN                                    │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  Pattern              p50        p95        p99        Max              │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  Single LLM Call      800ms      1.5s       3s         10s              │
│                                                                          │
│  3-Step Chain         2.4s       4.5s       8s         20s              │
│  (sequential)                                                            │
│                                                                          │
│  3-Step Chain         1.2s       2.5s       5s         12s              │
│  (parallel where possible)                                               │
│                                                                          │
│  Router + Handler     1.5s       3s         6s         15s              │
│                                                                          │
│  Agent (5 iter)       8s         15s        30s        60s              │
│                                                                          │
│  Agent (12 iter)      20s        40s        90s        180s             │
│                                                                          │
│  Multi-Agent          30s        60s        120s       300s             │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  Note: Assumes GPT-4o. Claude similar. Local models vary by hardware.   │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘

Where Time Goes

Code
┌─────────────────────────────────────────────────────────────────────────┐
│                    LATENCY BREAKDOWN                                     │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  SINGLE LLM CALL (800ms typical)                                        │
│  ───────────────────────────────                                         │
│                                                                          │
│  ████████████████████████████████████████░░░░░░░░░░░░░░░░░░░░          │
│  │         LLM Generation (650ms)         │  Network (150ms)  │         │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  3-STEP WORKFLOW (2.4s typical)                                         │
│  ──────────────────────────────                                          │
│                                                                          │
│  ████████░░██████████░░████████████████░░░░                              │
│  │ Step 1 ││ Step 2   ││    Step 3      │Network│                       │
│  │ 600ms  ││ 700ms    ││    900ms       │ 200ms │                       │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  AGENT - 5 ITERATIONS (8s typical)                                      │
│  ─────────────────────────────────                                       │
│                                                                          │
│  ████░████░████░████░████████░░░░░░░░░░░░░░░░░░░░░░░░                   │
│  │Iter│Iter│Iter│Iter│  Iter  │     Tool Execution    │                │
│  │ 1  │ 2  │ 3  │ 4  │   5    │       (2.5s)          │                │
│  │0.8s│1.0s│1.2s│1.3s│  1.5s  │                       │                │
│                                                                          │
│  Note: Each iteration is slower because context grows                   │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  BREAKDOWN OF AGENT ITERATION TIME:                                      │
│                                                                          │
│  ┌─────────────────────────┬────────────┬──────────────────────────┐   │
│  │  Component              │  Time      │  % of Total              │   │
│  ├─────────────────────────┼────────────┼──────────────────────────┤   │
│  │  LLM thinking           │  800ms     │  50%                     │   │
│  │  Tool execution         │  500ms     │  31%                     │   │
│  │  Network overhead       │  200ms     │  13%                     │   │
│  │  Parsing/processing     │  100ms     │  6%                      │   │
│  └─────────────────────────┴────────────┴──────────────────────────┘   │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘

Latency Optimization Strategies

Code
┌─────────────────────────────────────────────────────────────────────────┐
│                    LATENCY OPTIMIZATION                                  │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  1. STREAMING                                                            │
│     Show progress as it happens                                          │
│                                                                          │
│     Without streaming: User waits 8s, sees complete response             │
│     With streaming:    User sees first token at 200ms,                   │
│                        content streams over 8s                           │
│                                                                          │
│     Perceived latency: 8000ms → 200ms (40x improvement)                 │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  2. PARALLEL TOOL CALLS                                                  │
│                                                                          │
│     Sequential:     Tool A (500ms) → Tool B (500ms) → Tool C (500ms)    │
│                     Total: 1500ms                                        │
│                                                                          │
│     Parallel:       Tool A ─┐                                            │
│                     Tool B ─┼── All complete in 500ms                   │
│                     Tool C ─┘                                            │
│                     Total: 500ms (3x faster)                             │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  3. SPECULATIVE EXECUTION                                                │
│                                                                          │
│     While waiting for user confirmation:                                 │
│     • Pre-compute likely next steps                                      │
│     • Pre-fetch probable tool results                                    │
│     • Warm up embeddings/retrievals                                      │
│                                                                          │
│     If prediction correct: Near-instant response                         │
│     If prediction wrong: Baseline latency, but wasted compute            │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  4. MODEL SELECTION FOR SPEED                                            │
│                                                                          │
│     ┌─────────────────────────────────────────────────────────────┐    │
│     │  Model              │  Latency (p50)  │  Quality Trade-off  │    │
│     ├─────────────────────┼─────────────────┼─────────────────────│    │
│     │  GPT-4o             │  800ms          │  Best quality       │    │
│     │  GPT-4o-mini        │  400ms          │  Good for routing   │    │
│     │  Claude 3.5 Haiku   │  300ms          │  Fast + capable     │    │
│     │  Groq (Llama 70B)   │  150ms          │  Very fast          │    │
│     │  Local (Llama 8B)   │  100ms          │  Simple tasks only  │    │
│     └─────────────────────┴─────────────────┴─────────────────────┘    │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  5. GRACEFUL DEGRADATION                                                 │
│                                                                          │
│     If latency exceeds threshold:                                        │
│     • Return partial results with "still working..."                     │
│     • Simplify the approach (fewer iterations)                           │
│     • Fall back to cached similar response                               │
│     • Offer to notify when complete                                      │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
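Strategies 2 and 5 above are easy to prototype with Python's standard `asyncio` module. The sketch below is illustrative only: `call_tool` is a hypothetical stand-in that sleeps instead of doing real I/O, and the "still working..." placeholder is one possible degradation policy, not a recommendation for every product.

```python
import asyncio
import time


async def call_tool(name: str, delay_s: float) -> str:
    """Hypothetical tool: sleeps to simulate I/O, then returns a result string."""
    await asyncio.sleep(delay_s)
    return f"{name}: done"


async def run_sequential(tools: dict[str, float]) -> list[str]:
    # Each call waits for the previous one, so total time is the sum of delays.
    return [await call_tool(name, delay) for name, delay in tools.items()]


async def run_parallel(tools: dict[str, float]) -> list[str]:
    # gather() starts all calls together; total time is roughly the slowest call.
    return await asyncio.gather(*(call_tool(n, d) for n, d in tools.items()))


async def run_with_deadline(tools: dict[str, float], deadline_s: float) -> list[str]:
    """Graceful degradation: return a placeholder if the deadline passes."""
    try:
        return await asyncio.wait_for(run_parallel(tools), timeout=deadline_s)
    except asyncio.TimeoutError:
        # wait_for cancels the in-flight calls; a real system might instead
        # let them finish in the background and notify the user later.
        return ["still working..."]


async def timed(coro) -> tuple[list[str], float]:
    """Await a coroutine and return its result with the elapsed seconds."""
    start = time.perf_counter()
    result = await coro
    return result, time.perf_counter() - start
```

Parallel execution only helps when the tool calls are independent. If Tool B needs Tool A's output, the calls have to stay sequential.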

Monitoring: How to Know If You Chose Wrong

You've deployed your system. How do you know if you made the right architecture choice?

Key Metrics to Track

┌─────────────────────────────────────────────────────────────────────────┐
│                    MONITORING DASHBOARD                                  │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  EFFICIENCY METRICS                                                      │
│  ──────────────────                                                      │
│                                                                          │
│  ┌─────────────────────────────────────────────────────────────────┐   │
│  │  Metric                    │  Healthy    │  Investigate         │   │
│  ├────────────────────────────┼─────────────┼──────────────────────│   │
│  │  Avg iterations (agent)    │  3-7        │  >10 consistently    │   │
│  │  Max iterations hit rate   │  <5%        │  >15%                │   │
│  │  Empty tool calls          │  <2%        │  >10%                │   │
│  │  Repeated actions          │  <5%        │  >15%                │   │
│  │  Cost per success          │  Stable     │  Trending up 20%+    │   │
│  └─────────────────────────────────────────────────────────────────┘   │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  QUALITY METRICS                                                         │
│  ───────────────                                                         │
│                                                                          │
│  ┌─────────────────────────────────────────────────────────────────┐   │
│  │  Metric                    │  Healthy    │  Investigate         │   │
│  ├────────────────────────────┼─────────────┼──────────────────────│   │
│  │  Task success rate         │  >85%       │  <70%                │   │
│  │  User satisfaction         │  >4.0/5     │  <3.5/5              │   │
│  │  Escalation rate           │  <10%       │  >25%                │   │
│  │  Retry rate                │  <15%       │  >30%                │   │
│  │  Hallucination rate        │  <5%        │  >15%                │   │
│  └─────────────────────────────────────────────────────────────────┘   │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  OPERATIONAL METRICS                                                     │
│  ───────────────────                                                     │
│                                                                          │
│  ┌─────────────────────────────────────────────────────────────────┐   │
│  │  Metric                    │  Healthy    │  Investigate         │   │
│  ├────────────────────────────┼─────────────┼──────────────────────│   │
│  │  p95 latency               │  <SLA       │  >SLA                │   │
│  │  Error rate                │  <2%        │  >5%                 │   │
│  │  Timeout rate              │  <1%        │  >5%                 │   │
│  │  Cost variance             │  ±20%       │  >50% spikes         │   │
│  │  Tool failure rate         │  <5%        │  >15%                │   │
│  └─────────────────────────────────────────────────────────────────┘   │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
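The dashboard thresholds above can be encoded as data so that a scheduled job flags drifting metrics automatically. This is a minimal sketch covering a subset of the rows. The metric keys are illustrative names, rates are expressed as fractions, and the "watch" label for values that fall between the two columns is an assumption, since the table does not name that band.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    healthy: float        # boundary of the "Healthy" column
    investigate: float    # boundary of the "Investigate" column
    higher_is_better: bool = False


# A subset of the dashboard rows above, with rates expressed as fractions.
THRESHOLDS = {
    "max_iterations_hit_rate": Threshold(0.05, 0.15),
    "empty_tool_calls": Threshold(0.02, 0.10),
    "task_success_rate": Threshold(0.85, 0.70, higher_is_better=True),
    "escalation_rate": Threshold(0.10, 0.25),
    "tool_failure_rate": Threshold(0.05, 0.15),
}


def classify(metric: str, value: float) -> str:
    """Return 'healthy', 'investigate', or 'watch' (the gap between the columns)."""
    t = THRESHOLDS[metric]
    if t.higher_is_better:
        if value > t.healthy:
            return "healthy"
        if value < t.investigate:
            return "investigate"
    else:
        if value < t.healthy:
            return "healthy"
        if value > t.investigate:
            return "investigate"
    return "watch"


def review(snapshot: dict[str, float]) -> dict[str, str]:
    """Classify every metric in a snapshot taken from your monitoring system."""
    return {name: classify(name, value) for name, value in snapshot.items()}
```

For example, `review({"task_success_rate": 0.80, "escalation_rate": 0.30})` marks the first metric as `watch` and the second as `investigate`.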

Signs You Chose Wrong

┌─────────────────────────────────────────────────────────────────────────┐
│              ARCHITECTURE MISMATCH SIGNALS                               │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  YOU BUILT A WORKFLOW BUT SHOULD HAVE BUILT AN AGENT IF:                │
│  ─────────────────────────────────────────────────────────               │
│                                                                          │
│  • High rate of "I don't have enough information" responses             │
│  • Users frequently need to rephrase/retry queries                       │
│  • Edge cases keep requiring new workflow branches                       │
│  • Success rate varies wildly by input type                              │
│  • You're adding if/else branches weekly                                 │
│                                                                          │
│  Pattern in logs:                                                        │
│  ┌────────────────────────────────────────────────────────────────┐    │
│  │  Query: "Find pricing for competitor X and compare to us"       │    │
│  │  Result: FAILED - No handler for comparative analysis           │    │
│  │                                                                  │    │
│  │  Query: "Why did the deployment fail yesterday?"                 │    │
│  │  Result: FAILED - Requires multi-step investigation             │    │
│  └────────────────────────────────────────────────────────────────┘    │
│                                                                          │
│  → Consider: Adding agent escape hatch for complex queries              │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  YOU BUILT AN AGENT BUT SHOULD HAVE BUILT A WORKFLOW IF:                │
│  ─────────────────────────────────────────────────────────               │
│                                                                          │
│  • 90%+ of runs follow the same tool sequence                            │
│  • Agent rarely uses more than 2-3 tools                                 │
│  • Iteration count is consistently low (2-3)                             │
│  • Most "thinking" is repetitive/unnecessary                             │
│  • Cost per query is 10x higher than it needs to be                     │
│                                                                          │
│  Pattern in logs:                                                        │
│  ┌────────────────────────────────────────────────────────────────┐    │
│  │  Run 1: search → format → respond (3 iterations)                │    │
│  │  Run 2: search → format → respond (3 iterations)                │    │
│  │  Run 3: search → format → respond (3 iterations)                │    │
│  │  Run 4: search → format → respond (3 iterations)                │    │
│  │  ...                                                             │    │
│  │  (Same pattern in 94% of runs)                                   │    │
│  └────────────────────────────────────────────────────────────────┘    │
│                                                                          │
│  → Consider: Converting to 3-step workflow (3x cheaper, 2x faster)     │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  YOU'RE OVER-ENGINEERING IF:                                             │
│  ──────────────────────────                                              │
│                                                                          │
│  • Built multi-agent but tasks don't require handoffs                   │
│  • Most "routing" goes to one handler (>80%)                            │
│  • Complexity adds latency without improving success rate               │
│  • Team spends more time debugging orchestration than improving core    │
│                                                                          │
│  → Consider: Simplifying to single agent or workflow                    │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
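The "same tool sequence in 90%+ of runs" signal is simple to measure once each run's tool calls are logged as an ordered list. The sketch below assumes that log shape. The function names are illustrative, and the sample data is made up to mirror the 94% pattern in the diagram.

```python
from collections import Counter


def dominant_sequence(runs: list[list[str]]) -> tuple[tuple[str, ...], float]:
    """Return the most common tool sequence and the fraction of runs that follow it."""
    if not runs:
        raise ValueError("need at least one run")
    counts = Counter(tuple(run) for run in runs)
    sequence, hits = counts.most_common(1)[0]
    return sequence, hits / len(runs)


def looks_like_workflow(runs: list[list[str]], threshold: float = 0.90) -> bool:
    # Mirrors the "90%+ of runs follow the same tool sequence" signal.
    _, share = dominant_sequence(runs)
    return share >= threshold


# Hypothetical log sample: 94 identical runs and 6 varied ones.
runs = [["search", "format", "respond"]] * 94 + [["search", "search", "respond"]] * 6
```

With this sample, `looks_like_workflow(runs)` returns `True`, which suggests the agent is replaying a fixed pipeline and a workflow is worth testing.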

Continuous Optimization Loop

┌─────────────────────────────────────────────────────────────────────────┐
│              OPTIMIZATION FEEDBACK LOOP                                  │
│                                                                          │
│                         ┌─────────────┐                                  │
│                         │   Deploy    │                                  │
│                         └──────┬──────┘                                  │
│                                │                                         │
│                                ▼                                         │
│                         ┌─────────────┐                                  │
│         ┌───────────────│   Monitor   │───────────────┐                 │
│         │               └──────┬──────┘               │                 │
│         │                      │                      │                 │
│         ▼                      ▼                      ▼                 │
│  ┌─────────────┐        ┌─────────────┐       ┌─────────────┐          │
│  │Cost spikes? │        │Quality drop?│       │Latency SLA? │          │
│  └──────┬──────┘        └──────┬──────┘       └──────┬──────┘          │
│         │                      │                      │                 │
│         ▼                      ▼                      ▼                 │
│  ┌─────────────┐        ┌─────────────┐       ┌─────────────┐          │
│  │ Analyze     │        │ Analyze     │       │ Analyze     │          │
│  │ token usage │        │ failure     │       │ bottlenecks │          │
│  │ patterns    │        │ patterns    │       │             │          │
│  └──────┬──────┘        └──────┬──────┘       └──────┬──────┘          │
│         │                      │                      │                 │
│         └──────────────────────┼──────────────────────┘                 │
│                                │                                         │
│                                ▼                                         │
│                         ┌─────────────┐                                  │
│                         │  Decision   │                                  │
│                         └──────┬──────┘                                  │
│                                │                                         │
│         ┌──────────────────────┼──────────────────────┐                 │
│         ▼                      ▼                      ▼                 │
│  ┌─────────────┐        ┌─────────────┐       ┌─────────────┐          │
│  │  Simplify   │        │   Enhance   │       │  Optimize   │          │
│  │  (workflow) │        │   (agent)   │       │  (hybrid)   │          │
│  └─────────────┘        └─────────────┘       └─────────────┘          │
│                                                                          │
│  Weekly review cadence recommended for active systems                   │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘

Migration Strategies

Sometimes you need to change approaches. Here's how to migrate safely.

Workflow → Agent Migration

When your workflow can't handle the complexity anymore:

┌─────────────────────────────────────────────────────────────────────────┐
│              WORKFLOW → AGENT MIGRATION                                  │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  PHASE 1: IDENTIFY CANDIDATES                                            │
│  ────────────────────────────                                            │
│                                                                          │
│  Look for:                                                               │
│  • Steps that frequently fail or need retries                            │
│  • Branches that keep multiplying                                        │
│  • User complaints about "it doesn't understand me"                      │
│  • High variance in what the step needs to do                            │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  PHASE 2: HYBRID FIRST                                                   │
│  ─────────────────────                                                   │
│                                                                          │
│  Don't replace the whole workflow. Add agent escape hatches:            │
│                                                                          │
│  BEFORE:                                                                 │
│  ┌─────────┐     ┌─────────┐     ┌─────────┐                            │
│  │ Step 1  │ ──→ │ Step 2  │ ──→ │ Step 3  │                            │
│  └─────────┘     └─────────┘     └─────────┘                            │
│                       │                                                  │
│                       ▼ (fails 20% of time)                             │
│                     ERROR                                                │
│                                                                          │
│  AFTER:                                                                  │
│  ┌─────────┐     ┌─────────┐     ┌─────────┐                            │
│  │ Step 1  │ ──→ │ Step 2  │ ──→ │ Step 3  │                            │
│  └─────────┘     └────┬────┘     └─────────┘                            │
│                       │                                                  │
│                       ▼ (if confidence < 0.8)                           │
│                  ┌─────────┐                                             │
│                  │  Agent  │ (handles edge cases)                       │
│                  │ Fallback│                                             │
│                  └─────────┘                                             │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  PHASE 3: GRADUAL PROMOTION                                              │
│  ──────────────────────────                                              │
│                                                                          │
│  Week 1: 5% traffic to agent path (shadow mode)                         │
│  Week 2: Compare metrics, fix issues                                     │
│  Week 3: 20% traffic to agent path                                       │
│  Week 4: 50% traffic                                                     │
│  Week 5: 100% traffic to agent                                           │
│                                                                          │
│  Rollback triggers:                                                      │
│  • Success rate drops >10%                                               │
│  • p95 latency exceeds 2x baseline                                       │
│  • Cost exceeds 3x baseline                                              │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
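Phases 2 and 3 can be sketched in a few lines of plain Python. Everything here is illustrative: `StepResult`, `Metrics`, and both function names are hypothetical, the step is assumed to report a confidence score between 0 and 1, and "success rate drops >10%" is read as a drop of more than 10 percentage points.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StepResult:
    output: str
    confidence: float  # assumed to be a score in [0, 1] reported by the step


def run_with_escape_hatch(
    query: str,
    workflow_step: Callable[[str], StepResult],
    agent_fallback: Callable[[str], str],
    min_confidence: float = 0.8,
) -> tuple[str, str]:
    """Run the workflow step; hand off to the agent when confidence is low."""
    result = workflow_step(query)
    if result.confidence < min_confidence:
        return agent_fallback(query), "agent"
    return result.output, "workflow"


@dataclass
class Metrics:
    success_rate: float    # fraction of successful runs
    p95_latency_ms: float
    cost_per_query: float


def should_rollback(baseline: Metrics, current: Metrics) -> list[str]:
    """Return the rollback triggers that fired (an empty list means keep going)."""
    fired = []
    # Assumption: "drops >10%" means more than 10 percentage points.
    if baseline.success_rate - current.success_rate > 0.10:
        fired.append("success_rate")
    if current.p95_latency_ms > 2 * baseline.p95_latency_ms:
        fired.append("p95_latency")
    if current.cost_per_query > 3 * baseline.cost_per_query:
        fired.append("cost")
    return fired
```

Returning which path handled each query ("workflow" or "agent") makes it easy to track how often the escape hatch fires, which in turn tells you whether the agent path deserves a larger share of traffic.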

Agent → Workflow Migration

When you've over-engineered and need to simplify:

┌─────────────────────────────────────────────────────────────────────────┐
│              AGENT → WORKFLOW MIGRATION                                  │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  PHASE 1: ANALYZE TRACES                                                 │
│  ───────────────────────                                                 │
│                                                                          │
│  Collect 1000+ agent traces and analyze:                                │
│                                                                          │
│  • What tool sequences are most common?                                  │
│  • What % of runs follow predictable patterns?                           │
│  • Where does the agent actually need to "think"?                        │
│                                                                          │
│  Example analysis:                                                       │
│  ┌────────────────────────────────────────────────────────────────┐    │
│  │  Pattern                              │  Frequency             │    │
│  ├───────────────────────────────────────┼────────────────────────│    │
│  │  search → read → respond              │  45%                   │    │
│  │  search → search → read → respond     │  25%                   │    │
│  │  read → respond                       │  15%                   │    │
│  │  (complex/varied)                     │  15%                   │    │
│  └────────────────────────────────────────────────────────────────┘    │
│                                                                          │
│  85% follows 3 patterns → Strong workflow candidate                     │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  PHASE 2: BUILD PARALLEL WORKFLOW                                        │
│  ────────────────────────────────                                        │
│                                                                          │
│  Create workflow that handles the common patterns:                       │
│                                                                          │
│  ┌─────────────────────────────────────────────────────────────────┐   │
│  │                                                                  │   │
│  │                    ┌──────────────┐                              │   │
│  │                    │   Classify   │                              │   │
│  │                    │    Query     │                              │   │
│  │                    └───────┬──────┘                              │   │
│  │                            │                                     │   │
│  │          ┌─────────────────┼─────────────────┐                  │   │
│  │          ▼                 ▼                 ▼                   │   │
│  │    ┌──────────┐     ┌──────────┐     ┌──────────┐               │   │
│  │    │ Pattern  │     │ Pattern  │     │  Send to │               │   │
│  │    │    A     │     │  B + C   │     │  Agent   │               │   │
│  │    │ (45%)    │     │ (40%)    │     │  (15%)   │               │   │
│  │    └──────────┘     └──────────┘     └──────────┘               │   │
│  │                                                                  │   │
│  └─────────────────────────────────────────────────────────────────┘   │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  PHASE 3: SHADOW TESTING                                                 │
│  ───────────────────────                                                 │
│                                                                          │
│  Run both in parallel, compare outputs:                                  │
│                                                                          │
│  Request ──┬──→ Agent (current) ───────→ Return to user                │
│            │                                                             │
│            └──→ Workflow (shadow) ──→ Log & compare                     │
│                                                                          │
│  Measure:                                                                │
│  • Output equivalence rate                                               │
│  • Cost difference                                                       │
│  • Latency difference                                                    │
│  • Cases where workflow fails but agent succeeds                         │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  PHASE 4: GRADUAL CUTOVER                                                │
│  ────────────────────────                                                │
│                                                                          │
│  Same gradual rollout as before (5% → 20% → 50% → 100%)                │
│  Keep agent as fallback for the 15% complex cases                       │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
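The shadow-testing phase can be prototyped with a small wrapper that always returns the agent's answer and records how the workflow would have done. This is a sketch under simplifying assumptions: `ShadowLog` and `handle_request` are hypothetical names, exact string equality stands in for "output equivalence" (real systems usually need a fuzzier comparison), and the shadow call runs inline rather than in the background.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ShadowLog:
    total: int = 0
    matches: int = 0
    workflow_errors: int = 0
    mismatches: list[tuple[str, str, str]] = field(default_factory=list)

    @property
    def equivalence_rate(self) -> float:
        return self.matches / self.total if self.total else 0.0


def handle_request(
    request: str,
    agent: Callable[[str], str],
    workflow: Callable[[str], str],
    log: ShadowLog,
) -> str:
    """Serve the agent's answer; run the workflow in shadow and record the comparison."""
    agent_answer = agent(request)
    log.total += 1
    try:
        shadow_answer = workflow(request)
    except Exception:
        # A shadow failure must never affect the user-facing response.
        log.workflow_errors += 1
        return agent_answer
    # Assumption: exact string equality stands in for "output equivalence".
    if shadow_answer == agent_answer:
        log.matches += 1
    else:
        log.mismatches.append((request, agent_answer, shadow_answer))
    return agent_answer
```

In production the shadow call would normally run asynchronously so it adds no user-facing latency. The `mismatches` list is where to look for cases the workflow gets wrong but the agent handles.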

Framework Selection Guide

Frameworks vary in how well they support workflows versus agents.

┌─────────────────────────────────────────────────────────────────────────┐
│                    FRAMEWORK COMPARISON                                  │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  ┌───────────────────────────────────────────────────────────────────┐ │
│  │  Framework      │ Workflows │ Agents │ Best For                   │ │
│  ├─────────────────┼───────────┼────────┼────────────────────────────│ │
│  │                 │           │        │                            │ │
│  │  LangChain      │    ★★★    │  ★★☆   │ Quick prototypes, many     │ │
│  │  (LCEL)         │           │        │ integrations               │ │
│  │                 │           │        │                            │ │
│  │  LangGraph      │    ★★★    │  ★★★   │ Complex state machines,    │ │
│  │                 │           │        │ cycles, human-in-loop      │ │
│  │                 │           │        │                            │ │
│  │  LlamaIndex     │    ★★☆    │  ★★☆   │ RAG-heavy applications,    │ │
│  │                 │           │        │ document processing        │ │
│  │                 │           │        │                            │ │
│  │  CrewAI         │    ★☆☆    │  ★★★   │ Multi-agent systems,       │ │
│  │                 │           │        │ role-based agents          │ │
│  │                 │           │        │                            │ │
│  │  AutoGen        │    ★☆☆    │  ★★★   │ Conversational agents,     │ │
│  │                 │           │        │ agent collaboration        │ │
│  │                 │           │        │                            │ │
│  │  Semantic       │    ★★★    │  ★★☆   │ Enterprise, .NET/C#,       │ │
│  │  Kernel         │           │        │ Microsoft ecosystem        │ │
│  │                 │           │        │                            │ │
│  │  Haystack       │    ★★★    │  ★☆☆   │ Production RAG,            │ │
│  │                 │           │        │ pipeline-first             │ │
│  │                 │           │        │                            │ │
│  │  Custom         │    ★★★    │  ★★★   │ Full control, minimal      │ │
│  │  (no framework) │           │        │ dependencies               │ │
│  │                 │           │        │                            │ │
│  └───────────────────────────────────────────────────────────────────┘ │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  DECISION GUIDE:                                                         │
│                                                                          │
│  "I need a simple 3-step workflow"                                       │
│  → LangChain LCEL or custom code                                        │
│                                                                          │
│  "I need workflows with complex branching and state"                     │
│  → LangGraph                                                             │
│                                                                          │
│  "I need a ReAct agent with tools"                                       │
│  → LangGraph, CrewAI, or custom                                         │
│                                                                          │
│  "I need multiple agents working together"                               │
│  → CrewAI, AutoGen, or LangGraph                                        │
│                                                                          │
│  "I need production RAG pipelines"                                       │
│  → LlamaIndex, Haystack                                                 │
│                                                                          │
│  "I need full control and minimal magic"                                │
│  → Custom implementation                                                 │
│                                                                          │
│  "I'm in a Microsoft/enterprise environment"                             │
│  → Semantic Kernel                                                       │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
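For the "simple 3-step workflow" case, the custom-code option in the decision guide can be very small. The sketch below is one possible shape, not a prescribed design: the step functions and the lookup data are hypothetical, and in a real system the first and last steps would likely call an LLM.

```python
from typing import Callable

Step = Callable[[str], str]


def run_chain(steps: list[Step], user_input: str) -> str:
    """Run each step in order, feeding the previous output into the next step."""
    result = user_input
    for step in steps:
        result = step(result)
    return result


# Hypothetical steps standing in for LLM calls and a data lookup.
def extract_topic(text: str) -> str:
    return text.strip().rstrip("?").lower()


def look_up(topic: str) -> str:
    facts = {"refund policy": "Refunds are available within 30 days."}
    return facts.get(topic, "No entry found.")


def format_answer(fact: str) -> str:
    return f"Answer: {fact}"


answer = run_chain([extract_topic, look_up, format_answer], "  Refund policy? ")
```

Because the control flow is an ordinary loop, each step can be unit-tested on its own and a failure points at exactly one function.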

Framework-Specific Patterns

┌─────────────────────────────────────────────────────────────────────────┐
│                    PATTERN IMPLEMENTATION BY FRAMEWORK                   │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  SIMPLE CHAIN                                                            │
│  ────────────                                                            │
│                                                                          │
│  LangChain:  prompt | llm | parser                                      │
│  LangGraph:  StateGraph with linear edges                               │
│  Custom:     for step in steps: result = step(result)                   │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  ROUTER                                                                  │
│  ──────                                                                  │
│                                                                          │
│  LangChain:  RunnableBranch with conditions                             │
│  LangGraph:  Conditional edges based on state                           │
│  Custom:     if/elif with classification step                           │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  REACT AGENT                                                             │
│  ───────────                                                             │
│                                                                          │
│  LangChain:  create_react_agent()                                       │
│  LangGraph:  prebuilt.create_react_agent() or custom graph              │
│  CrewAI:     Agent with tools                                           │
│  Custom:     While loop with tool execution                             │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  HUMAN-IN-THE-LOOP                                                       │
│  ─────────────────                                                       │
│                                                                          │
│  LangChain:  Callbacks (limited)                                        │
│  LangGraph:  interrupt() + Command pattern (native support)             │
│  Custom:     State persistence + resume logic                           │
│                                                                          │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                          │
│  MULTI-AGENT                                                             │
│  ───────────                                                             │
│                                                                          │
│  LangChain:  Manual orchestration                                       │
│  LangGraph:  Subgraphs + supervisor pattern                             │
│  CrewAI:     Crew with multiple Agents (native)                         │
│  AutoGen:    GroupChat, native multi-agent                              │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
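The "Custom" rows for the simple chain and router patterns can be sketched in plain Python. This is a minimal illustration rather than a reference implementation: `run_chain`, `route`, `fake_summarize`, and `fake_classify` are hypothetical names, and the two `fake_*` functions stand in for real LLM calls so the example runs offline. The router uses a dictionary lookup, which behaves the same as an if/elif ladder over the classification label.

```python
from typing import Callable, Dict, List

Step = Callable[[str], str]


def run_chain(steps: List[Step], initial: str) -> str:
    """Simple chain: each step receives the previous step's output."""
    result = initial
    for step in steps:
        result = step(result)
    return result


def route(
    text: str,
    classify: Callable[[str], str],
    handlers: Dict[str, Step],
    default: Step,
) -> str:
    """Router: one classification step, then code (not the LLM) picks the branch."""
    label = classify(text)
    handler = handlers.get(label, default)
    return handler(text)


# Hypothetical stand-ins for LLM calls (assumptions for illustration only).
def fake_summarize(text: str) -> str:
    # Keeps only the first sentence.
    return text.split(".")[0].strip() + "."


def fake_classify(text: str) -> str:
    return "billing" if "invoice" in text.lower() else "general"


chained = run_chain([str.strip, fake_summarize, str.upper], "  First point. Second point.  ")
print(chained)  # FIRST POINT.

routed = route(
    "Where is my invoice?",
    fake_classify,
    handlers={"billing": lambda t: "billing-team: " + t},
    default=lambda t: "general-team: " + t,
)
print(routed)  # billing-team: Where is my invoice?
```

In both cases the control flow lives in ordinary code, so every path can be unit-tested without calling a model.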

Implementation Checklist

When designing your system, walk through this checklist:

Code
┌─────────────────────────────────────────────────────────────────────────┐
│                    IMPLEMENTATION CHECKLIST                              │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  STEP 1: Understand Your Task                                            │
│  ─────────────────────────────                                           │
│  □ Can I enumerate the steps needed?                                     │
│  □ Do the steps change based on intermediate results?                    │
│  □ Is this task exploratory or procedural?                               │
│  □ What's the worst-case number of steps?                                │
│                                                                          │
│  STEP 2: Assess Your Constraints                                         │
│  ──────────────────────────────                                          │
│  □ What's my latency budget?                                             │
│  □ What's my cost budget per request?                                    │
│  □ Do I need deterministic behavior?                                     │
│  □ What are my compliance/audit requirements?                            │
│                                                                          │
│  STEP 3: Start Simple                                                    │
│  ───────────────────                                                     │
│  □ Try single LLM call first                                             │
│  □ Identify where it fails                                               │
│  □ Add complexity only for failure cases                                 │
│  □ Document why each step exists                                         │
│                                                                          │
│  STEP 4: Add Guardrails                                                  │
│  ─────────────────────                                                   │
│  □ Set max iterations for any loops                                      │
│  □ Set token/cost budgets                                                │
│  □ Add timeouts                                                          │
│  □ Plan graceful degradation                                             │
│                                                                          │
│  STEP 5: Design for Observability                                        │
│  ────────────────────────────────                                        │
│  □ Log all LLM calls with inputs/outputs                                 │
│  □ Track tokens, latency, costs                                          │
│  □ Trace execution path                                                  │
│  □ Alert on anomalies (high iteration counts, etc.)                      │
│                                                                          │
│  STEP 6: Plan for Evolution                                              │
│  ──────────────────────────                                              │
│  □ Make it easy to promote workflow steps to agents                      │
│  □ Make it easy to demote agent tasks to workflows                       │
│  □ Track which requests actually need agent flexibility                  │
│  □ Regularly review and simplify                                         │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
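Steps 4 and 5 of the checklist can be combined in one small loop. The sketch below is one possible shape, not a definitive implementation: `Limits`, `run_bounded_agent`, and the `decide` callback are hypothetical names, `decide` stands in for an LLM call that returns a tool name (or `None` for a final answer), a payload, and the tokens it consumed, and the default limits are placeholder values.

```python
import logging
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

logger = logging.getLogger("agent")

# (tool_name, payload, tokens_used); tool_name None means payload is the final answer.
Decision = Tuple[Optional[str], str, int]


@dataclass
class Limits:
    max_iterations: int = 5    # placeholder values, tune per use case
    max_tokens: int = 4000
    timeout_s: float = 30.0


def run_bounded_agent(
    task: str,
    decide: Callable[[str], Decision],
    tools: Dict[str, Callable[[str], str]],
    limits: Optional[Limits] = None,
) -> Tuple[str, str]:
    """Run an agent loop that always stops; returns (status, last result)."""
    limits = limits or Limits()
    started = time.monotonic()
    tokens = 0
    observation = task
    for iteration in range(1, limits.max_iterations + 1):
        # Checked between steps only; it does not interrupt an in-flight call.
        if time.monotonic() - started > limits.timeout_s:
            return "timeout", observation
        tool_name, payload, used = decide(observation)
        tokens += used
        logger.info("iteration=%d tool=%s tokens=%d", iteration, tool_name, tokens)
        if tokens > limits.max_tokens:
            return "token_budget_exceeded", observation
        if tool_name is None:
            return "ok", payload
        tool = tools.get(tool_name)
        # Feed unknown-tool errors back instead of crashing the loop.
        observation = tool(payload) if tool else f"unknown tool: {tool_name}"
    return "max_iterations", observation
```

Returning a status alongside the last observation leaves the degradation policy to the caller: on anything other than `"ok"` it can fall back to a fixed workflow, return a partial answer, or hand off to a human.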

Summary: The Right Tool for the Job

Code
┌─────────────────────────────────────────────────────────────────────────┐
│                         KEY TAKEAWAYS                                    │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  1. DEFAULT TO WORKFLOWS                                                 │
│     Most production AI systems should be workflows.                      │
│     They're cheaper, faster, and more predictable.                       │
│                                                                          │
│  2. USE AGENTS FOR THE UNKNOWN                                           │
│     Agents shine when you genuinely don't know the                       │
│     steps in advance. Research, debugging, exploration.                  │
│                                                                          │
│  3. HYBRID IS USUALLY BEST                                               │
│     Workflow with agent escape hatches, or agent                         │
│     orchestrating workflows. Get the best of both.                       │
│                                                                          │
│  4. EARN COMPLEXITY                                                      │
│     Start simple. Add complexity only when you have                      │
│     evidence that simpler approaches fail.                               │
│                                                                          │
│  5. ALWAYS BOUND AGENTS                                                  │
│     Max iterations, token budgets, timeouts, cost limits.                │
│     Unbounded agents will eventually surprise you.                       │
│                                                                          │
│  6. MEASURE AND ITERATE                                                  │
│     Track what percentage of requests actually need                      │
│     agent flexibility. Optimize the common path.                         │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘
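Takeaways 3 and 6 fit together naturally: a workflow handles the common path, an agent acts as the escape hatch, and a counter records how often the hatch is used. A minimal sketch follows, assuming the workflow signals "I can't handle this" by returning `None`; `EscalationStats` and `handle` are hypothetical names, and the lambdas in the usage lines stand in for a real workflow and a real agent.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class EscalationStats:
    total: int = 0
    escalated: int = 0

    @property
    def escalation_rate(self) -> float:
        """Fraction of requests that needed the agent (0.0 if none seen yet)."""
        return self.escalated / self.total if self.total else 0.0


def handle(
    request: str,
    workflow: Callable[[str], Optional[str]],
    agent: Callable[[str], str],
    stats: EscalationStats,
) -> str:
    """Try the workflow first; fall back to the agent only when it returns None."""
    stats.total += 1
    answer = workflow(request)
    if answer is not None:
        return answer
    stats.escalated += 1
    return agent(request)


# Hypothetical stand-ins for a real workflow and a real (bounded) agent.
stats = EscalationStats()
workflow = lambda r: "See the refund policy." if "refund" in r else None
agent = lambda r: "agent handled: " + r

handle("refund for order 12", workflow, agent, stats)
handle("something unusual", workflow, agent, stats)
print(stats.escalation_rate)  # 0.5
```

If the escalation rate stays low, the workflow is doing its job; if it climbs, the escalated requests show which cases may deserve their own workflow branch.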

The goal isn't to use the most sophisticated architecture—it's to use the simplest architecture that solves your problem reliably.


References & Further Reading

Foundational Resources

  • Building Effective Agents - Anthropic's guide that popularized the workflow vs agent distinction. Essential reading.
  • What is Agentic AI? - LangChain's overview of agentic patterns and when to use them.
  • ReAct: Synergizing Reasoning and Acting in Language Models - The original paper introducing the ReAct pattern.

Framework Documentation

  • LangGraph Conceptual Guide - Understanding state machines and agent loops in LangGraph.
  • LlamaIndex Workflows - LlamaIndex's approach to event-driven workflows.
  • CrewAI Documentation - Multi-agent orchestration patterns.
  • AutoGen - Microsoft's multi-agent conversation framework.

Research & Deep Dives

  • The Landscape of Emerging AI Agent Architectures - Academic survey of agent design patterns.
  • Cognitive Architectures for Language Agents - CoALA framework for understanding agent designs.
  • Reflexion: Language Agents with Verbal Reinforcement Learning - Self-correction patterns for agents.
  • Tree of Thoughts - Deliberate problem-solving with LLMs.

Practical Guides

  • OpenAI Function Calling Guide - Tool use fundamentals.
  • Anthropic Tool Use Guide - Claude's approach to tool calling.
  • Prompt Caching - Reducing costs in multi-turn agent interactions.

Enrico Piovano, PhD

Co-founder & CTO at Goji AI. Former Applied Scientist at Amazon (Alexa & AGI), focused on Agentic AI and LLMs. PhD in Electrical Engineering from Imperial College London. Gold Medalist at the National Mathematical Olympiad.
