Advanced Prompt Engineering Patterns for Multi-Agent Workflows

I've built dozens of AI agents. Most of them fail in production not because the models aren't capable, but because I didn't architect the prompts correctly for multi-agent coordination.

The difference between a demo agent and a production agent swarm comes down to one thing: how you engineer the prompts to pass work between specialists.

This post covers the patterns I've developed for building AI agents that actually work together.

Why Standard Prompt Engineering Breaks at Scale

When you're working with a single agent, prompt engineering is straightforward: give it context, tell it what to do, parse the output.

Multi-agent workflows are different. You're no longer just writing prompts; you're designing the communication protocol between specialized workers, and that protocol becomes the primary technical challenge.

The problems emerge quickly:

  • Agent A finishes its work but Agent B doesn't know what changed
  • Agent C receives context from A and B but can't figure out which instruction applies
  • One agent fails silently, and the whole chain stalls
  • You end up with duplicate work because agents don't know what others have done

Standard prompt engineering patterns don't handle this. You need something more structured. As I explore in Why Prompt Engineering Won't Fix Your AI Agent Architecture, prompting alone has limits—architecture matters more.

Pattern 1: Hand-Off Prompts for Agent-to-Agent Communication

The core pattern I use is what I call hand-off prompts. Instead of each agent operating independently, each agent ends its turn by producing a structured payload designed for the next specialist to consume. For example, a Researcher Agent passes structured JSON data to a Creative Director Agent.

Here's how it works in practice:

Agent A (Research) completes its work and creates a structured output:

{
  "task_id": "market-analysis-001",
  "findings": {
    "market_size": "$2.4B",
    "growth_rate": "18% YoY",
    "key_players": ["Company X", "Company Y"],
    "gaps": ["Low-cost segment", "Enterprise features"]
  },
  "confidence_score": 0.92,
  "sources_consulted": 3,
  "next_agent": "strategy_planner",
  "handoff_notes": "Market is underserved in enterprise. Recommend focusing on integration capabilities."
}

Agent B (Strategy) receives this and knows exactly what it's working with:

You are the Strategy Planner. You've received research findings from the Research Agent.

RESEARCH FINDINGS:
{market analysis JSON}

Your task: Based on these specific findings, develop a go-to-market strategy.

CONSTRAINTS:
- Address the identified gaps
- Consider the confidence level (92%)
- Reference specific sources when relevant
- Output as a structured strategy document

SUCCESS CRITERIA:
- Strategy directly addresses market gaps
- Includes 3+ specific recommendations
- Cites the research findings

The key difference: the prompt explicitly tells Agent B what it received, where it came from, and how to use it. No guessing. No lost context.
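To make that explicit, here's a minimal Python sketch of the receiving side: validate the hand-off payload before rendering the next agent's prompt, so a malformed hand-off fails loudly instead of becoming a vague prompt. The field names mirror the JSON example above; the function names are hypothetical.

```python
# Hypothetical sketch: validate a hand-off payload, then embed it in
# the receiving agent's prompt. Field names follow the JSON example.
import json

REQUIRED_FIELDS = {"task_id", "findings", "confidence_score",
                   "next_agent", "handoff_notes"}

def validate_handoff(payload: dict) -> dict:
    """Fail fast on a malformed hand-off instead of letting the next
    agent guess at missing context."""
    missing = REQUIRED_FIELDS - payload.keys()
    if missing:
        raise ValueError(f"hand-off missing fields: {sorted(missing)}")
    if not 0.0 <= payload["confidence_score"] <= 1.0:
        raise ValueError("confidence_score must be in [0, 1]")
    return payload

def build_strategy_prompt(payload: dict) -> str:
    """Render the validated findings into the receiver's prompt."""
    return (
        "You are the Strategy Planner. You've received research findings "
        "from the Research Agent.\n\n"
        f"RESEARCH FINDINGS:\n{json.dumps(payload['findings'], indent=2)}\n\n"
        f"HANDOFF NOTES: {payload['handoff_notes']}\n"
        f"CONFIDENCE: {payload['confidence_score']:.0%}\n"
    )
```

The validation step is the point: the schema check is what turns "pass data seamlessly" from a hope into a guarantee.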

Pattern 2: Context Pinning for Long-Running Workflows

In multi-agent systems, context management becomes critical. Context pinning locks essential data (like API schemas or brand bibles) into a fixed, always-present section of every agent's context window, so performance stays consistent across long-duration tasks.

I pin three categories of context:

1. System Context — Rules that never change:

SYSTEM CONTEXT (PINNED):
- All dates use ISO 8601 format
- Currency is always USD
- All JSON must validate against provided schemas
- Never invent data; flag unknowns instead
- Maximum response time per agent: 30 seconds

2. Workflow Context — The current state of the workflow:

WORKFLOW STATE (PINNED):
Current Stage: 3 of 5 (Implementation)
Completed: Requirements → Design → Architecture
Pending: Implementation → Testing → Deployment

Previous Agent Output: [summary of what was done]
Handoff Metadata: {timestamp, agent_id, status}

3. Domain Context — Specialized knowledge:

DOMAIN CONTEXT (PINNED):
API Schema: [OpenAPI spec]
Data Models: [TypeScript interfaces]
Error Codes: [mapping of codes to meanings]
Brand Voice: [guidelines for tone and terminology]

By pinning this context, every agent in the workflow starts with the same foundation. They don't re-derive rules or re-learn the domain. They build on what came before.
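In code, pinning just means assembling the three layers into a prefix that every agent's prompt starts with. Here's a minimal sketch (the function names and layer contents are illustrative, not a specific framework's API):

```python
# Hypothetical sketch: compose the three pinned layers into a prompt
# prefix that every agent in the workflow receives verbatim.
SYSTEM_CONTEXT = """SYSTEM CONTEXT (PINNED):
- All dates use ISO 8601 format
- Currency is always USD
- Never invent data; flag unknowns instead"""

def pinned_prefix(workflow_state: str, domain_context: str) -> str:
    """System rules never change; workflow state is refreshed between
    stages; domain context is loaded once per workflow."""
    return "\n\n".join([
        SYSTEM_CONTEXT,
        f"WORKFLOW STATE (PINNED):\n{workflow_state}",
        f"DOMAIN CONTEXT (PINNED):\n{domain_context}",
    ])

def agent_prompt(task: str, workflow_state: str, domain_context: str) -> str:
    # Pinned context always comes first, then the stage-specific task,
    # so every agent starts from the same foundation.
    return pinned_prefix(workflow_state, domain_context) + "\n\n" + task
```

The ordering matters: pinned layers first, stage-specific instructions last, in every prompt, every time.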

Pattern 3: Reflexive Verification for Error Handling

Single agents can fail silently. Multi-agent systems need to catch failures before they cascade. Reflexive verification instructs agents to pause and audit their own output against a "Golden Set" of success criteria before proceeding to the next step of a workflow.

Here's the pattern:

VERIFICATION PHASE (REQUIRED):
Before passing work to the next agent, verify:

1. Output Format Check
   ✓ JSON is valid
   ✓ All required fields present
   ✓ Types match schema

2. Content Check
   ✓ No contradictions with previous work
   ✓ All references are resolved
   ✓ No placeholder text remaining

3. Success Criteria Check
   ✓ [Specific criterion 1]
   ✓ [Specific criterion 2]
   ✓ [Specific criterion 3]

If ANY check fails:
- Identify the specific failure
- Attempt self-correction
- If uncorrectable, escalate with detailed error report

DO NOT proceed to next agent if verification fails.

This prevents garbage-in-garbage-out. Each agent is responsible for validating its own work before handing off.

Pattern 4: Prompt Chaining with Conditional Routing

Not all work flows linearly. Sometimes the output of one agent determines which agent should run next. A routing prompt classifies the input and directs it to one of several specialized sub-prompts: routing a customer query to a Billing prompt versus a Technical prompt, for example.

I implement this with a Router Agent that runs between stages:

You are the Router. Your job is to analyze the current work state
and determine which specialized agent should work next.

CURRENT STATE: {workflow state}

AVAILABLE AGENTS:
- Compliance Agent: For regulatory/legal concerns
- Performance Agent: For optimization work
- Security Agent: For security-related changes
- Architecture Agent: For structural changes

ROUTING LOGIC:
If output contains regulatory keywords → Compliance Agent
If output mentions performance issues → Performance Agent
If output mentions security → Security Agent
If output affects system structure → Architecture Agent

Analyze the work and route to the appropriate agent.
Return: { "next_agent": "...", "reasoning": "...", "context": {...} }

This keeps the workflow dynamic. Agents don't need to know about each other—the router handles orchestration.
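The routing logic above can also get a deterministic first pass in code. Here's a sketch of a keyword pre-filter that handles the unambiguous cases cheaply and defers everything else to the LLM router; the agent names and keyword lists are illustrative:

```python
# Hypothetical sketch: deterministic first-pass routing. Unambiguous
# cases are routed by keyword; everything else falls through to the
# Router Agent (an LLM call, per the prompt above).
ROUTES = [
    ("compliance_agent", ("regulatory", "gdpr", "legal")),
    ("performance_agent", ("latency", "slow", "optimization")),
    ("security_agent", ("vulnerability", "auth", "security")),
    ("architecture_agent", ("refactor", "structure", "module boundary")),
]

def route(work_output: str) -> dict:
    text = work_output.lower()
    for agent, keywords in ROUTES:
        hits = [k for k in keywords if k in text]
        if hits:
            return {"next_agent": agent,
                    "reasoning": f"matched keywords: {hits}"}
    # No deterministic match: defer to the LLM router prompt.
    return {"next_agent": "llm_router", "reasoning": "no keyword match"}
```

This hybrid keeps routing cost down without losing the flexibility of model-based classification for ambiguous work.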

Pattern 5: Parallelization with Result Aggregation

Sometimes you want multiple agents working simultaneously on independent aspects. Parallelization runs multiple prompts at once and then aggregates their results into a final output.

The pattern:

ORCHESTRATOR PROMPT:
You are managing three parallel agents working on different aspects:
- Agent A: Frontend architecture
- Agent B: Backend API design
- Agent C: Database schema

They are working in parallel. Your job:
1. Dispatch work to each agent
2. Collect their outputs
3. Identify conflicts or dependencies
4. Aggregate into a coherent design

CONFLICT RESOLUTION:
If agents disagree on a boundary (e.g., API response format):
- Flag the specific conflict
- Note which agent has the stronger argument
- Propose a resolution
- Document the decision

Return the aggregated design with conflict notes.

This works well for independent work streams (frontend and backend, for example). The orchestrator ensures consistency.
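Mechanically, the fan-out side is straightforward. Here's a sketch using Python's standard `concurrent.futures`; `run_agent` is a placeholder for a real model call:

```python
# Hypothetical sketch: dispatch independent agents concurrently and
# collect their outputs for the orchestrator to aggregate.
from concurrent.futures import ThreadPoolExecutor

def run_agent(name: str, task: str) -> dict:
    # Placeholder for an LLM call; returns a structured result.
    return {"agent": name, "task": task, "output": f"{name} design for {task}"}

def fan_out(tasks: dict[str, str]) -> list[dict]:
    """Run all agents in parallel; return outputs in dispatch order."""
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = {name: pool.submit(run_agent, name, task)
                   for name, task in tasks.items()}
        # Collect in a fixed order so aggregation is deterministic.
        return [futures[name].result() for name in tasks]
```

The orchestrator then scans the collected outputs for boundary conflicts before producing the aggregated design.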

Pattern 6: Error Recovery and Escalation

Even with all these patterns, things will fail. You need a graceful degradation strategy.

ERROR HANDLING PROTOCOL:

Level 1 - Self-Correct (Agent attempts fix):
- Identify specific error
- Attempt correction
- Re-verify output
- If successful, proceed

Level 2 - Fallback (Use alternative approach):
- If self-correction fails, try alternative method
- Example: If JSON parsing fails, return as structured text
- Document the fallback used

Level 3 - Escalate (Human intervention required):
- If both attempts fail, escalate with:
  * Exact error message
  * What was attempted
  * Context for human review
  * Suggested next steps

DO NOT silently fail or retry infinitely.

This prevents your workflow from hanging. Failures are surfaced explicitly. For deeper context on escalation patterns, see Human-in-the-Loop Systems: Designing Intervention Points for AI Automation.
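The three-level protocol is easy to express as a control loop. Here's a sketch; `self_correct` and `fallback` are stand-ins for whatever recovery strategies the workflow defines:

```python
# Hypothetical sketch: the three-level error protocol as a bounded
# control loop. No silent failures, no infinite retries.
def run_with_recovery(agent_step, self_correct, fallback, max_retries=1):
    """Level 1: retry via self-correction. Level 2: fallback method.
    Level 3: escalate with full context for human review."""
    errors = []
    for _ in range(max_retries + 1):
        try:
            return {"status": "ok", "result": agent_step()}
        except Exception as e:
            errors.append(str(e))
            agent_step = self_correct  # next attempt uses the correction path
    try:
        return {"status": "fallback", "result": fallback()}
    except Exception as e:
        errors.append(str(e))
    # Retries and fallback exhausted: surface everything, don't hang.
    return {"status": "escalate", "errors": errors,
            "suggested_next_step": "human review of attempted corrections"}
```

The escalation payload carries the full error history, so the human reviewer sees what was attempted rather than a bare failure.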

Putting It Together: A Real Example

Let me show how these patterns work together in a real workflow I built—a document analysis system that processes contracts.

Stage 1: Document Ingestion

  • Input: Raw contract text
  • Agent: Document Cleaner
  • Output: Structured document with metadata
  • Verification: Format check, completeness check

Stage 2: Parallel Analysis

  • Router directs to three agents in parallel:
    • Clause Extractor: Identifies key clauses
    • Risk Analyzer: Flags problematic language
    • Compliance Checker: Validates against regulations
  • Each agent uses pinned domain context (legal terminology, clause types, regulations)
  • Each agent performs reflexive verification

Stage 3: Aggregation

  • Orchestrator collects outputs from three agents
  • Resolves conflicts (e.g., one agent flags risk, another says it's standard)
  • Creates unified analysis document
  • Hands off to next stage

Stage 4: Summary Generation

  • Agent: Executive Summarizer
  • Input: Full analysis and metadata
  • Output: One-page summary for stakeholder review
  • Verification: Accuracy check against source analysis

Stage 5: Escalation (if needed)

  • If any verification fails, escalate with full context
  • Human reviews the failure point
  • Provides correction or guidance

This workflow runs reliably because each stage knows exactly what it's receiving, what it's responsible for, and what comes next. Agents don't guess. They follow the protocol.
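The skeleton that wires those stages together is small. Here's a sketch, assuming each stage is a (name, run, verify) triple where verify returns a list of failures, per the reflexive-verification pattern:

```python
# Hypothetical sketch: a verified stage pipeline. Each stage runs, then
# verifies its own output before the next stage sees it; any failure
# escalates with full history instead of cascading.
def run_pipeline(document: str, stages) -> dict:
    state = {"input": document, "history": []}
    for name, run, verify in stages:
        output = run(state)
        failures = verify(output)
        if failures:
            # Stop here and surface everything for human review.
            return {"status": "escalated", "stage": name,
                    "failures": failures, "history": state["history"]}
        state["history"].append({"stage": name, "output": output})
        state["input"] = output  # becomes the next stage's input
    return {"status": "complete", "result": state["input"],
            "history": state["history"]}
```

Each stage's output becomes the next stage's input only after verification passes, which is what keeps the contract-analysis workflow from propagating a bad hand-off.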

Key Takeaways

Prompt engineering for multi-agent workflows is fundamentally different from single-agent prompting. You're not just writing better instructions; you're designing a system where specialized workers can coordinate reliably.

The patterns that matter:

  1. Hand-off prompts — Explicit structured communication between agents
  2. Context pinning — Shared foundational knowledge every agent uses
  3. Reflexive verification — Self-checking before handoff
  4. Conditional routing — Dynamic workflow based on output
  5. Parallelization — Independent work streams with aggregation
  6. Error recovery — Graceful degradation with escalation

Start with hand-off prompts and context pinning. Those two alone will fix most coordination issues. Add verification and routing as your workflows get more complex.

The biggest mistake I see: teams try to make agents "smarter" when they should be making workflows "clearer." A dumb agent following a clear protocol beats a smart agent in a confusing system.

These patterns will save you weeks of debugging when you're building multi-agent systems. The transition from experimentation to production-grade system-building requires shifting focus from "Can models do this?" to "What does it take to run this reliably, safely, and at scale?"

Related Reading

If you're building production AI agents, you'll want to understand the broader architectural context:

  • Why Prompt Engineering Won't Fix Your AI Agent Architecture
  • Human-in-the-Loop Systems: Designing Intervention Points for AI Automation

Want to discuss your specific multi-agent architecture? Get in touch—I help teams design workflows that actually work in production.