2026-03-16 · 8 min

Which AI Model for Which Agent? How to Choose Smartly

LLM · Cost Optimization · Configuration · Multi-Agent · OpenClaw

The Problem With "Best Model for Everything"

When we built our first multi-agent setup, the configuration was simple: all agents on Claude Opus. The strongest, the most expensive — better safe than sorry.

Then the API invoice came after week one. For six agents running around the clock, the number was... significantly higher than expected.

The realization: Not every task needs the most powerful model. When Alex checks my calendar and tells me whether I have a meeting tomorrow — that doesn't need 200-token-per-second intelligence. But Peter, our coding agent, reviewing complex TypeScript architectures — he genuinely needs the best available.

The result of our overhaul: 60% fewer API costs with equal or better quality.

---

The Core Idea: Classify Your Tasks

Before assigning models, you need to understand what each agent actually does. We split our agents into three categories:

Category 1: Reasoning-Intensive Tasks

These tasks require deep thinking, multi-step inference, code analysis, or creative quality work.

Examples:

  • Code reviews with architectural context
  • Complex research with synthesis
  • Strategic content creation (blog posts, proposals)
  • Error diagnosis in complex logs

Recommended models: Claude Opus 4.5+, GPT-4o, Gemini Ultra

Category 2: Structured, Rule-Based Tasks

These tasks follow clear patterns. Input is structured, output is predictable, error rates are low.

Examples:

  • Calendar queries and meeting reminders
  • Email classification (important / not important / spam)
  • Simple webhook triggers and API calls
  • Daily status summaries from structured data

Recommended models: Claude Sonnet 4.5, GPT-4o Mini, Gemini Flash

Category 3: Simple Execution

Tasks where the model mostly acts as an interface: receive command, call tool, return result.

Examples:

  • Read and summarize files
  • Forward simple database queries
  • Heartbeat checks (is the server reachable?)
  • Route notifications

Recommended models: Claude Haiku, GPT-4o Mini, Gemini Flash 8B
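The three tiers can be captured as a simple lookup. A minimal sketch, not OpenClaw code: the category labels ("reasoning", "structured", "execution") and the mid-tier fallback are our assumptions; only the model IDs come from the lists above.

```python
# Hypothetical tier lookup. Category labels and the fallback rule are
# illustrative assumptions; the model IDs mirror the article's lists.
MODEL_TIERS = {
    "reasoning": "anthropic/claude-opus-4-5",     # Category 1
    "structured": "anthropic/claude-sonnet-4-5",  # Category 2
    "execution": "anthropic/claude-haiku-3-5",    # Category 3
}

def pick_model(category: str) -> str:
    """Return the default model ID for a task category."""
    # Unknown categories fall back to the mid tier rather than the
    # most expensive model.
    return MODEL_TIERS.get(category, MODEL_TIERS["structured"])

print(pick_model("execution"))  # anthropic/claude-haiku-3-5
```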

---

How to Configure Models in OpenClaw

OpenClaw lets you set the model per agent in `~/.openclaw/openclaw.json` (or `openclaw.json5`). The configuration looks like this:

```json
{
  "agents": {
    "sam": {
      "model": "anthropic/claude-opus-4-5",
      "workspace": "/home/sam/.openclaw/workspace"
    },
    "peter": {
      "model": "anthropic/claude-opus-4-5",
      "workspace": "/home/peter/.openclaw/workspace"
    },
    "maya": {
      "model": "anthropic/claude-sonnet-4-5",
      "workspace": "/home/maya/.openclaw/workspace"
    },
    "alex": {
      "model": "anthropic/claude-haiku-3-5",
      "workspace": "/home/alex/.openclaw/workspace"
    },
    "iris": {
      "model": "anthropic/claude-sonnet-4-5",
      "workspace": "/home/iris/.openclaw/workspace"
    },
    "atlas": {
      "model": "anthropic/claude-opus-4-5",
      "workspace": "/home/atlas/.openclaw/workspace"
    }
  }
}
```

Alternatively, you can set the model per agent via an environment variable in the Docker Compose file:

```yaml
services:
  alex:
    image: openclaw/agent:latest
    environment:
      - OPENCLAW_MODEL=anthropic/claude-haiku-3-5
      - OPENCLAW_AGENT_NAME=alex
    volumes:
      - /home/alex/.openclaw/workspace:/workspace
```

Both methods work. We prefer the JSON configuration because it documents all agents centrally.
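To sanity-check such a config, a few lines of Python can list each agent's assigned model. A sketch that assumes only the `agents` → `model` shape shown above; it is not an OpenClaw tool, and in practice you would read the file from disk instead of an inline string.

```python
import json

# Inline stand-in for ~/.openclaw/openclaw.json, using the shape
# from the example above (abbreviated to two agents).
CONFIG = """
{
  "agents": {
    "sam":  {"model": "anthropic/claude-opus-4-5"},
    "alex": {"model": "anthropic/claude-haiku-3-5"}
  }
}
"""

config = json.loads(CONFIG)
# Print one "agent: model" line per configured agent.
for name, agent in sorted(config["agents"].items()):
    print(f"{name}: {agent['model']}")
```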

---

Our Real Setup: The 6-Agent Model Matrix

Here's exactly what we run — no theory, the actual setup:

| Agent | Role | Model | Reason |
|-------|------|-------|--------|
| Sam (Team Lead) | Delegation, planning, blog | Claude Opus 4.5 | Complex coordination, creative writing |
| Peter (Coding) | PR reviews, tests, bugs | Claude Opus 4.5 | Architecture understanding, reasoning |
| Maya (Marketing) | Copy, campaigns, SEO | Claude Sonnet 4.5 | Good quality, 3× cheaper than Opus |
| Alex (Admin) | Calendar, reminders, tasks | Claude Haiku 3.5 | Structured tasks, no depth needed |
| Iris (Research) | Research, synthesis, reports | Claude Sonnet 4.5 | Reasoning + cost balance |
| Atlas (CEO Support) | Strategy, reports, letters | Claude Opus 4.5 | CEO output must be flawless |

Cost comparison (estimated, moderate usage):

  • Before the change (all Opus): ~€380/month
  • After the change (mixed models): ~€145/month
  • Savings: ~62%
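The savings figure follows directly from the two monthly totals:

```python
# Monthly cost before (all Opus) and after (mixed models), in euros.
before, after = 380, 145
savings_pct = (before - after) / before * 100
print(f"{savings_pct:.0f}%")  # 62%
```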
---

How to Tell When an Agent Has the Wrong Model

Signs of "model too weak":

  • Responses get shorter and flatter than expected
  • The agent asks for clarification more often on tasks that should be clear
  • Tool calls are invoked incorrectly or missed entirely
  • Code reviews miss obvious problems

Signs of "model over-provisioned":

  • The agent responds with multi-page analyses to simple yes/no questions
  • Simple calendar queries take 10+ seconds
  • The invoice grows without any perceptible improvement in quality

The fix: take a short look each week at the ratio of token consumption to output quality. With Alex, we noticed after two weeks that Haiku handles 95% of his tasks just fine.
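That weekly check can be as simple as a tokens-per-task and rework-rate comparison. A hedged sketch: the numbers and record shape below are invented for illustration, and OpenClaw's actual usage metrics may look different.

```python
# Invented weekly records: (agent, tokens used, tasks completed,
# tasks that needed rework). Not a real OpenClaw metrics format.
weekly = [
    ("alex",  120_000, 310, 8),
    ("peter", 900_000,  45, 2),
]

for agent, tokens, done, rework in weekly:
    quality = 1 - rework / done   # fraction of tasks accepted as-is
    tokens_per_task = tokens / done
    print(f"{agent}: {tokens_per_task:.0f} tok/task, quality {quality:.0%}")
```

A cheap agent with high quality is correctly provisioned; an expensive agent with the same quality on simple tasks is a downgrade candidate.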

---

Dynamic Model Switching: The Next Level

Advanced setups can switch models within an agent based on context. OpenClaw supports this via the session status override:

```
# In chat with the agent:
/model anthropic/claude-opus-4-5

# Or programmatically in a cron job instruction:
"For this task use /model anthropic/claude-opus-4-5 — complex analysis needed."
```

We use this rarely — mainly when Iris gets a particularly complex research assignment and needs to briefly switch to Opus. The default config stays Sonnet.

---

Mixing Providers: OpenAI + Anthropic + Gemini

OpenClaw supports multiple providers simultaneously. That means: you don't have to commit to one vendor.

```json
{
  "agents": {
    "maya": {
      "model": "openai/gpt-4o-mini"
    },
    "alex": {
      "model": "google/gemini-flash-1.5"
    }
  }
}
```

Important: each provider needs its own API key in `.env`:

```bash
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
GOOGLE_API_KEY=...
```

We experimented with mixed providers but ultimately stayed with Anthropic for all agents — consistent quality, and a single invoice is operationally simpler.

---

Practical Guide: The Three-Question Model Decision

If you're unsure which model is right for an agent, ask these three questions:

1. Does the agent need to weigh contradictory information?
   → Yes → At least Sonnet, preferably Opus

2. Are the agent's tasks mostly predictable and structured?
   → Yes → Haiku or Flash is enough

3. Does a human see the output directly (CEO, customer, public)?
   → Yes → No compromise: Opus
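The three questions condense into a small decision function. An illustrative sketch, not part of OpenClaw: the check order encodes that the human-facing question overrides the others, and the default falls back to the mid tier.

```python
# Hypothetical decision function for the three questions.
# Model IDs follow the article; everything else is illustrative.
def choose_model(weighs_contradictions: bool,
                 mostly_structured: bool,
                 human_facing: bool) -> str:
    if human_facing:                       # Q3: no compromise
        return "anthropic/claude-opus-4-5"
    if weighs_contradictions:              # Q1: at least Sonnet
        return "anthropic/claude-sonnet-4-5"
    if mostly_structured:                  # Q2: Haiku is enough
        return "anthropic/claude-haiku-3-5"
    return "anthropic/claude-sonnet-4-5"   # default: mid tier

print(choose_model(False, True, False))  # anthropic/claude-haiku-3-5
```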

These three questions helped us move our setup from "all Opus" to a thoughtful mix.

---

The Complete Setup

The exact configuration — openclaw.json, Docker Compose with model variables, and the criteria by which we selected each model — is documented in the OpenClaw Setup Playbook.

It also covers the monitoring setup we use to track token consumption per agent and detect when a model is over- or under-performing.

18 chapters, based on real production experience.

Fully available in German too. 🇩🇪

Want to learn more?

Our playbook contains 18 detailed chapters — available in English and German.

Get the Playbook