Which AI Model for Which Agent? How to Choose Smartly
The Problem With "Best Model for Everything"
When we built our first multi-agent setup, the configuration was simple: all agents on Claude Opus. The strongest, the most expensive — better safe than sorry.
Then the API invoice came after week one. For six agents running around the clock, the number was... significantly higher than expected.
The realization: not every task needs the most powerful model. When Alex checks my calendar and tells me whether I have a meeting tomorrow, that doesn't call for frontier-model reasoning. But Peter, our coding agent, reviewing complex TypeScript architectures — he genuinely needs the best available.
The result of our overhaul: 60% fewer API costs with equal or better quality.
---
The Core Idea: Classify Your Tasks
Before assigning models, you need to understand what each agent actually does. We split our agents into three categories:
Category 1: Reasoning-Intensive Tasks
These tasks require deep thinking, multi-step inference, code analysis, or creative quality work.
Examples:
- Code reviews and architecture analysis (Peter)
- Long-form writing: blog posts, strategy papers (Sam, Atlas)
- Multi-step planning and delegation decisions
Recommended models: Claude Opus 4.5+, GPT-4o, Gemini Ultra
Category 2: Structured, Rule-Based Tasks
These tasks follow clear patterns. Input is structured, output is predictable, error rates are low.
Examples:
- Marketing copy, campaign texts, SEO snippets (Maya)
- Research synthesis and report drafting (Iris)
- Summarizing structured input into fixed formats
Recommended models: Claude Sonnet 4.5, GPT-4o Mini, Gemini Flash
Category 3: Simple Execution
Tasks where the model mostly acts as an interface: receive command, call tool, return result.
Examples:
- Calendar checks and reminders (Alex)
- Creating and updating tasks
- Simple status queries and tool calls
Recommended models: Claude Haiku, GPT-4o Mini, Gemini Flash 8B
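The three tiers can be captured in a small lookup table. A minimal sketch — the category names and model IDs come from this post; the function itself is illustrative, not part of OpenClaw:

```python
# Map the three task categories described above to a default model.
# Model IDs follow the Anthropic-style "provider/model" notation
# used in this post.
CATEGORY_MODELS = {
    "reasoning": "anthropic/claude-opus-4-5",     # deep thinking, code analysis
    "structured": "anthropic/claude-sonnet-4-5",  # predictable, rule-based work
    "execution": "anthropic/claude-haiku-3-5",    # receive command, call tool
}

def model_for(category: str) -> str:
    """Return the recommended model for a task category."""
    try:
        return CATEGORY_MODELS[category]
    except KeyError:
        raise ValueError(f"unknown category: {category!r}")
```

The point is less the code than the discipline: every new agent gets classified first, then assigned a model — never the other way around.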
---
How to Configure Models in OpenClaw
OpenClaw lets you set the model per agent in `~/.openclaw/openclaw.json` (or `openclaw.json5`). The configuration looks like this:
```json
{
"agents": {
"sam": {
"model": "anthropic/claude-opus-4-5",
"workspace": "/home/sam/.openclaw/workspace"
},
"peter": {
"model": "anthropic/claude-opus-4-5",
"workspace": "/home/peter/.openclaw/workspace"
},
"maya": {
"model": "anthropic/claude-sonnet-4-5",
"workspace": "/home/maya/.openclaw/workspace"
},
"alex": {
"model": "anthropic/claude-haiku-3-5",
"workspace": "/home/alex/.openclaw/workspace"
},
"iris": {
"model": "anthropic/claude-sonnet-4-5",
"workspace": "/home/iris/.openclaw/workspace"
},
"atlas": {
"model": "anthropic/claude-opus-4-5",
"workspace": "/home/atlas/.openclaw/workspace"
}
}
}
```
Alternatively, you can set the model per agent via an environment variable in the Docker Compose file:
```yaml
services:
alex:
image: openclaw/agent:latest
environment:
- OPENCLAW_MODEL=anthropic/claude-haiku-3-5
- OPENCLAW_AGENT_NAME=alex
volumes:
- /home/alex/.openclaw/workspace:/workspace
```
Both methods work. We prefer the JSON configuration because it documents all agents centrally.
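A side benefit of the central JSON file: it's trivial to audit. A minimal sketch that counts agents per model — it assumes the `agents`/`model` structure shown above; the sample config string is just for illustration:

```python
import json
from collections import Counter

# Trimmed-down version of the openclaw.json structure shown above.
SAMPLE_CONFIG = """
{
  "agents": {
    "sam":  {"model": "anthropic/claude-opus-4-5"},
    "maya": {"model": "anthropic/claude-sonnet-4-5"},
    "alex": {"model": "anthropic/claude-haiku-3-5"}
  }
}
"""

def summarize_models(config_text: str) -> Counter:
    """Count how many agents run on each model."""
    config = json.loads(config_text)
    return Counter(agent["model"] for agent in config["agents"].values())
```

Running this against your real config after every change makes an accidental "everything back on Opus" immediately visible.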
---
Our Real Setup: The 6-Agent Model Matrix
Here's exactly what we run — no theory, the actual setup:
| Agent | Role | Model | Reason |
|-------|------|-------|--------|
| Sam (Team Lead) | Delegation, planning, blog | Claude Opus 4.5 | Complex coordination, creative writing |
| Peter (Coding) | PR reviews, tests, bugs | Claude Opus 4.5 | Architecture understanding, reasoning |
| Maya (Marketing) | Copy, campaigns, SEO | Claude Sonnet 4.5 | Good quality, 3× cheaper than Opus |
| Alex (Admin) | Calendar, reminders, tasks | Claude Haiku 3.5 | Structured tasks, no depth needed |
| Iris (Research) | Research, synthesis, reports | Claude Sonnet 4.5 | Reasoning + cost balance |
| Atlas (CEO Support) | Strategy, reports, letters | Claude Opus 4.5 | CEO output must be flawless |
Cost comparison (estimated, moderate usage): this mixed matrix is what produced the roughly 60% reduction in API costs compared with running everything on Opus.
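The shape of the saving is easy to sketch yourself. The per-million-token prices below are placeholders, not real provider rates — plug in your provider's current price list and your own daily token volumes:

```python
# Rough monthly cost estimate per agent.
# PRICES are PLACEHOLDER values (USD per million tokens),
# NOT real provider rates — substitute your own.
PRICES = {"opus": 15.0, "sonnet": 3.0, "haiku": 0.80}

def monthly_cost(tier: str, tokens_per_day: int, days: int = 30) -> float:
    """Estimate monthly cost from daily token volume and a price tier."""
    return PRICES[tier] * tokens_per_day * days / 1_000_000

# All-Opus vs. a mixed setup for six agents at 2M tokens/day each:
all_opus = 6 * monthly_cost("opus", 2_000_000)
mixed = (3 * monthly_cost("opus", 2_000_000)
         + 2 * monthly_cost("sonnet", 2_000_000)
         + 1 * monthly_cost("haiku", 2_000_000))
```

Even with made-up prices, the structural point holds: moving the structured and execution-tier agents off the top model is where the bulk of the saving comes from.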
---
How to Tell When an Agent Has the Wrong Model
Signs of "model too weak":
- Multi-step tasks stall or need repeated retries
- Tool calls come out malformed, or the agent loses the thread mid-task
- Output needs heavy human rework before it's usable
Signs of "model over-provisioned":
- The agent mostly relays tool results without real reasoning
- Output quality doesn't drop when you trial a cheaper tier
- Token costs are high for tasks a template could handle
The fix: do a short weekly review of the ratio of token consumption to output quality. With Alex, we noticed after two weeks that Haiku handles 95% of his tasks just fine.
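That weekly check can be as simple as computing tokens per completed task from whatever usage log you keep. A sketch — the `(agent, tokens, succeeded)` record shape is made up for illustration, not an OpenClaw format:

```python
from collections import defaultdict

def tokens_per_success(records):
    """records: iterable of (agent, tokens_used, task_succeeded).

    Returns tokens spent per successful task for each agent.
    A rising number hints the model may be too weak (retries);
    a consistently tiny one, that it may be over-provisioned.
    """
    tokens = defaultdict(int)
    wins = defaultdict(int)
    for agent, used, ok in records:
        tokens[agent] += used
        wins[agent] += int(ok)
    return {a: tokens[a] / wins[a] for a in tokens if wins[a]}
```

One number per agent per week is enough to spot a mismatch long before the invoice does.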
---
Dynamic Model Switching: The Next Level
Advanced setups can switch models within an agent based on context. OpenClaw supports this via the session status override:
```
# In chat with the agent:
/model anthropic/claude-opus-4-5
# Or programmatically in a cron job instruction:
"For this task use /model anthropic/claude-opus-4-5 — complex analysis needed."
```
We use this rarely — mainly when Iris gets a particularly complex research assignment and needs to briefly switch to Opus. The default config stays Sonnet.
---
Mixing Providers: OpenAI + Anthropic + Gemini
OpenClaw supports multiple providers simultaneously. That means: you don't have to commit to one vendor.
```json
{
"agents": {
"maya": {
"model": "openai/gpt-4o-mini"
},
"alex": {
"model": "google/gemini-flash-1.5"
}
}
}
```
Important: each provider needs its own API key in `.env`:
```bash
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
GOOGLE_API_KEY=...
```
We experimented with mixed providers but ultimately stayed with Anthropic for all agents — the quality is consistent, and a single invoice is operationally simpler.
---
Practical Guide: The Three-Question Model Decision
If you're unsure which model is right for an agent, ask these three questions:
1. Does the agent need to weigh contradictory information?
→ Yes → At least Sonnet, preferably Opus
2. Are the agent's tasks mostly predictable and structured?
→ Yes → Haiku or Flash is enough
3. Does a human see the output directly (CEO, customer, public)?
→ Yes → No compromise: Opus
These three questions helped us move our setup from "all Opus" to a thoughtful mix.
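The three questions translate directly into a tiny decision function. A sketch — the tier names mirror the recommendations above, and Q1's "preferably Opus" is resolved in Opus's favor here:

```python
def pick_tier(weighs_contradictions: bool,
              mostly_structured: bool,
              human_facing: bool) -> str:
    """Apply the three-question checklist, strictest question first."""
    if human_facing:            # Q3: CEO/customer/public output — no compromise
        return "opus"
    if weighs_contradictions:   # Q1: at least Sonnet, preferably Opus
        return "opus"
    if mostly_structured:       # Q2: predictable work — the small tier is enough
        return "haiku"
    return "sonnet"             # default middle ground
```

Note the ordering: the human-facing question wins even for otherwise structured tasks, which is exactly why Atlas runs on Opus despite producing fairly templated reports.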
---
The Complete Setup
The exact configuration — openclaw.json, Docker Compose with model variables, and the criteria by which we selected each model — is documented in the OpenClaw Setup Playbook.
Including the monitoring setup we use to track token consumption per agent and detect when a model is over- or under-performing.
18 chapters, based on real production experience — fully available in English and German. 🇩🇪
Want to learn more?
Get the Playbook