OpenClaw Orchestrator Pattern: How to Save 80% of Your Tokens Using Opus + Sonnet Sub-Agents
The Tweet That Started This
Someone posted this earlier today and it hit 98 impressions in under 10 minutes — which for an OpenClaw thread is basically viral:
> *"The biggest token saver nobody talks about: offload heavy tasks to sub-agents instead of keeping everything in one conversation. I run OpenClaw with Opus as orchestrator and Sonnet sub-agents for coding — main context stays tiny while sub-agents burn through tokens in isolated [sessions]"*
This is exactly right, and it's something the OpenClaw docs under-explain. Let's fix that.
---
The Problem: Context Rot Is Expensive
If you've been running OpenClaw for a while, you've probably hit this pattern:
1. You ask your agent to do something complex
2. The agent works through it in your main chat session
3. The conversation gets longer and longer
4. Responses get slower and more expensive
5. Eventually the agent starts forgetting things from 30 messages ago
This is context rot, and it costs you money on every request, because the entire thread is re-sent to the model each time you send a new message.
A 50-message conversation with code snippets can easily reach 100k input tokens *per request*. At Opus pricing, that adds up fast.
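To see why a long thread gets expensive, here is a minimal sketch of how billed input tokens grow when every request re-sends the whole conversation. The figure of 2,000 tokens per message is an assumption chosen so that 50 messages reach the 100k mentioned above. The model is deliberately simplified: it ignores prompt caching and output tokens.

```python
def cumulative_input_tokens(tokens_per_message: int, num_messages: int) -> int:
    """Total input tokens billed across a conversation in which each
    request re-sends every earlier message plus the new one."""
    total = 0
    context = 0
    for _ in range(num_messages):
        context += tokens_per_message  # the thread grows by one message
        total += context               # the whole thread is sent again
    return total


# Assumed average message size; real conversations vary widely.
PER_MESSAGE = 2_000

for n in (10, 25, 50):
    billed = cumulative_input_tokens(PER_MESSAGE, n)
    print(f"{n} messages: context {n * PER_MESSAGE:,} tokens, billed {billed:,} in total")
```

The context grows linearly with the number of messages, but the total billed grows roughly with its square, which is why trimming the main thread pays off so quickly.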
---
The Solution: Orchestrator + Sub-Agent Architecture
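The fix is conceptually simple: the orchestrator hands each heavy task to a sub-agent, the sub-agent does the work in its own isolated session, and only a short summary comes back.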
Your main context never accumulates the messy intermediate steps. It only sees the clean output.
Here's what this looks like in practice:
```
Main session (Opus):
"Hey, refactor the auth module to use JWT"
→ spawns Sonnet sub-agent with full codebase context
→ sub-agent works in isolation (5000 tokens of back-and-forth)
→ returns: "Done. Changed 3 files, here's a summary."
→ main session adds 200 tokens, not 5000
```
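The flow above can be modelled in a few lines. This is a toy simulation, not OpenClaw code: `Session` and `delegate` are hypothetical names, and the token counts are the illustrative ones from the example (5,000 for the work, 200 for the summary).

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """A conversation context with a rough running token count."""
    name: str
    tokens: int = 0
    messages: list = field(default_factory=list)

    def add(self, text: str, tokens: int) -> None:
        self.messages.append(text)
        self.tokens += tokens


def delegate(main: Session, brief: str, work_tokens: int,
             summary: str, summary_tokens: int) -> Session:
    """Run a task in a fresh session and hand only the summary back."""
    sub = Session(name="sub-agent")    # starts empty: no main-session history
    sub.add(brief, work_tokens)        # the intermediate work stays in here
    main.add(summary, summary_tokens)  # the main session sees the result only
    return sub


main = Session(name="main")
sub = delegate(
    main,
    brief="Refactor the auth module to use JWT",
    work_tokens=5000,
    summary="Done. Changed 3 files, here's a summary.",
    summary_tokens=200,
)
print(f"main: {main.tokens} tokens, sub-agent: {sub.tokens} tokens")
```

The point of the design is that `sub` can be discarded once it returns, so its 5,000 tokens never become part of what the main session re-sends.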
---
Setting It Up in OpenClaw
OpenClaw has first-class support for this via `sessions_spawn`. Here's the basic pattern in your SOUL.md or agent instructions:
```
When a task is complex or will require many steps:
1. Summarize the goal clearly
2. Use sessions_spawn to create an isolated sub-agent
3. Pass the goal + necessary context as the task
4. Wait for the result, then summarize it back to the user
```
The key insight: you control what context the sub-agent gets. You don't dump your entire conversation history into it. You write a clean, focused brief.
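One way to keep briefs focused is to build them from a fixed template instead of copying chat history. The sketch below only builds the brief text. `build_brief` and its fields are hypothetical, the sample values are made up, and the call to `sessions_spawn` is left out because its exact parameters are not shown here.

```python
def build_brief(goal: str, context: list[str], deliverable: str) -> str:
    """Assemble a short, self-contained task brief for a sub-agent."""
    lines = [f"Goal: {goal}", "", "Context:"]
    lines += [f"- {item}" for item in context]
    lines += ["", f"Return: {deliverable}"]
    return "\n".join(lines)


# Made-up sample values for illustration only.
brief = build_brief(
    goal="Refactor the auth module to use JWT",
    context=[
        "Keep the public function names unchanged",
        "Existing tests must still pass",
    ],
    deliverable="A summary of changed files, under 200 words",
)
print(brief)
```

Because the function only accepts a goal, a few context items, and the expected deliverable, there is no easy way to pass the whole conversation by accident.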
Example: Coding Sub-Agent
In your main agent's AGENTS.md or instructions:
```
For coding tasks that require more than 3 file edits:
```
This is exactly what the tweet described — Opus stays as the brain, Sonnet does the hands-on work.
---
Model Selection by Task Type
Not every task needs the same model. OpenClaw lets you specify the model per sub-agent spawn. Here's a practical breakdown:
| Task | Recommended Model | Why |
|------|-----------------|-----|
| Planning / decisions | Opus 4.x | Needs deep reasoning |
| Code writing | Sonnet 4.x | Fast, cheap, capable |
| Simple lookups | Haiku / mini | Near-free, fast |
| Long document analysis | Sonnet 4.x | Good context handling |
| Creative writing | Sonnet 4.x | Solid quality, good cost |
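The table can be written down as a simple lookup so the choice is made the same way every time. The task keys and the short model labels below are placeholders for illustration, not real model identifiers or OpenClaw settings. Substitute whatever names your setup uses.

```python
# Placeholder labels mirroring the table above, not real model IDs.
MODEL_BY_TASK = {
    "planning": "opus",
    "coding": "sonnet",
    "lookup": "haiku",
    "document_analysis": "sonnet",
    "creative_writing": "sonnet",
}


def pick_model(task_type: str, default: str = "sonnet") -> str:
    """Return the model label for a task type, falling back to a cheaper default."""
    return MODEL_BY_TASK.get(task_type, default)


for task in ("planning", "coding", "lookup", "something_else"):
    print(task, "->", pick_model(task))
```

Defaulting unknown task types to the mid-tier model keeps an unclassified task from silently landing on the most expensive one.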
The tweet mentioned a $50/month setup: Codex mini for main brain, MiniMax for daily execution, Opus for feature planning. That's the orchestrator pattern applied to cost optimization.
---
Keeping Track: What Your Orchestrator Stores
The main session should only store decisions and outcomes, not process:
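Store in main context (or MEMORY.md): the decisions that were made and the final outcomes.
Do NOT store in main context: the intermediate steps a sub-agent took to get there.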
This discipline is what keeps your orchestrator context lean over time.
---
The Token Math
Let's make this concrete. Say you have a coding task that involves about 20,000 tokens of intermediate work and ends in a 500-token summary.
Without sub-agents:
All 20,500 tokens accumulate in your main session. Every future message costs those 20,500 tokens plus whatever comes next.
With sub-agents:
The 20,000 tokens of work happen in isolation. Your main session only gains the 500-token summary.
Over 10 such tasks, the main session ends up carrying 205,000 tokens without sub-agents versus 5,000 with them.
At Opus pricing (~$15/million input tokens), that's a difference of roughly $3.00 per request after 10 tasks. If your agent handles 50 requests/day, that's about $154/day vs $3.75/day.
The orchestrator pattern doesn't just feel cleaner — it is dramatically cheaper.
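The arithmetic can be checked with a short script. It uses the per-task figures from this section (20,000 tokens of work, a 500-token summary) and the approximate $15 per million input tokens quoted above. It ignores prompt caching and output tokens, so treat the results as rough.

```python
PRICE_PER_MILLION_INPUT = 15.00  # USD, approximate figure quoted in the text


def context_cost(tokens: int, price_per_million: float = PRICE_PER_MILLION_INPUT) -> float:
    """Cost in USD of sending `tokens` input tokens in one request."""
    return tokens / 1_000_000 * price_per_million


def compare(tasks: int, work_tokens: int, summary_tokens: int, requests_per_day: int) -> dict:
    """Compare carried context and cost with and without sub-agents."""
    without = tasks * (work_tokens + summary_tokens)  # everything stays in main
    with_sub = tasks * summary_tokens                 # only summaries stay
    return {
        "tokens_without": without,
        "tokens_with": with_sub,
        "per_request_without": context_cost(without),
        "per_request_with": context_cost(with_sub),
        "daily_without": context_cost(without) * requests_per_day,
        "daily_with": context_cost(with_sub) * requests_per_day,
    }


result = compare(tasks=10, work_tokens=20_000, summary_tokens=500, requests_per_day=50)
for key, value in result.items():
    print(f"{key}: {value:,.2f}")
```

Changing `tasks` or `requests_per_day` shows how quickly the gap widens for a busier agent.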
---
Practical SOUL.md Additions
Add these guidelines to your SOUL.md or agent instructions to make this automatic:
```
Task Delegation Rules
```
---
Common Mistakes
Mistake 1: Passing your entire conversation to the sub-agent
You negate all the benefits. Write a fresh brief. The sub-agent doesn't need to know about your earlier chat about something else.
Mistake 2: Using Opus for sub-agents
Sonnet handles the vast majority of coding and execution tasks perfectly well. Reserve Opus for planning, complex reasoning, and decisions that need deep thinking.
Mistake 3: Not summarizing sub-agent output
If a sub-agent returns 3,000 tokens of output and you dump it all into main context, you've only half-solved the problem. Ask the sub-agent to summarize, or summarize it yourself before storing.
Mistake 4: Spawning sub-agents for trivial tasks
Spawning has overhead — session creation, context loading, etc. For a quick "what's 2+2" style task, just answer in main session. Sub-agents are for heavy lifting.
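The trade-off in Mistake 4 can be turned into an explicit rule of thumb. In this sketch the file-edit threshold of 3 comes from the coding example earlier, while the step threshold of 5 and the function name `should_delegate` are assumptions for illustration, not anything OpenClaw defines.

```python
def should_delegate(file_edits: int, expected_steps: int,
                    max_inline_edits: int = 3, max_inline_steps: int = 5) -> bool:
    """Decide whether a task is heavy enough to justify spawning a sub-agent.

    Small tasks stay in the main session because spawning has its own
    overhead. Larger ones are delegated to keep the main context lean.
    """
    return file_edits > max_inline_edits or expected_steps > max_inline_steps


print(should_delegate(file_edits=0, expected_steps=1))   # quick question
print(should_delegate(file_edits=4, expected_steps=2))   # multi-file refactor
print(should_delegate(file_edits=1, expected_steps=12))  # long investigation
```

The exact thresholds matter less than having them written down, so the agent applies the same rule on every task.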
---
The Cost Setup That Works
Going back to the original tweet, here's the $50/month setup it described: Codex mini as the main brain, MiniMax for daily execution, and Opus for feature planning.
The key: Opus is not your always-on model. It's spawned only when genuinely needed, runs in isolation, and returns a clean result. Your $20 Opus budget goes much further when it's not burning tokens on every heartbeat.
---
Quick Checklist
1. ✅ Identify your top 3 most expensive recurring tasks
2. ✅ Write a sub-agent brief template for each
3. ✅ Add delegation rules to your SOUL.md or agent instructions
4. ✅ Set Sonnet (or cheaper) as default for sub-agent spawns
5. ✅ Reserve Opus for orchestration and complex decisions only
6. ✅ After each sub-agent run, store only the outcome in main memory
Your main context should stay under 20,000 tokens for normal daily use. If it's consistently hitting 80,000+, you need sub-agents.
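Those two thresholds are easy to check mechanically. The sketch below assumes you can get a token count for your main session from somewhere. The function name and the three status labels are hypothetical.

```python
NORMAL_LIMIT = 20_000    # target ceiling for normal daily use
DELEGATE_LIMIT = 80_000  # at this size, heavy work should move to sub-agents


def context_status(tokens: int) -> str:
    """Classify the main session's context size against the two thresholds."""
    if tokens >= DELEGATE_LIMIT:
        return "delegate"
    if tokens >= NORMAL_LIMIT:
        return "watch"
    return "ok"


for size in (8_000, 35_000, 95_000):
    print(f"{size:,} tokens -> {context_status(size)}")
```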
Everything covered here works with a standard OpenClaw setup — no plugins, no extra dependencies.
The full setup is documented in the OpenClaw Setup Playbook.