Fixing Context Rot in OpenClaw Agents: Keeping Your Agent's Memory Sharp
What Context Rot Is and Why It Hits Everyone
A tweet from last night sums it up well:
> *"The reason it forgets tasks and hallucinates isn't actually a model intelligence issue, it's an architecture problem. By default, OpenClaw (like most agents right now) just appends everything into a giant context window or a flat vector database. Eventually, it hits 'context rot'"*
That's precise. And it happens slowly enough that most people only notice it after weeks.
The symptoms:
- The agent forgets tasks it was given only days earlier
- Questions about older information get vague or wrong answers
- It hallucinates details instead of checking its memory files
- Newer memory entries silently contradict older ones
The cause: when every session interaction gets appended to a growing history, the model eventually loses the thread. LLMs have an "attention budget." The longer the context, the worse the precision on older entries — this is the well-documented "Lost in the Middle" effect.
OpenClaw's version 2026.3.7-beta.1 introduced a direct solution: the ContextEngine plugin interface.
---
Why the Default Setup Leads to Context Rot
In a fresh OpenClaw setup the architecture is simple: every message gets appended to the session history. That works fine for the first few weeks.
The problem is accumulation:
```
Session 1: 500 token history
Session 10: 5,000 token history
Session 100: 50,000 token history
```
Add to that MEMORY.md (loaded at every main session), daily notes, SOUL.md, USER.md — all of that gets pumped into the context window.
With Claude Opus and its 200K token context window, that sounds fine. But:
1. Cost: 50,000 token input × 48 heartbeats daily = 2.4 million tokens — in one day, just for context
2. Quality: LLMs perform worse with very long contexts. Information at the beginning gets effectively "overlooked"
3. Conflicts: MEMORY.md from today can contradict MEMORY.md from three months ago — the agent doesn't know which version is current
In short: a bloated context window is not a sign of memory. It's a sign of chaos.
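The cost figure above is easy to verify. A quick sanity check of the token math, using the numbers from the text:

```python
# Daily input-token load from the cost example above:
# 50,000 tokens of context sent on every one of 48 daily heartbeats.
context_tokens = 50_000
heartbeats_per_day = 48

daily_input_tokens = context_tokens * heartbeats_per_day
print(daily_input_tokens)  # → 2400000, i.e. 2.4 million tokens per day
```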
---
The Solution: Three-Tier Memory
The cleanest solution comes from computer architecture, applied to agents. Just as an operating system separates RAM, cache, and disk, we separate agent memory into three tiers:
Tier 1: Core Memory (always in context)
Small, dense information available on every turn:
- The agent's identity and operating rules (SOUL.md)
- Key facts and preferences of the user (USER.md)
- Current priorities and active tasks (the condensed MEMORY.md)
This tier must never exceed ~2,000 tokens. It's the agent's "RAM."
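One way to keep that guarantee is a pre-flight check before the prompt is assembled. A minimal sketch, assuming a rough heuristic of ~1.3 tokens per word; a real setup would use the model's tokenizer:

```python
# Pre-flight budget check for Tier 1 core memory.
# Assumption: ~1.3 tokens per word is a coarse heuristic, not a real tokenizer.
CORE_MEMORY_MAX_TOKENS = 2000

def estimate_tokens(text: str) -> int:
    """Approximate token count from word count (heuristic: 1.3 tokens/word)."""
    return int(len(text.split()) * 1.3)

def fits_core_budget(text: str, budget: int = CORE_MEMORY_MAX_TOKENS) -> bool:
    """True if the text fits the always-in-context budget."""
    return estimate_tokens(text) <= budget

core_memory = "agent identity, user preferences, active tasks " * 20  # ~120 words
assert fits_core_budget(core_memory)
```

Run this whenever MEMORY.md changes; if the check fails, it's time for a cleanup pass.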
Tier 2: Recall Memory (searchable, not blindly injected)
All session history, daily notes, past cron results — but not automatically in context. Only when relevant:
```
# Instead of: loading all daily files into context
# Better: only on explicit need
memory_search("what did we decide about deployment last week?")
# → Returns only the relevant snippets, not everything
```
OpenClaw's `memory_search` tool does exactly this: semantic search over all memory files, returns top matches. That's ~500-1,000 tokens instead of 50,000.
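The retrieval contract can be sketched as follows. This is not OpenClaw's implementation (the real `memory_search` is semantic); this stand-in scores paragraphs by keyword overlap, but the shape is the same: query in, a handful of snippets out.

```python
# Stand-in for recall-memory search: score paragraphs by keyword overlap
# with the query and return only the top matches. The real memory_search
# uses semantic search; the input/output shape is what matters here.
def search_memory(query: str, documents: list[str], top_k: int = 5) -> list[str]:
    query_terms = set(query.lower().split())
    scored = []
    for doc in documents:
        for paragraph in doc.split("\n\n"):
            overlap = len(query_terms & set(paragraph.lower().split()))
            if overlap:
                scored.append((overlap, paragraph))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]

notes = ["deployment decided friday: ship behind a feature flag\n\nlunch notes: pizza"]
hits = search_memory("what did we decide about the deployment", notes)
# hits contains only the deployment paragraph, not the lunch notes
```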
Tier 3: Archival Memory (for deep search)
Older information that's rarely needed: decisions from 6 months ago, completed projects, expired cron configurations. The agent can access this when explicitly searching for it — but it's never auto-loaded.
In practice: anything older than 30 days gets moved from daily notes to a compressed archive file.
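That rotation can be sketched in a few lines, assuming daily notes named `YYYY-MM-DD.md`; the helper names are made up for illustration:

```python
import datetime

# Which daily notes are due for archiving? Notes are assumed to be named
# YYYY-MM-DD.md; anything older than max_age_days moves to the archive.
def is_stale(filename: str, today: datetime.date, max_age_days: int = 30) -> bool:
    note_date = datetime.date.fromisoformat(filename.removesuffix(".md"))
    return (today - note_date).days > max_age_days

def partition_notes(filenames: list[str], today: datetime.date,
                    max_age_days: int = 30) -> tuple[list[str], list[str]]:
    """Split note filenames into (keep, archive) lists."""
    keep, archive = [], []
    for name in sorted(filenames):
        (archive if is_stale(name, today, max_age_days) else keep).append(name)
    return keep, archive

today = datetime.date(2026, 3, 25)
keep, archive = partition_notes(["2026-02-01.md", "2026-03-24.md"], today)
# The February note lands in `archive`; yesterday's note stays in `keep`
```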
---
Concrete Implementation: What We Changed
Here's exactly what we did in our setup to fix context rot:
Step 1: Compress and Regularly Clean MEMORY.md
The biggest quick win: MEMORY.md had grown over months and contained stale information.
```bash
# How big is MEMORY.md right now?
wc -w ~/.openclaw/workspace/MEMORY.md
# → 8,432 words — way too much
# Goal: under 1,000 words
# Process: what's still current? What can go?
```
We introduced a monthly cron job:
```
Schedule: 0 9 1 * * (first day of month, 9 AM)
Prompt:
Run a MEMORY.md cleanup:
1. Read MEMORY.md completely
2. Check each entry: is it still relevant? Still current?
3. Remove entries that:
- Are older than 3 months and aren't permanent context
- Are about completed projects
- Have been superseded by newer entries
4. Condense related entries into compact summaries
5. Write a new, compressed MEMORY.md (target: under 800 words)
6. Archive the old version as memory/archive/YYYY-MM.md
```
After the first run: from 8,432 to 743 words. No relevant information lost — but 90% less token load.
Step 2: Load Daily Notes Selectively
The default behavior in AGENTS.md loads the last 2 days of notes:
```markdown
3. Read memory/YYYY-MM-DD.md (today + yesterday) for recent context
```
That's reasonable. But for agents with verbose daily entries (Sam: ~2,000 words per day), that quickly becomes roughly 4,000 words of context just for the last two days.
Our solution: more structured daily notes with clear sections, so the agent only loads the relevant parts:
```markdown
# memory/2026-03-25.md

## Active Tasks (ALWAYS READ)
...

## Decisions Today (ONLY WHEN RELEVANT)
...

## Detailed Logs (ONLY ON REQUEST)
[full details...]
```
The agent only loads the "Active Tasks" section automatically — the rest only when explicitly searching for it.
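A selective loader is only a few lines. This sketch assumes the sections use `## ` markdown headings; adapt the marker to your actual note format:

```python
# Selective section loading: return only one section of a daily note,
# from its "## " heading up to the next heading.
def load_section(note_text: str, section: str) -> str:
    out, collecting = [], False
    for line in note_text.splitlines():
        if line.startswith("## "):
            # Start collecting at the requested heading, stop at the next one
            collecting = line[3:].startswith(section)
            continue
        if collecting:
            out.append(line)
    return "\n".join(out).strip()

note = """# memory/2026-03-25.md

## Active Tasks (ALWAYS READ)
- ship context cleanup cron

## Decisions Today (ONLY WHEN RELEVANT)
- archive threshold stays at 30 days
"""
tasks = load_section(note, "Active Tasks")
# tasks holds only the Active Tasks bullet, not the decisions
```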
Step 3: Respect Session Boundaries
The underestimated problem: very long individual sessions accumulate more and more context. Every message becomes history that gets sent with the next turn.
For cron jobs: always use `sessionTarget: "isolated"`. Isolated sessions start without history overhead.
For main sessions (direct chats): explicitly start new sessions on large context switches:
```
# Instead of one eternal session
# When switching from "project planning" to "code review":
/restart # OpenClaw starts new session with fresh context
# (the important info is in MEMORY.md — that gets reloaded fresh)
```
This sounds counter-intuitive (losing context?), but the gain outweighs the cost: the new session reads MEMORY.md fresh and has full "attention focus."
Step 4: Configure ContextEngine (OpenClaw ≥ 2026.3.7)
With the ContextEngine update comes an explicit interface for these configurations. In `openclaw.json` or the agent-specific config:
```json
{
"contextEngine": {
"strategy": "tiered",
"coreMemoryMaxTokens": 2000,
"recallMemorySearchK": 5,
"archiveAfterDays": 30,
"compactSessionHistoryAfterTurns": 20,
"hooks": {
"onIngest": "scoped",
"onAssemble": "progressive",
"onCompact": "summarize"
}
}
}
```
What the hooks mean:
- `onIngest: "scoped"`: incoming information is routed to the matching memory tier instead of being appended to one flat history
- `onAssemble: "progressive"`: the prompt context is built from core memory outward, pulling in recall results only when needed
- `onCompact: "summarize"`: once a session passes `compactSessionHistoryAfterTurns`, older turns are condensed into a summary rather than kept verbatim
Important: The ContextEngine interface is available from version 2026.3.7-beta.1. Check your version with `openclaw --version`.
---
What Changed After the Switch
We rolled out these changes across our 6-agent setup two weeks ago. Measurable results:
Token consumption: -58% (from ~180,000 tokens daily to ~76,000 — at the same task load)
Response quality: Significantly more precise on questions about information more than 3 days old. Before: vague or wrong. After: correct and citing the memory files.
Hallucination rate: Dropped sharply. Mainly because the agent now rarely "guesses" — instead it actively searches memory files first.
Costs: -58% on input tokens means at our model mix (mainly Sonnet) a monthly saving of ~€85. Directly measurable in the Anthropic Console.
---
The "HyperStack" Community Solution
In the Reddit community r/ClaudeCode, another approach has been going viral: "HyperStack" — a community project specifically for OpenClaw.
Instead of dumping the full conversation history, HyperStack stores knowledge in structured "cards" — similar to index cards:
```json
{
"cards": [
{
"id": "arch-001",
"topic": "GitHub Workflow",
"content": "Never push directly to main. Branch from dev, PRs target dev.",
"lastUpdated": "2026-03-10",
"relevanceScore": 0.94
},
{
"id": "pref-003",
"topic": "Dimitrios Preferences",
"content": "Prefers short updates. No long explanations unless asked.",
"lastUpdated": "2026-02-28",
"relevanceScore": 0.87
}
]
}
```
When assembling context, only the top-N cards by relevance are included — hybrid search (semantic + keyword).
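Card selection can be sketched like this, reusing the field names from the JSON above. The 0.7/0.3 weighting and the keyword half of the score are illustrative assumptions, not HyperStack's actual algorithm:

```python
# Top-N card selection with an illustrative hybrid score: a precomputed
# relevanceScore blended with a keyword-match bonus. The weights and the
# scoring itself are assumptions for illustration only.
def top_cards(cards: list[dict], query: str, n: int = 3) -> list[dict]:
    query_terms = set(query.lower().split())

    def hybrid_score(card: dict) -> float:
        text = (card["topic"] + " " + card["content"]).lower()
        keyword_bonus = 1.0 if query_terms & set(text.split()) else 0.0
        return 0.7 * card["relevanceScore"] + 0.3 * keyword_bonus

    return sorted(cards, key=hybrid_score, reverse=True)[:n]

cards = [
    {"id": "arch-001", "topic": "GitHub Workflow",
     "content": "Never push directly to main.", "relevanceScore": 0.94},
    {"id": "pref-003", "topic": "Dimitrios Preferences",
     "content": "Prefers short updates.", "relevanceScore": 0.87},
]
best = top_cards(cards, "can we push to main", n=1)
# best holds the GitHub Workflow card: higher relevanceScore plus a
# keyword match on "push" / "main"
```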
This is a valid approach, but more involved to set up than the native OpenClaw route of memory_search plus a tiered file structure. For teams accumulating very large amounts of knowledge (>10,000 facts), HyperStack may be the better choice.
---
Quick Check: How Severe Is Your Context Rot?
Here's a simple test to understand how bad the problem is for you:
```bash
# 1. How large is your MEMORY.md?
wc -w ~/.openclaw/workspace/MEMORY.md
# > 2,000 words: clean it urgently
# 500-2,000 words: acceptable range
# < 500 words: good
# 2. How many daily note files exist?
ls memory/20*.md | wc -l
# > 60 files without archiving: too many
# Everything over 30 days should be archived
# 3. How large are the daily notes on average?
wc -w memory/2026-03-*.md | tail -1  # last line is the month's total
# Divide the total by the number of files for the per-day average
# > 1,000 words/day: too much detail in notes
# 300-600 words/day: reasonable
# 4. Test: ask the agent about something it did 2 weeks ago
# Does it give a precise answer with concrete details?
# → Yes: no acute problem
# → Vague or wrong: context rot is active
```
---
The Principle Behind Everything: Scoped Memory Injection
The core principle connecting all these measures:
Don't blindly inject everything into every prompt. Inject only what's relevant for the current turn.
This sounds simple. It is simple. But it requires actively overriding the default configuration — because "always load everything" is the safe default (you never lose anything), but not the good one.
The three-tier structure — core always, recall on-demand, archive on-demand — is the practical implementation of this principle.
If you don't have this in your setup yet: start with MEMORY.md. Reduce it to under 800 words. The difference will be immediately noticeable.
---
The complete setup — ContextEngine configuration, MEMORY.md cleanup as a cron job, the three-tier memory architecture for all 6 agents, and the exact prompts for monthly cleanup — is documented in the OpenClaw Setup Playbook.
18 chapters, based on real production experience.
Fully available in German too. 🇩🇪