All posts
2026-03-25 · 9 min

Fixing Context Rot in OpenClaw Agents: Keeping Your Agent's Memory Sharp

Memory · Context · ContextEngine · Performance · OpenClaw

What Context Rot Is and Why It Hits Everyone

A tweet from last night sums it up well:

> *"The reason it forgets tasks and hallucinates isn't actually a model intelligence issue, it's an architecture problem. By default, OpenClaw (like most agents right now) just appends everything into a giant context window or a flat vector database. Eventually, it hits 'context rot'"*

That's precise. And it happens slowly enough that most people only notice it after weeks.

The symptoms:

  • The agent gives increasingly vague answers to questions it used to answer precisely
  • It "forgets" preferences or conventions that are written in MEMORY.md
  • Tool calls are more frequently wrong or get skipped entirely
  • Response times increase (the context window gets heavier to process)
  • Hallucinations increase — the agent invents facts instead of admitting it doesn't know

The cause: when every session interaction gets appended to a growing history, the model eventually loses the thread. LLMs have an "attention budget." The longer the context, the worse the precision on older entries; this is the well-documented "Lost in the Middle" effect.

    OpenClaw's version 2026.3.7-beta.1 introduced a direct solution: the ContextEngine plugin interface.

    ---

    Why the Default Setup Leads to Context Rot

    In a fresh OpenClaw setup the architecture is simple: every message gets appended to the session history. That works fine for the first few weeks.

    The problem is accumulation:

```
Session 1:    500-token history
Session 10:   5,000-token history
Session 100:  50,000-token history
```

    Add to that MEMORY.md (loaded at every main session), daily notes, SOUL.md, USER.md — all of that gets pumped into the context window.

    With Claude Opus and its 200K token context window, that sounds fine. But:

    1. Cost: 50,000 token input × 48 heartbeats daily = 2.4 million tokens — in one day, just for context

    2. Quality: LLMs perform worse with very long contexts. Information at the beginning gets effectively "overlooked"

    3. Conflicts: MEMORY.md from today can contradict MEMORY.md from three months ago — the agent doesn't know which version is current
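The arithmetic in point 1 is worth making explicit, because it compounds silently. A two-line sketch using the numbers from the text (heartbeat frequency is an assumption about the example setup):

```python
# Cost arithmetic from point 1: a fixed per-call context size,
# multiplied across every heartbeat, adds up to millions of tokens.
context_tokens = 50_000    # tokens sent as input on every heartbeat
heartbeats_per_day = 48    # e.g. one heartbeat every 30 minutes
daily_input = context_tokens * heartbeats_per_day
print(daily_input)  # 2400000 input tokens per day, just for context
```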

    In short: a bloated context window is not a sign of memory. It's a sign of chaos.

    ---

    The Solution: Three-Tier Memory

    The cleanest solution comes from computer architecture, applied to agents. Just as an operating system separates RAM, cache, and disk, we separate agent memory into three tiers:

    Tier 1: Core Memory (always in context)

    Small, dense information available on every turn:

  • SOUL.md — personality and core rules (200-400 words)
  • USER.md — who the agent helps (under 500 words)
  • Current task / active projects (from HEARTBEAT.md, first 10 items only)

This tier must never exceed ~2,000 tokens. It's the agent's "RAM."
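A budget like this is easy to enforce mechanically. A minimal sketch: the ~4-characters-per-token ratio is a rough heuristic (not an exact tokenizer), and the file contents are illustrative:

```python
# Rough Tier 1 budget check: estimate tokens for the always-loaded
# core files and flag when they exceed the ~2,000-token budget.
CORE_BUDGET_TOKENS = 2000

def estimate_tokens(text: str) -> int:
    """Crude estimate: ~4 characters per token for English prose."""
    return len(text) // 4

def check_core_budget(core_files: dict[str, str]) -> tuple[int, bool]:
    """core_files maps filename -> content. Returns (total, within_budget)."""
    total = sum(estimate_tokens(content) for content in core_files.values())
    return total, total <= CORE_BUDGET_TOKENS

total, ok = check_core_budget({
    "SOUL.md": "Personality and core rules. " * 60,
    "USER.md": "Who the agent helps. " * 80,
})
print(total, ok)
```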

    Tier 2: Recall Memory (searchable, not blindly injected)

    All session history, daily notes, past cron results — but not automatically in context. Only when relevant:

```
# Instead of: loading all daily files into context
# Better: only on explicit need
memory_search("what did we decide about deployment last week?")
# → Returns only the relevant snippets, not everything
```

    OpenClaw's `memory_search` tool does exactly this: semantic search over all memory files, returns top matches. That's ~500-1,000 tokens instead of 50,000.
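The internals of `memory_search` aren't shown here, but the shape of the operation is simple to sketch. In this toy version, keyword overlap stands in for the semantic scoring a real implementation would do with embeddings:

```python
# Toy stand-in for a memory_search-style tool: score stored snippets
# against a query and return only the top-k matches, not the archive.
def score(query: str, snippet: str) -> float:
    """Fraction of query words that appear in the snippet (toy scorer)."""
    q = set(query.lower().split())
    s = set(snippet.lower().split())
    return len(q & s) / len(q) if q else 0.0

def memory_search(query: str, snippets: list[str], k: int = 3) -> list[str]:
    """Return the k best-matching snippets with any overlap at all."""
    ranked = sorted(snippets, key=lambda s: score(query, s), reverse=True)
    return [s for s in ranked[:k] if score(query, s) > 0]

notes = [
    "deployment: we decide releases go out via blue-green",
    "lunch order preferences updated",
    "deployment rollback procedure documented",
]
print(memory_search("what did we decide about deployment", notes, k=2))
```

The point is the return shape: a handful of relevant snippets instead of the full history.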

    Tier 3: Archival Memory (for deep search)

    Older information that's rarely needed: decisions from 6 months ago, completed projects, expired cron configurations. The agent can access this when explicitly searching for it — but it's never auto-loaded.

    In practice: anything older than 30 days gets moved from daily notes to a compressed archive file.
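The 30-day rule can be automated. A sketch under assumptions: the paths and the per-month `YYYY-MM.md` archive naming are illustrative, and the "compression" step (summarizing before archiving) is left out:

```python
# Sketch of the 30-day archival rule: move stale daily notes out of
# memory/ and append them to a per-month archive file.
import re
from datetime import date, timedelta
from pathlib import Path

def archive_old_notes(memory_dir: Path, max_age_days: int = 30) -> list[str]:
    """Append daily notes older than the cutoff to archive/YYYY-MM.md."""
    cutoff = date.today() - timedelta(days=max_age_days)
    archive_dir = memory_dir / "archive"
    archive_dir.mkdir(exist_ok=True)
    moved = []
    for note in sorted(memory_dir.glob("*.md")):
        m = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})\.md", note.name)
        if not m:
            continue  # skip MEMORY.md and other non-daily files
        note_date = date(int(m[1]), int(m[2]), int(m[3]))
        if note_date < cutoff:
            target = archive_dir / f"{m[1]}-{m[2]}.md"
            with target.open("a") as f:
                f.write(f"\n## {note.name}\n{note.read_text()}")
            note.unlink()
            moved.append(note.name)
    return moved
```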

    ---

    Concrete Implementation: What We Changed

    Here's exactly what we did in our setup to fix context rot:

    Step 1: Compress and Regularly Clean MEMORY.md

    The biggest quick win: MEMORY.md had grown over months and contained stale information.

```bash
# How big is MEMORY.md right now?
wc -w ~/.openclaw/workspace/MEMORY.md
# → 8,432 words — way too much

# Goal: under 1,000 words
# Process: what's still current? What can go?
```

    We introduced a monthly cron job:

```
Schedule: 0 9 1 * * (first day of month, 9 AM)

Prompt:
Run a MEMORY.md cleanup:
1. Read MEMORY.md completely
2. Check each entry: is it still relevant? Still current?
3. Remove entries that:
   - Are older than 3 months and aren't permanent context
   - Are about completed projects
   - Have been superseded by newer entries
4. Condense related entries into compact summaries
5. Write a new, compressed MEMORY.md (target: under 800 words)
6. Archive the old version as memory/archive/YYYY-MM.md
```

    After the first run: from 8,432 to 743 words. No relevant information lost — but 90% less token load.

    Step 2: Load Daily Notes Selectively

    The default behavior in AGENTS.md loads the last 2 days of notes:

```markdown
3. Read memory/YYYY-MM-DD.md (today + yesterday) for recent context
```

    That's reasonable. But for agents with many daily entries (Sam: ~2,000 words per day), that quickly becomes 4,000 tokens just for the last two days.

    Our solution: more structured daily notes with clear sections, so the agent only loads the relevant parts:

```markdown
# memory/2026-03-25.md

## Active Tasks (ALWAYS READ)
- PR #247 waiting for review
- Dimitrios needs monthly report by Friday

## Decisions Today (ONLY WHEN RELEVANT)
- sam/fix-auth-flow merged (14:23)
- ClickUp task TC-89 set to "In Review"

## Detailed Logs (ONLY ON REQUEST)
[full details...]
```

    The agent only loads the "Active Tasks" section automatically — the rest only when explicitly searching for it.
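Extracting one section from a structured note is a few lines of code. A sketch, assuming the sections are marked with `##` headings as in the example daily note:

```python
# Selective loading: pull only one "## Heading" section out of a daily
# note instead of injecting the whole file into context.
def load_section(note_text: str, heading: str) -> str:
    """Return the body of the first '## heading…' section, or '' if absent."""
    out, capturing = [], False
    for line in note_text.splitlines():
        if line.startswith("## "):
            capturing = line[3:].strip().startswith(heading)
            continue
        if capturing:
            out.append(line)
    return "\n".join(out).strip()

note = """# memory/2026-03-25.md

## Active Tasks (ALWAYS READ)
- PR #247 waiting for review

## Detailed Logs (ONLY ON REQUEST)
[full details...]
"""
print(load_section(note, "Active Tasks"))
```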

    Step 3: Respect Session Boundaries

    The underestimated problem: very long individual sessions accumulate more and more context. Every message becomes history that gets sent with the next turn.

    For cron jobs: always use `sessionTarget: "isolated"`. Isolated sessions start without history overhead.

    For main sessions (direct chats): explicitly start new sessions on large context switches:

```
# Instead of one eternal session
# When switching from "project planning" to "code review":
/restart   # OpenClaw starts a new session with fresh context
# (the important info is in MEMORY.md — that gets reloaded fresh)
```

    This sounds counter-intuitive (losing context?), but the gain outweighs the cost: the new session reads MEMORY.md fresh and has full "attention focus."

    Step 4: Configure ContextEngine (OpenClaw ≥ 2026.3.7)

    With the ContextEngine update comes an explicit interface for these configurations. In `openclaw.json` or the agent-specific config:

```json
{
  "contextEngine": {
    "strategy": "tiered",
    "coreMemoryMaxTokens": 2000,
    "recallMemorySearchK": 5,
    "archiveAfterDays": 30,
    "compactSessionHistoryAfterTurns": 20,
    "hooks": {
      "onIngest": "scoped",
      "onAssemble": "progressive",
      "onCompact": "summarize"
    }
  }
}
```

    What the hooks mean:

  • `onIngest: "scoped"`: New information is written only to relevant memory tiers, not blindly to everything
  • `onAssemble: "progressive"`: Context is assembled progressively — core memory first, then recall only if needed
  • `onCompact: "summarize"`: When session history gets too long, it's summarized rather than truncated

Important: the ContextEngine interface is available from version 2026.3.7-beta.1. Check your version with `openclaw --version`.
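To make the `progressive` idea concrete, here is a sketch of what such an assembly step amounts to: core memory goes in unconditionally, recall snippets only while a token budget remains. The budget numbers and the 4-chars-per-token heuristic are illustrative, not ContextEngine internals:

```python
# Progressive context assembly: core memory first and always, then
# recall snippets (assumed pre-sorted by relevance) until the budget
# is exhausted.
def estimate_tokens(text: str) -> int:
    return len(text) // 4  # rough heuristic: ~4 characters per token

def assemble_context(core: list[str], recall: list[str], budget: int) -> list[str]:
    """Return core unconditionally, plus as much recall as fits."""
    context = list(core)
    used = sum(estimate_tokens(t) for t in core)
    for snippet in recall:
        cost = estimate_tokens(snippet)
        if used + cost > budget:
            break  # stop adding recall once the budget is spent
        context.append(snippet)
        used += cost
    return context

ctx = assemble_context(
    core=["SOUL.md rules", "USER.md profile"],
    recall=["relevant decision from last week", "older, less relevant log"],
    budget=15,
)
print(ctx)
```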

    ---

    What Changed After the Switch

    We rolled out these changes across our 6-agent setup two weeks ago. Measurable results:

    Token consumption: -58% (from ~180,000 tokens daily to ~76,000 — at the same task load)

    Response quality: Significantly more precise on questions about information more than 3 days old. Before: vague or wrong. After: correct and citing the memory files.

    Hallucination rate: Dropped sharply. Mainly because the agent now rarely "guesses" — instead it actively searches memory files first.

Costs: -58% on input tokens translates, with our model mix (mainly Sonnet), to a monthly saving of ~€85. Directly measurable in the Anthropic Console.

    ---

    The "HyperStack" Community Solution

    In the Reddit community r/ClaudeCode, another approach has been going viral: "HyperStack" — a community project specifically for OpenClaw.

    Instead of dumping the full conversation history, HyperStack stores knowledge in structured "cards" — similar to index cards:

```json
{
  "cards": [
    {
      "id": "arch-001",
      "topic": "GitHub Workflow",
      "content": "Never push directly to main. Branch from dev, PRs target dev.",
      "lastUpdated": "2026-03-10",
      "relevanceScore": 0.94
    },
    {
      "id": "pref-003",
      "topic": "Dimitrios Preferences",
      "content": "Prefers short updates. No long explanations unless asked.",
      "lastUpdated": "2026-02-28",
      "relevanceScore": 0.87
    }
  ]
}
```

    When assembling context, only the top-N cards by relevance are included — hybrid search (semantic + keyword).
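A toy version of that selection step, with loud caveats: the 50/50 weighting and the keyword scorer are illustrative assumptions, and a real hybrid search would use embeddings for the semantic half:

```python
# HyperStack-style card selection sketch: combine a keyword-match score
# with the stored relevanceScore, keep only the top-N cards.
def keyword_score(query: str, card: dict) -> float:
    q = set(query.lower().split())
    text = set((card["topic"] + " " + card["content"]).lower().split())
    return len(q & text) / len(q) if q else 0.0

def select_cards(query: str, cards: list[dict], n: int = 2) -> list[dict]:
    """Rank by 0.5 * keyword match + 0.5 * stored relevanceScore."""
    def hybrid(card: dict) -> float:
        return 0.5 * keyword_score(query, card) + 0.5 * card["relevanceScore"]
    return sorted(cards, key=hybrid, reverse=True)[:n]

cards = [
    {"id": "arch-001", "topic": "GitHub Workflow",
     "content": "Never push directly to main.", "relevanceScore": 0.94},
    {"id": "pref-003", "topic": "Dimitrios Preferences",
     "content": "Prefers short updates.", "relevanceScore": 0.87},
]
print(select_cards("how should i push to github", cards, n=1)[0]["id"])
```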

This is a valid approach, but more involved to set up than the native OpenClaw route of `memory_search` plus a tiered structure. For teams accumulating very large amounts of knowledge (>10,000 facts), HyperStack may be the better choice.

    ---

    Quick Check: How Severe Is Your Context Rot?

    Here's a simple test to understand how bad the problem is for you:

```bash
# 1. How large is your MEMORY.md?
wc -w ~/.openclaw/workspace/MEMORY.md
# > 2,000 words: clean it urgently
# 500-2,000 words: acceptable range
# < 500 words: good

# 2. How many daily note files exist?
ls memory/20*.md | wc -l
# > 60 files without archiving: too many
# Everything over 30 days should be archived

# 3. How large are the daily notes on average?
wc -w memory/2026-03-*.md | tail -1
# > 1,000 words/day: too much detail in notes
# 300-600 words/day: reasonable

# 4. Test: ask the agent about something it did 2 weeks ago
# Does it give a precise answer with concrete details?
# → Yes: no acute problem
# → Vague or wrong: context rot is active
```

    ---

    The Principle Behind Everything: Scoped Memory Injection

    The core principle connecting all these measures:

    Don't blindly inject everything into every prompt. Inject only what's relevant for the current turn.

    This sounds simple. It is simple. But it requires actively overriding the default configuration — because "always load everything" is the safe default (you never lose anything), but not the good one.

    The three-tier structure — core always, recall on-demand, archive on-demand — is the practical implementation of this principle.

    If you don't have this in your setup yet: start with MEMORY.md. Reduce it to under 800 words. The difference will be immediately noticeable.

    ---

    The complete setup — ContextEngine configuration, MEMORY.md cleanup as a cron job, the three-tier memory architecture for all 6 agents, and the exact prompts for monthly cleanup — is documented in the OpenClaw Setup Playbook.

    18 chapters, based on real production experience.

    Fully available in German too. 🇩🇪

    Want to learn more?

    Our playbook contains 18 detailed chapters — available in English and German.

    Get the Playbook