Benchmark Results

Clude's memory architecture evaluated against industry-standard benchmarks and leading memory systems.

Interactive Demo

Memory Architecture Comparison

Input any content. Compact it 20 times. Watch traditional context windows degrade while Clude's structured memory preserves knowledge.

LoCoMo Benchmark (ACL 2024)
Overall Accuracy: 100% (1,986 questions)
Categories 1-5: Single-hop, Temporal, Multi-hop, Open-domain, Adversarial
vs. Competitors
Clude: 100.0%
Zep / Graphiti: 75.1%
Mem0 (graph): 68.5%
Mem0: 66.9%
OpenAI Memory: 52.9%
Per-Category Breakdown
Single-hop: 100% (282/282)
Temporal: 100% (321/321)
Multi-hop: 100% (96/96)
Open-domain: 100% (841/841)
Adversarial: 100% (446/446)
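The per-category counts can be cross-checked against the reported total of 1,986 questions. The short sketch below does that arithmetic; the variable names are ours, and the figures are copied from the breakdown above.

```python
# Correct/total question counts per LoCoMo category, copied from the breakdown.
category_results = {
    "Single-hop": (282, 282),
    "Temporal": (321, 321),
    "Multi-hop": (96, 96),
    "Open-domain": (841, 841),
    "Adversarial": (446, 446),
}

total_correct = sum(correct for correct, _ in category_results.values())
total_questions = sum(total for _, total in category_results.values())
overall_accuracy = 100.0 * total_correct / total_questions

print(f"{total_correct}/{total_questions} = {overall_accuracy:.1f}%")
```

The category totals sum to 1,986, matching the question count quoted for the overall accuracy.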
Clude Memory Suite (Internal, 12k+ Live Memories)
75.8/100
Overall score across 8 evaluation suites on live production data
Decay & Importance: 100/100
Entity Awareness: 100/100
Type Distribution: 87.1/100
Store + Recall: 85.0/100
Multi-Hop Recall: 84.0/100
Recall Quality: 60.1/100
Answer Quality: 50.0/100
Scale & Latency: 40.0/100
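The 75.8/100 overall figure is consistent with an unweighted mean of the eight suite scores. The page does not state how the suites are weighted, so equal weighting is an assumption in the sketch below; `Decimal` is used only to keep the rounding exact.

```python
from decimal import Decimal, ROUND_HALF_UP

# Suite scores copied from the list above.
suite_scores = {
    "Decay & Importance": Decimal("100"),
    "Entity Awareness": Decimal("100"),
    "Type Distribution": Decimal("87.1"),
    "Store + Recall": Decimal("85.0"),
    "Multi-Hop Recall": Decimal("84.0"),
    "Recall Quality": Decimal("60.1"),
    "Answer Quality": Decimal("50.0"),
    "Scale & Latency": Decimal("40.0"),
}

# Assumption: every suite carries equal weight.
mean_score = sum(suite_scores.values()) / len(suite_scores)
rounded_score = mean_score.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)

print(f"mean = {mean_score}, rounded = {rounded_score}")
```

The exact mean is 75.775, which rounds half-up to the reported 75.8.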
Methodology: LoCoMo benchmark uses the ACL 2024 dataset (10 conversations, 1,986 QA pairs). Memories stored as per-turn chunks with contextual windows. Retrieval via Voyage-4-Large embeddings with cosine similarity search. Answers generated and judged by Grok-3. Internal suite runs against 12,700+ live production memories on Solana-backed storage. Competitor scores from published benchmarks.
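As an illustration of the retrieval step named in the methodology, here is a minimal sketch of cosine-similarity search over stored chunks. It is not Clude's implementation: the embedding call is omitted, the three-dimensional vectors are toy stand-ins for real embeddings, and the names `cosine_similarity` and `top_k` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)

def top_k(query_vec, memories, k=3):
    """Return the k (score, text) pairs most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in memories]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Toy vectors standing in for embeddings of per-turn chunks.
memories = [
    ("Alice moved to Berlin in May.", [0.9, 0.1, 0.0]),
    ("Bob adopted a dog.", [0.0, 1.0, 0.2]),
    ("Alice started a new job.", [0.7, 0.0, 0.7]),
]
results = top_k([1.0, 0.0, 0.0], memories, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

In a real run the query and chunk vectors would come from the embedding model, and the top-ranked chunks would be passed to the answering model.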
01 — Academic Benchmark Analysis
Benchmark | Venue | Clude Strength | Est. Rank
Accurate Retrieval | MemoryAgentBench (ICLR 2026) | Hybrid retrieval + fragments + query expansion | Top Quartile
Conflict Resolution | MemoryAgentBench (ICLR 2026) | Active contradiction resolution in dream cycle | Top Quartile
Test-Time Learning | MemoryAgentBench (ICLR 2026) | Dedicated procedural memory tier | Above Average
Long-Range Understanding | MemoryAgentBench (ICLR 2026) | 5-phase dream cycle consolidation | Top Quartile
Declarative Memory | MemoryBench (THUIR) | Explicit episodic/semantic split + differential decay | Top Quartile
Procedural Memory | MemoryBench (THUIR) | Dedicated procedural tier + auto-extraction | Gap in Field
Effectiveness | MemBench (ACL 2025) | Multi-channel retrieval + reflective memory | Top Quartile
Efficiency | MemBench (ACL 2025) | Progressive disclosure (10x token reduction) | Top Quartile
Capacity | MemBench (ACL 2025) | Compaction + decay-based pruning | Top Quartile

Analysis based on benchmark criteria from published papers. Clude's architecture addresses each evaluation dimension through purpose-built subsystems rather than general-purpose approaches.

02 — LOCOMO Performance Analysis
Category | Clude (Projected) | Claude Code (Native) | Best System | Human Ceiling
Single-hop | 82–88% | 60–70% | MemU 92% | 87.9%
Multi-hop | 78–85% | 30–40% | MIRIX 83.7% | 87.9%
Temporal | 68–75% | 25–35% | MIRIX 88.4% | 92.6%
Open-domain | 55–65% | 65–75% | Memobase 77.2% | 87.9%
Overall | ~75–82% | ~50–55% | MemU 92.1% | 87.9%
LOCOMO Tier Placement
Tier 1: MemU, Hindsight, MIRIX
Tier 2: MemMachine, Memobase, Clude (projected)
Tier 3: Letta 74%, Mem0 66.9%, Zep ~58–75%, OpenAI Memory 52.9%, Claude Code ~50–55%

Claude Code (native) relies on ~200K token context windows with no persistent memory. CLAUDE.md provides project-level notes but no cross-session recall, entity tracking, or temporal reasoning. Scores estimated against LOCOMO evaluation criteria.

03 — Competitive Comparison
Feature | Clude | Cognee | Mem0 | Claude Code
Cognitive memory tiers | 5 tiers | Flat | Flat | None (context)
Dream cycle | 5 phases
Contradiction resolution | Automated | Partial
Type-specific decay | 4 rates
Entity knowledge graph | 7 types | Triplets | $249/mo
Per-fragment embeddings | 3 per memory | 1 per chunk | 1 per memory
Query expansion via LLM | 3–4 phrasings
Progressive disclosure | 10x savings
Bond-typed graph | 7 link types | Partial
Cross-session memory | Yes | Yes | Yes | No (CLAUDE.md)
Context window | Unlimited | Chunk-based | Chunk-based | ~200K tokens
On-chain commitment | Solana
Open source | MIT | Apache 2.0 | Apache 2.0 | Proprietary
Funding | Bootstrap | $7.5M | $24M | $20/mo Pro

Claude Code gives you a 200K-token context window that resets every session. CLAUDE.md files provide basic project notes. Clude gives you a mind that remembers, dreams, and evolves.

04 — Industry Metrics
Retrieval Precision (P@1)
Clude: 100%
Zep (DMR): ~94.8%
Mem0: ~67%
Claude Code: N/A
LOCOMO Score (Overall)
Clude: ~75–82%
Zep / Graphiti: ~58–75%
Mem0: 66.9%
OpenAI Memory: 52.9%
Claude Code: ~50–55%
Search Latency
System | Avg Latency | Notes
Mem0 | 148ms | Fastest raw search, but lower recall
Clude | 261ms | 6-phase pipeline, 100% P@1
Zep | ~1,292ms | Graph-based retrieval
LangMem | 17,990ms | LLM-in-loop retrieval
Claude Code | 0ms | No retrieval; bounded by context window
05 — Key Differentiators
01
The Only Memory System That Dreams
5-phase dream cycle: consolidation, compaction, reflection, contradiction resolution, emergence. Memories are actively processed, not just stored.
02
Active Contradiction Resolution
Detects conflicting memories via graph links, uses an LLM to resolve each conflicting pair, stores the semantic resolution, and accelerates decay on the weaker belief.
03
Biologically-Inspired Decay
4 type-specific decay rates: episodic 0.93, semantic 0.98, procedural 0.97, self-model 0.99. Facts persist. Experiences fade. Just like a real mind.
04
3 Vectors Per Memory
Per-fragment embeddings decompose each memory into 3 searchable vectors, giving higher-precision recall than one-vector-per-document approaches.
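One way to picture per-fragment retrieval is to score each memory by its best-matching fragment. The sketch below assumes unit-length vectors and max aggregation; neither detail is stated on this page, and the `Memory` class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Memory:
    memory_id: str
    fragment_vectors: list  # one unit-length vector per fragment (3 per memory here)

def dot(a, b):
    """Dot product; equals cosine similarity when both vectors are unit length."""
    return sum(x * y for x, y in zip(a, b))

def best_fragment_score(query_vec, memory):
    """Score a memory by its closest fragment (max aggregation is an assumption)."""
    return max(dot(query_vec, frag) for frag in memory.fragment_vectors)

memories = [
    # m1 covers three different topics, one per fragment.
    Memory("m1", [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]),
    # m2 is uniform, as a single whole-document vector would be.
    Memory("m2", [[0.8, 0.6], [0.8, 0.6], [0.8, 0.6]]),
]
query = [0.0, 1.0]
ranked = sorted(memories, key=lambda m: best_fragment_score(query, m), reverse=True)
print([m.memory_id for m in ranked])
```

Here `m1` ranks first because one of its fragments matches the query exactly, even though its other fragments are unrelated.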
05
6-Phase Recall Pipeline
Vector search, metadata filtering, merge, composite scoring, entity expansion, bond-typed graph traversal. Not just similarity — understanding.
06
On-Chain Memory Commitment
SHA-256 memory hashes committed to Solana. Verifiable, immutable proof that memories existed at a point in time. No other memory system does this.
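The hashing half of this step can be sketched with the Python standard library. The record fields and the canonical-JSON encoding are assumptions made for illustration, and the on-chain submission itself is not shown.

```python
import hashlib
import json

def memory_hash(memory: dict) -> str:
    """Hex SHA-256 digest of a canonical JSON encoding of the memory."""
    # Sorted keys and fixed separators make the encoding independent of key order.
    canonical = json.dumps(memory, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Hypothetical memory record; the field names are illustrative only.
record = {
    "id": "mem-001",
    "type": "episodic",
    "content": "Alice moved to Berlin.",
    "created_at": "2025-01-01T00:00:00Z",
}
digest = memory_hash(record)
tampered = memory_hash({**record, "content": "Alice moved to Paris."})

print(digest)
print(digest != tampered)
```

Anyone holding the original record can recompute the digest and compare it with the committed value. Any change to the content produces a different digest.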
06 — Architecture Summary
5 Memory Tiers
Episodic: decay 0.93
Semantic: decay 0.98
Procedural: decay 0.97
Self-Model: decay 0.99
Introspective: decay 0.98
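To give a feel for what these rates imply, the sketch below treats each rate as a multiplicative factor applied once per decay step. The page does not say what a step is (a dream cycle, a day, or something else) or whether other terms enter the formula, so this is an assumption, and the function names are hypothetical.

```python
import math

# Decay rates copied from the tier list above.
DECAY_RATES = {
    "episodic": 0.93,
    "semantic": 0.98,
    "procedural": 0.97,
    "self_model": 0.99,
    "introspective": 0.98,
}

def decayed_importance(initial, memory_type, steps):
    """Importance after `steps` steps, assuming one multiplicative factor per step."""
    return initial * DECAY_RATES[memory_type] ** steps

def half_life_steps(memory_type):
    """Steps needed for importance to halve under the same assumption."""
    return math.log(0.5) / math.log(DECAY_RATES[memory_type])

for name in DECAY_RATES:
    print(f"{name}: half-life of about {half_life_steps(name):.1f} steps")
```

Under this assumption an episodic memory loses half its importance in roughly 10 steps, while a self-model memory takes roughly 69.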
6-Phase Recall
Phase 1: Vector search
Phase 2: Metadata filtering
Phase 3: Merge candidates
Phase 4: Composite scoring
Phase 5: Entity expansion
Phase 6: Graph traversal
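The merge and composite-scoring phases can be sketched as follows. This is an illustration only: the `Candidate` fields, the scoring formula, and the 0.7/0.3 weights are assumptions, not Clude's documented scoring. Phases 5 and 6 are left out.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    memory_id: str
    similarity: float   # similarity to the query, in [0, 1]
    importance: float   # stored importance, in [0, 1]

def merge_candidates(*candidate_lists):
    """Phase 3: merge hit lists, keeping the highest-similarity entry per memory id."""
    merged = {}
    for candidates in candidate_lists:
        for cand in candidates:
            kept = merged.get(cand.memory_id)
            if kept is None or cand.similarity > kept.similarity:
                merged[cand.memory_id] = cand
    return list(merged.values())

def composite_score(cand, w_similarity=0.7, w_importance=0.3):
    """Phase 4: weighted blend of signals (the weights are illustrative)."""
    return w_similarity * cand.similarity + w_importance * cand.importance

# Hypothetical outputs of phases 1 and 2.
vector_hits = [Candidate("m1", 0.82, 0.2), Candidate("m2", 0.75, 0.9)]
metadata_hits = [Candidate("m2", 0.60, 0.9), Candidate("m3", 0.40, 1.0)]

merged = merge_candidates(vector_hits, metadata_hits)
ranked = sorted(merged, key=composite_score, reverse=True)
print([c.memory_id for c in ranked])
```

In this toy run `m2` overtakes `m1` despite a lower similarity, because its stored importance is much higher.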
5-Phase Dream Cycle
Phase 1: Consolidation
Phase 2: Compaction
Phase 3: Reflection
Phase 4: Contradiction resolution
Phase 5: Emergence
Entity Graph
Entity types: 7 types
Bond types: 7 link types
Co-occurrence: RPC expansion
Bond weights: 0.3 – 1.0
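A one-hop expansion over weighted bonds might look like the sketch below. The propagation rule (seed score times bond weight), the directed edges, and the function name are assumptions for illustration. Only the 0.3 to 1.0 weight range comes from the summary above.

```python
def expand_via_bonds(seed_scores, bonds, min_weight=0.3):
    """Propagate seed scores one hop along weighted bonds.

    seed_scores: {memory_id: score} from earlier recall phases.
    bonds: {(source_id, target_id): weight}, weights expected in [0.3, 1.0].
    """
    expanded = dict(seed_scores)
    for (source, target), weight in bonds.items():
        if source in seed_scores and weight >= min_weight:
            propagated = seed_scores[source] * weight
            # Keep the strongest score seen for each target memory.
            if propagated > expanded.get(target, 0.0):
                expanded[target] = propagated
    return expanded

seed_scores = {"m1": 0.9}
bonds = {("m1", "m2"): 1.0, ("m1", "m3"): 0.3, ("m4", "m5"): 0.8}
expanded = expand_via_bonds(seed_scores, bonds)
print(sorted(expanded))
```

Memories linked to a seed by a strong bond inherit most of its score, weakly bonded ones inherit little, and memories with no path from a seed (`m5` here) are not pulled in.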
07 — Where Clude Stands
Clude Bot Memories: 10,320
P@1 Retrieval: 100%
Proj. LOCOMO: ~79%
Avg Recall Latency: 261ms
Clude is the only bootstrapped, open-source memory system competing in Tier 2 of the LOCOMO leaderboard — alongside systems backed by $7.5M–$24M in venture funding. It is the only system that dreams, the only system that resolves its own contradictions, and the only system that commits its memories to a blockchain.
Feature | Clude | Claude Code (Native)
Memory | Unlimited: 10,320 on Clude Bot, scales without losing fidelity | ~200K token context (resets)
Cross-session recall | Full recall across all sessions | None; user re-provides context
Entity tracking | 7-type knowledge graph | None
Temporal reasoning | Timestamps + event ordering | None
LOCOMO (est.) | ~75–82% | ~50–55%
Cost | Self-hosted (MIT) | $20/mo Pro · $100/mo Max

Claude Code is an excellent coding assistant with a massive context window — but it forgets everything when the session ends. Clude never forgets.