v0.2.3 · live

mneme

Claude never starts cold.
Claude never loses its place.

The persistent memory layer for AI coding. Survives context compaction. Indexes your code. Injects the right 1–3K tokens every turn. 100% local.


47 MCP tools · 18 AI platforms · 0 network calls

~/projects/my-app · mneme

$ mneme recall "auth flow"
→ 3 hits from graph (resumption-ready)
auth/login.ts — email+password, OAuth, JWT rotation 0.94
auth/middleware.ts — bearer check, role gate 0.88
docs/AUTH.md — threat model, 14 step ledger entries 0.81
  injected 2.3K tokens · 0 ms network · 1.8 ms SQLite
$

1000× faster incremental reindex vs CRG (code-review-graph)

47 MCP tools · 14 views · 27 languages

0 network calls · 100% local · measured v0.2.3

The three flaws every AI assistant has

Mneme fixes all three at the architecture level — not the prompt level.

Starts cold every time

Claude re-reads the same files, asks the same questions, rediscovers your codebase on every task. Mneme's graph means it already knows.

Forgets mid-task

Context compaction wipes the conversation. Without mneme, Claude restarts from a guess. With mneme, it resumes at the exact verified step.

Drifts from your rules

CLAUDE.md says "no hardcoded colors" — five prompts later it hardcodes one. Mneme's drift detector enforces every rule in real time.
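
To make a rule like "no hardcoded colors" concrete, here is a minimal Python sketch of a check for it. This is not mneme's drift scanner: a regex stands in for a real parser-based query, and the `find_hardcoded_colors` name and the sample snippet are assumptions made up for the example.

```python
import re

# Hypothetical rule: flag hex color literals such as #fff or #1a2b3c.
# A regex stands in here for the parser query a real scanner would use.
HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{8}|[0-9a-fA-F]{6}|[0-9a-fA-F]{3,4})\b")

def find_hardcoded_colors(source: str) -> list[tuple[int, str]]:
    """Return (line_number, literal) pairs for every hex color in source."""
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for match in HEX_COLOR.finditer(line):
            findings.append((line_no, match.group(0)))
    return findings

# Made-up input: one compliant line, two violations.
sample = """const ok = theme.colors.primary;
const bad = "#ff0044";
const alsoBad = '#FFF';
"""

violations = find_hardcoded_colors(sample)
for line_no, literal in violations:
    print(f"line {line_no}: hardcoded color {literal}")
```

A check of this shape can run on every edit, so a violation surfaces when it is written rather than five prompts later.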

The killer feature — compaction resilience

You give Claude a 100-step task. Context compacts at step 50.
Without mneme: Claude restarts from step 30 or re-reads every doc.
With mneme: Claude resumes at step 51. Verified. No re-reading.

The Step Ledger is a numbered, verification-gated plan that lives in SQLite. Every step records its acceptance check. When compaction wipes Claude's working memory, the next turn auto-injects a ~5K-token resumption bundle with the verbatim goal, completed steps, current position, and remaining work. No other MCP does this.
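
The idea can be sketched in a few lines of Python with the standard `sqlite3` module. Treat this as an illustration under stated assumptions, not mneme's code: the `ledger` table, its columns, and the function names are hypothetical, and the bundle is simplified to position data only.

```python
import sqlite3

# Hypothetical schema; mneme's real tables and columns are not shown on this page.
SCHEMA = """
CREATE TABLE IF NOT EXISTS ledger (
    step_no    INTEGER PRIMARY KEY,
    goal       TEXT NOT NULL,
    acceptance TEXT NOT NULL,
    verified   INTEGER NOT NULL DEFAULT 0
);
"""

def create_plan(db, steps):
    """Store a numbered plan; steps is a list of (goal, acceptance_check)."""
    db.executescript(SCHEMA)
    db.executemany(
        "INSERT INTO ledger (step_no, goal, acceptance) VALUES (?, ?, ?)",
        [(i, goal, check) for i, (goal, check) in enumerate(steps, start=1)],
    )
    db.commit()

def verify_step(db, step_no, check_passed):
    """Mark a step done only if its check passed and all earlier steps are verified."""
    pending_before = db.execute(
        "SELECT COUNT(*) FROM ledger WHERE step_no < ? AND verified = 0", (step_no,)
    ).fetchone()[0]
    if not check_passed or pending_before:
        return False
    db.execute("UPDATE ledger SET verified = 1 WHERE step_no = ?", (step_no,))
    db.commit()
    return True

def resumption_bundle(db):
    """Rebuild the current position from the database alone."""
    rows = db.execute(
        "SELECT step_no, goal, verified FROM ledger ORDER BY step_no"
    ).fetchall()
    done = [n for n, _, v in rows if v]
    remaining = [(n, g) for n, g, v in rows if not v]
    return {
        "completed": done,
        "resume_at": remaining[0][0] if remaining else None,
        "remaining": remaining,
    }

db = sqlite3.connect(":memory:")
create_plan(db, [(f"step {i}", f"check {i}") for i in range(1, 101)])
for n in range(1, 51):
    verify_step(db, n, check_passed=True)

# Simulate compaction: nothing is kept in memory, only the database survives.
bundle = resumption_bundle(db)
print(bundle["resume_at"])  # 51
```

Because the position is derived from verified rows rather than from conversation history, losing the conversation does not lose the place. The gate in `verify_step` also refuses to mark a step done while an earlier one is still unverified.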

Benchmarks vs code-review-graph

Measured against the current state-of-the-art code-graph MCP.

Metric	CRG (SoTA)	mneme	Delta / status
Token reduction — code review	6.8×	1.338× mean / 3.542× p95	measured v0.2.3
First build (cold)	10 s (500 files)	4.97 s (359 files)	measured v0.2.3
Incremental update	<2 s	p95 = 0 ms, max = 2 ms	measured v0.2.3
Visualization ceiling	~5 000 nodes	100 000+ (design, not yet benchmarked)	design target
MCP tools	24	47	+23
Visualization views	1 (D3)	14 (WebGL)	14×
Platforms supported	10	18	+8
Compaction survival	❌	✅	category-defining
Multimodal (PDF / audio / video)	❌	✅	—

With tree-sitter alone, vs. with mneme

Tree-sitter is a parser library, not a code-graph or MCP server. This table compares what raw tree-sitter gives you out of the box with what mneme adds by wiring tree-sitter into a persistent SQLite graph with 47 MCP tools.

Surface	Raw tree-sitter	mneme (tree-sitter + SQLite + MCP)	Notes
Languages available	150+ community grammars (any you build yourself)	27 Language enum variants, 25 grammar crates bundled by default (TS, TSX, JS, JSX, Python, Rust, Go, Java, C, C++, C#, Ruby, PHP, Bash, JSON, Lua, TOML, YAML, Markdown, Swift, Kotlin, Scala, Svelte, Solidity, Julia, Zig, Haskell; Vue variant reserved)	tree-sitter has more raw grammars; mneme bundles the ones with working query patterns
Code-graph queries shipped	- (you write .scm files by hand)	7 canonical query kinds per language: Functions, Classes, Calls, Imports, Decorators, Comments, Errors (parsers/src/query_cache.rs)	mneme compiles the queries once, caches in a DashMap, reuses across every file
MCP tools	-	47	tree-sitter has no MCP surface; mneme wires every query through a typed MCP tool
Incremental parsing	Yes (edit-tree API, but no caching layer)	LRU-cached previous trees per file + tokio worker pool (cpu_count × 4)	mneme amortises the edit-tree cost; raw tree-sitter leaves it to the caller
Persistent store	-	22 sharded SQLite DBs (graph.db, history.db, corpus.db, decisions.db, ...) + global meta.db	tree-sitter is stateless; mneme persists every node and edge
Incremental update latency	-	p50 = 0 ms, p95 = 0 ms, max = 2 ms on 100 samples (single-file inject)	measured v0.2.3 on the mneme repo itself; raw tree-sitter has no persist step to measure
Semantic embeddings	-	pure-Rust hashing-trick default, opt-in bge-small from local path	100% local; no tree-sitter equivalent
Scanners built on top	-	11 (theme, types, security, a11y, perf, drift, ipc, md-drift, secrets, refactor, architecture)	each scanner is a tree-sitter query + SQLite write
Compaction-survival Step Ledger	-	numbered, verification-gated, SQLite-persisted, resumption bundle ~5K tokens	architectural, not a parser concern
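
The "hashing-trick" embedding in the table is a general technique, and a small Python sketch shows why it needs no model download or network. Everything specific here is an assumption for illustration: the 256-dimension size, the tokenizer regex, and the choice of BLAKE2b are not taken from mneme's source.

```python
import hashlib
import math
import re

DIM = 256  # assumed vector size; the page does not state mneme's dimension

def embed(text: str, dim: int = DIM) -> list[float]:
    """Hashing-trick embedding: each token adds +1 or -1 to one hashed bucket."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9_]+", text.lower()):
        digest = hashlib.blake2b(token.encode(), digest_size=8).digest()
        bucket = int.from_bytes(digest[:4], "little") % dim
        sign = 1.0 if digest[4] & 1 else -1.0
        vec[bucket] += sign
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec

def cosine(a: list[float], b: list[float]) -> float:
    """Dot product of two unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))

query = embed("auth login flow")
same = embed("auth login flow")
related = embed("login flow with oauth and jwt")
unrelated = embed("render chart axis labels")

print(round(cosine(query, same), 3))
print(cosine(query, related), cosine(query, unrelated))
```

Identical text always scores 1.0. Texts that share tokens will typically score higher than unrelated ones, though hash collisions make that a tendency rather than a guarantee, which is the usual trade-off of this approach against a learned model.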

How to reproduce

Every mneme number above is from benchmarks/ in this repo or directly measurable in source. To get your own numbers:

## 1. Clone and build
git clone https://github.com/omanishay-cyber/mneme && cd mneme
cargo build --release -p benchmarks --bin bench_retrieval

## 2. Index the repo you want to measure
cargo run --release -p cli --bin mneme -- build /path/to/repo

## 3. Run the full benchmark suite on that repo
./target/release/bench_retrieval.exe bench-all /path/to/repo \
    > benchmarks/results/$(date -I).csv \
    2> benchmarks/results/$(date -I).stderr

## 4. Individual benches (cold, warm, incremental, viz-scale, recall, compare)
./target/release/bench_retrieval.exe bench-first-build  /path/to/repo --format json
./target/release/bench_retrieval.exe bench-incremental  /path/to/repo --format json
./target/release/bench_retrieval.exe bench-recall       /path/to/repo benchmarks/fixtures/golden.json --format json
./target/release/bench_retrieval.exe bench-token-reduction /path/to/repo --format json

## 5. CRG head-to-head (requires Python 3.10+)
cd benchmarks/crg-compare
python -m venv .venv
.venv/Scripts/python.exe -m pip install code-review-graph
.venv/Scripts/python.exe run_crg_bench.py

Raw results from mneme's own run: benchmarks/results/2026-04-23.csv + benchmarks/results/crg-2026-04-23.json. Full methodology and caveats in BENCHMARKS.md.
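
If you want to summarize your own latency samples in the same p50 / p95 / max shape, a nearest-rank percentile is one simple way to do it. This sketch is an assumption about method, not a description of mneme's harness: BENCHMARKS.md is the authority on how the published numbers were computed, and the sample data below is made up.

```python
import math

def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def summarize(samples: list[float]) -> dict[str, float]:
    """Collapse raw latencies into the three figures used in the tables above."""
    return {
        "p50": percentile(samples, 50),
        "p95": percentile(samples, 95),
        "max": max(samples),
    }

# Made-up latencies in whole milliseconds: 97 runs that round to 0, plus three slower ones.
latencies_ms = [0] * 97 + [1, 1, 2]
summary = summarize(latencies_ms)
print(summary)  # {'p50': 0, 'p95': 0, 'max': 2}
```

Note that a p95 of 0 ms only says the samples were recorded at millisecond resolution and most finished in under one; it does not mean the work took no time.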

Works with everything

One mneme install configures every AI tool it detects.

Claude Code · Codex · Cursor · Windsurf · Zed · Continue · OpenCode · Google Antigravity · Gemini CLI · Aider · GitHub Copilot CLI / VS Code · Factory Droid · Trae / Trae-CN · Kiro · Qoder · OpenClaw · Hermes · Qwen Code

20 expert skills + 4 workflow codewords

Mneme ships a full skill arsenal. Claude auto-loads the right expert for the task. Four single-word verbs switch how Claude engages.

Codewords - one-word session control

Word	What it does
`coldstart`	Pause. Observe only. Read context, draft a plan, do not touch code.
`hotstart`	Resume with discipline. Numbered roadmap. `step_verify` after each step.
`firestart`	Maximum loadout. Load all fireworks skills + prime mneme graph + hotstart.
`CHS`	"Check my screenshot" - read the latest file in your Screenshots folder.

Fireworks skills - auto-dispatched by keyword

Each skill is a full package: SKILL.md with triggers + protocol, plus a references/ folder of deep how-to docs. Skills sleep until a matching keyword appears in your message - a Rust task never fires the React skill.
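
Keyword dispatch of this kind can be sketched in a few lines of Python. The trigger table below is hypothetical (real triggers live in each skill's SKILL.md), and `match_skills` is a name invented for the example.

```python
import re

# Hypothetical trigger table; real triggers live in each skill's SKILL.md.
SKILL_TRIGGERS = {
    "rust": {"rust", "cargo", "borrow"},
    "react": {"react", "jsx", "hook"},
    "security": {"security", "xss", "csrf"},
}

def match_skills(message: str) -> list[str]:
    """Return the skills whose trigger words appear as whole words in the message."""
    words = set(re.findall(r"[a-z0-9]+", message.lower()))
    return sorted(name for name, triggers in SKILL_TRIGGERS.items() if words & triggers)

print(match_skills("Fix the borrow checker error in my cargo workspace"))  # ['rust']
print(match_skills("Rename this variable"))  # []
```

Matching on whole words rather than substrings is what keeps unrelated skills asleep: a message with no trigger word loads nothing.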

architect · charts · config · debug · design · devops · estimation · flutter · go · patterns · performance · python · react · refactor · rust · research · review · security · taskmaster · test · vscode · workflow · mneme-codewords

Type firestart on your next complex task. Mneme loads the full arsenal, primes your code graph, builds a numbered ledger, and works through it with verification gates - one step at a time.

Architecture — multi-process, fault-tolerant, 100% local

Marketplace plugin (global / user / project scope)
└─ SUPERVISOR (Rust, Windows service / launchd / systemd)
   ├─ STORE           22-layer SQLite sharded per project + WAL
   ├─ MCP server      Bun TS, 47 tools, JSON-RPC stdio
   ├─ PARSERS         Tree-sitter, 27 languages, num_cpus×4
   ├─ SCANNERS        Theme / security / a11y / perf / drift
   ├─ MD-INGEST       Ingests every .md file, such as CLAUDE.md
   ├─ BRAIN           Pure-Rust embeddings + Leiden clustering
   ├─ MULTIMODAL      Python sidecar — PDF / Whisper / OCR
   ├─ LIVE BUS        SSE/WebSocket push channel
   ├─ VISION          14-view WebGL desktop+web app
   └─ HEALTH          60 s self-test, SLA dashboard
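
For a feel of what "JSON-RPC stdio" means on the wire, this Python sketch builds one MCP `tools/call` request as a single newline-terminated JSON line, the framing the MCP stdio transport uses. The tool name `recall` and its `query` argument are hypothetical, chosen to mirror the CLI demo above; mneme's actual tool names and schemas are not listed on this page.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids; any unique value per request works

def tool_call(tool: str, arguments: dict) -> str:
    """Encode one MCP tools/call request as a single newline-terminated JSON line."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(request, separators=(",", ":")) + "\n"

# "recall" and "query" are hypothetical names used only for illustration.
line = tool_call("recall", {"query": "auth flow"})
print(line, end="")
```

A client writes lines like this to the server's stdin and reads responses from its stdout, so no socket or port is involved.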