Tools
Cross-Agent Memory Is Here: Run ai-memory for Persistent Context Between Claude Code, Codex, and Cursor
ai-memory gives AI coding agents a shared, persistent wiki — quit Claude Code mid-task, start Codex hours later, and continue without re-explaining. Here is how to set it up on your own server.

Cross-Agent Memory Is Here: Run ai-memory for Persistent Context Between Claude Code, Codex, and Cursor
One of the biggest frustrations with AI coding agents is their **complete amnesia between sessions**. Quit Claude Code at 4 PM with a half-finished refactor, open OpenAI Codex the next morning, and you are back to square one — explaining the architecture, the failed approaches, the open questions, and your naming conventions from scratch.
**ai-memory** aims to solve this. It is an open-source (MIT), Rust-based server that gives AI coding agents a shared, persistent wiki. Every prompt, tool call, and decision is captured automatically; when a session ends, the relevant pages get rewritten as a coherent narrative; when the next agent starts — whether it is Claude Code, Codex, OpenCode, Cursor, Gemini CLI, or any other MCP-compatible agent — it sees a "where you left off" handoff prepended before its first prompt.
The project has been trending on GitHub this week with over 400 stars, and it is easy to see why: it solves a problem every coding agent user hits within the first few hours.
How ai-memory works
The architecture is deliberately simple:
- A **single Rust server** runs on your machine (or homelab) and exposes an MCP endpoint and an HTTP API.
- **Lifecycle hooks** on your coding agent fire-and-forget every prompt and tool call to the server during the session.
- At session end (or at configurable intervals), the server **compiles observations** into markdown wiki pages.
- On next session start, a **handoff page** is prepended to the agent's context with where you left off, open questions, and next steps.
The wiki itself is **plain markdown in a git repo** — grep-able, openable in Obsidian, backed up with rsync. No vector database to babysit, no `write_note` ceremony, no manual context-loading.
Key features
- **Zero-friction capture.** Lifecycle hooks fire-and-forget every prompt, tool call, and session boundary. You never need to remember to save context.
- **Cross-agent handoffs.** Quit Claude Code mid-task, start Codex in the same directory hours later — the next agent sees a handoff before its first prompt.
- **Per-project isolation.** Each project gets its own wiki namespace keyed by directory, so work on Project A never leaks into Project B.
- **LLM-optional.** Zero-LLM mode still gives you FTS5 search and rule-based summarisation. Add a provider (Anthropic, OpenAI, Ollama, Gemini, Copilot) only when you want consolidated wiki pages.
- **Built-in web UI.** A read-only browser view of the wiki with full-text search, project tree, and markdown rendering.
- **Multi-machine ready.** Run the server on a homelab box and connect multiple laptops to it.
How to set it up
Docker (quickest start)
The recommended path is Docker, which takes about two minutes:
```bash
1. Install the CLI wrapper
mkdir -p ~/.local/bin
curl -fsSL https://raw.githubusercontent.com/akitaonrails/ai-memory/main/bin/ai-memory -o ~/.local/bin/ai-memory
chmod +x ~/.local/bin/ai-memory
2. Start the server
docker run -d --name ai-memory --restart unless-stopped -p 127.0.0.1:49374:49374 -v ai-memory-data:/data -e AI_MEMORY_LLM_PROVIDER=openai -e OPENAI_API_KEY=sk-... akitaonrails/ai-memory:latest
3. Wire Claude Code
ai-memory install-mcp --client claude-code --apply
ai-memory install-hooks --agent claude-code --apply
4. Wire Codex (repeat for each agent)
ai-memory install-mcp --client codex --apply
ai-memory install-hooks --agent codex --apply
```
That is it. Start a Claude Code or Codex session as usual — every prompt and tool call now lands in ai-memory, and the next session in this project will include a handoff with where you left off.
Native Linux install (via AUR)
Arch Linux users can install directly:
```bash
yay -S ai-memory-bin
systemctl --user enable --now ai-memory.service
```
Configuring a local LLM provider
If you want to keep everything local, use Ollama as the LLM provider for ai-memory's wiki compilation:
```bash
docker run -d --name ai-memory --restart unless-stopped -p 127.0.0.1:49374:49374 -v ai-memory-data:/data -e AI_MEMORY_LLM_PROVIDER=openai-compat -e AI_MEMORY_LLM_MODEL=llama3.1 -e OPENAI_BASE_URL=http://host.docker.internal:11434/v1 -e OPENAI_API_KEY=ollama akitaonrails/ai-memory:latest
```
The wiki compilation task uses small, fast models (Haiku/Mini-class), not frontier reasoning models, so even a modest local setup can handle it comfortably.
Why this matters for the self-hosted AI community
The coding agent space is fragmenting fast. Claude Code, Codex, Cursor, Kimi Code, OpenCode, Gemini CLI, and others are all competing for attention. But every one of them has **zero built-in memory between sessions** — and none of them share context with each other.
ai-memory addresses this at the infrastructure layer rather than the agent layer. It treats the concept of "memory" as a cross-cutting service that any MCP-compatible agent can use, rather than a proprietary feature locked into one vendor.
This aligns with the broader philosophy of self-hosted AI: **own your data, own your context, own your workflows**. Rather than being locked into one agent's ecosystem, you can switch between agents freely, knowing they all share the same persistent project memory.
For the infrastructure side of running a memory server alongside your model runtime, see Docker Setup for Local AI Tools.
Practical use cases
- **Multi-agent workflows.** Use Claude Code for architecture design, Codex for implementation, and Kimi Code for review — all sharing the same project context through ai-memory. See Build a Two-Model Workflow with a Fast Model and a Reasoning Model for a related pattern applied to models rather than agents.
- **Intermittent development.** Work on a project in the evening with Claude Code, pick it up the next morning with Codex on a different machine. The handoff page tells the new agent exactly where you stopped.
- **Team knowledge sharing.** Run ai-memory on a homelab server and connect every developer's agent to it. The wiki becomes a shared, persistent project knowledge base that survives individual session history.
- **Audit trail.** Because the wiki is git-versioned markdown, you can browse the full evolution of decisions, experiments, and approaches with `git log`. This is valuable for debugging why a particular architecture choice was made.
What to watch out for
ai-memory is at **v0.2** — functional and well-documented, but still in active development. The core handoff and wiki compilation work reliably, but you should expect API changes and configuration churn as the project matures.
The server exposes an HTTP endpoint by default. For single-user laptop setups, binding to loopback (127.0.0.1) is sufficient. For homelab deployments, add bearer-token authentication as described in the project's security documentation.
Conclusion
Persistent cross-agent memory is one of those capabilities that seems obvious in retrospect. If you use AI coding agents seriously, the friction of repeating context between sessions adds up fast. ai-memory solves this cleanly with a well-designed Rust server, zero-friction hooks, and a plain-markdown wiki that you own completely. It is a worthwhile addition to any self-hosted AI coding setup.
FAQ
Does ai-memory store my code or prompts?
It stores prompts, tool calls, and session activity in a local wiki. Your source code is not captured unless it appears in prompts or tool outputs.
Can I run it without an LLM provider?
Yes. Zero-LLM mode still captures sessions, provides FTS5 search, and generates rule-based summaries. An LLM is only needed for richer wiki consolidation.
Which agents are supported?
Claude Code, Codex, OpenCode, Cursor, Gemini CLI, Antigravity CLI, OpenClaw, Oh My Pi (OMP), and Claude Desktop (MCP-only).
Do I need a vector database?
No. The wiki uses git-versioned markdown with FTS5 search. No vector database, no embedding pipeline, no infrastructure babysitting.
