The Problem
Claude Code sessions are ephemeral. Every time you start a new conversation, context from previous sessions is lost. File-based memory (markdown files in ~/.claude/) helps, but it's limited — flat text with no semantic search, no cross-machine access, and no way to query by relevance. When you're working across multiple machines (a development server and a local PC), the gap becomes even wider.
We wanted three things:
- Semantic memory — store facts, decisions, and project context with vector embeddings so they can be retrieved by meaning, not just keywords
- Persistence — memories survive across sessions, reboots, and machine changes
- Remote access — the same memory system accessible from any machine running Claude Code
Architecture Overview
Claude Code (any machine)
↓ MCP (stdio local / Streamable HTTP remote)
claude-memory server (Node.js + TypeScript)
↓ SQL + vector search
Supabase (PostgreSQL + pgvector)
↓ embeddings
Voyage AI (voyage-3 model)
claude-memory is an MCP (Model Context Protocol) server that exposes 8 tools to Claude Code:
- store_memory: save a memory with type, importance, tags, and auto-generated embeddings
- search_memory: semantic vector search with optional filters (type, tags, importance, recency bias)
- list_memories: browse by recency, importance, or access frequency
- forget_memory: soft-delete outdated memories
- summarize_session: persist a session summary
- save_session_link: link a session UUID to a label for later retrieval
- find_session: look up a past session by label
- memory_stats: dashboard of memory system health
The Database Layer
We used the existing Supabase instance (already running for AgentCRM) and added a claude_memory table with pgvector support:
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS claude_memory (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    memory_type TEXT NOT NULL,
    content TEXT NOT NULL,
    title TEXT,
    tags TEXT[] DEFAULT '{}',
    source TEXT,
    session_id TEXT,
    project TEXT,
    importance INTEGER DEFAULT 5,
    embedding vector(1024),
    -- ... timestamps, access tracking, expiry
);
The embedding column stores 1024-dimensional vectors generated by Voyage AI's voyage-3 model. When a memory is stored, the content is sent to Voyage AI to generate an embedding. When a search is performed, the query is embedded the same way and compared using cosine similarity against all stored memories.
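To make the comparison step concrete, here is a minimal TypeScript sketch of cosine similarity ranking done in memory. It is illustrative only: in the setup described here the comparison presumably runs inside PostgreSQL through pgvector (whose `<=>` operator computes cosine distance, that is, 1 minus the similarity). The `StoredMemory` shape, the `rankBySimilarity` name, and the toy 3-dimensional vectors are assumptions made for the example, not the project's code.

```typescript
// Hypothetical shape; the real table has many more columns.
interface StoredMemory {
  title: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Ranges from -1 to 1.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector has no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every memory against the query embedding, highest similarity first.
function rankBySimilarity(query: number[], memories: StoredMemory[]) {
  return memories
    .map((m) => ({ title: m.title, score: cosineSimilarity(query, m.embedding) }))
    .sort((x, y) => y.score - x.score);
}

// Toy 3-dimensional vectors stand in for real 1024-dimensional embeddings.
const ranked = rankBySimilarity(
  [1, 0, 0],
  [
    { title: "unrelated", embedding: [0, 1, 0] },
    { title: "close match", embedding: [0.9, 0.1, 0] },
  ],
);
console.log(ranked[0].title); // prints "close match"
```

Scanning every row like this is fine for a personal store with a few thousand memories; larger tables would normally lean on a pgvector index instead.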
Memory types include: fact, decision, code_pattern, project_context, user_preference, session_summary, debug_insight, architecture, and tool_usage. Each memory has an importance score (1–10) that can be used for filtering and ranking.
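The type list and importance range lend themselves to validation before anything is written to the database. The sketch below shows one way that could look in plain TypeScript; the `NewMemory` shape and the `parseNewMemory` helper are hypothetical names for illustration, and the default of 5 mirrors the column default in the table definition.

```typescript
// The nine memory types listed in the text, as a readonly tuple.
const MEMORY_TYPES = [
  "fact", "decision", "code_pattern", "project_context", "user_preference",
  "session_summary", "debug_insight", "architecture", "tool_usage",
] as const;

type MemoryType = (typeof MEMORY_TYPES)[number];

// Hypothetical validated shape for a store_memory call.
interface NewMemory {
  memoryType: MemoryType;
  content: string;
  importance: number; // integer from 1 to 10
  tags: string[];
}

function isMemoryType(value: string): value is MemoryType {
  return (MEMORY_TYPES as readonly string[]).includes(value);
}

// Validate raw input and apply the defaults from the table definition.
function parseNewMemory(input: {
  memoryType: string;
  content: string;
  importance?: number;
  tags?: string[];
}): NewMemory {
  const memoryType = input.memoryType;
  if (!isMemoryType(memoryType)) {
    throw new Error(`Unknown memory type: ${memoryType}`);
  }
  const importance = input.importance ?? 5;
  if (!Number.isInteger(importance) || importance < 1 || importance > 10) {
    throw new Error(`Importance must be an integer from 1 to 10, got ${importance}`);
  }
  if (input.content.trim() === "") {
    throw new Error("Content must not be empty");
  }
  return { memoryType, content: input.content, importance, tags: input.tags ?? [] };
}

const parsed = parseNewMemory({ memoryType: "fact", content: "SSH key works with gitDiot-Org" });
console.log(parsed.importance); // prints 5
```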
The MCP Server
The server (~/claude-memory/src/) is built with:
- @modelcontextprotocol/sdk — MCP server framework
- @supabase/supabase-js — database client
- Express — HTTP server for remote access
- Zod — config validation
- Voyage AI — embedding generation
Dual Transport
The key design decision was supporting two transports:
Stdio — for local Claude Code on the same machine. The server runs as a child process, communicating over stdin/stdout. Zero network overhead, no auth needed.
{
  "claude-memory": {
    "command": "node",
    "args": ["/home/rdpuser/claude-memory/dist/index.js", "--stdio"]
  }
}
Streamable HTTP — for remote Claude Code instances. The server listens on port 3002 with a single /mcp endpoint that handles all MCP traffic. Protected by bearer token authentication.
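// Single endpoint handles POST (new/existing sessions), GET (SSE notifications), DELETE (cleanup)
app.all('/mcp', bearerAuth, async (req, res) => {
  const sessionId = req.headers['mcp-session-id'] as string | undefined;
  const existing = sessionId ? sessions.get(sessionId) : undefined;

  if (existing) {
    // Existing session: POST, GET, and DELETE all go to the session's transport
    await existing.transport.handleRequest(req, res);
    if (req.method === 'DELETE' && sessionId) sessions.delete(sessionId);
  } else if (req.method === 'POST') {
    // New session: connect a fresh server, then register it once the transport has an ID
    const transport = new StreamableHTTPServerTransport({
      sessionIdGenerator: () => crypto.randomUUID(),
    });
    const server = createServer();
    await server.connect(transport);
    await transport.handleRequest(req, res);
    if (transport.sessionId) sessions.set(transport.sessionId, { server, transport });
  } else {
    // GET or DELETE without a known session: reply instead of leaving the request hanging
    res.status(400).json({ error: 'Unknown or missing MCP session ID' });
  }
});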
We initially used SSE transport (separate /sse and /messages endpoints), but Cloudflare's proxy killed the long-lived SSE connection before POST messages arrived, causing 400 errors. Streamable HTTP solved this by using standard request/response cycles on a single endpoint.
The transport is selected by a command-line flag: --stdio for local, omit for HTTP.
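The flag check itself can be very small. The following is a sketch of how the selection might be written; `selectTransport` and `TransportMode` are illustrative names, not taken from the project's `index.ts`.

```typescript
type TransportMode = "stdio" | "http";

// Hypothetical helper: picks the transport from command-line arguments.
// `--stdio` selects stdio; anything else falls through to HTTP.
function selectTransport(args: string[]): TransportMode {
  return args.includes("--stdio") ? "stdio" : "http";
}

// In a real entry point the argument list would be process.argv.slice(2).
console.log(selectTransport(["--stdio"])); // prints "stdio"
console.log(selectTransport([]));          // prints "http"
```

Defaulting to HTTP matches the systemd unit, which starts the server with no flags, while the local client config passes `--stdio` explicitly.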
Exposing to the Internet
The Challenge
The server's public IP is actually a hosting provider proxy that only forwards ports 80 and 443. The real machine IP isn't directly reachable. Opening port 3002 in ufw was necessary but insufficient — the provider's proxy doesn't forward non-standard ports.
This also meant Let's Encrypt ACME challenges failed — both TLS-ALPN-01 and HTTP-01 challenges were intercepted by the proxy, returning errors instead of reaching Caddy.
The Solution
We used the same pattern already working for AgentCRM:
1. Registered gitdiot.com and added it to Cloudflare
2. Created an A record for fragrag.gitdiot.com pointing to the server IP, with Cloudflare proxy (orange cloud) enabled
3. Configured Caddy with tls internal: Cloudflare handles public TLS, Caddy uses a self-signed cert for the Cloudflare-to-origin connection
fragrag.gitdiot.com {
    tls internal
    reverse_proxy localhost:3002
    log {
        output file /var/log/caddy/fragrag.log
    }
}
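4. Set Cloudflare SSL mode to "Full", which encrypts the Cloudflare-to-origin connection without validating the origin certificate, so Caddy's self-signed cert is accepted ("Full (strict)" would reject it)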
The traffic flow: Client → Cloudflare (TLS) → Caddy:443 (internal TLS) → Node:3002
Persistence with systemd
To survive reboots, we created a systemd service:
[Unit]
Description=Claude Memory RAG MCP Server
After=network.target
[Service]
Type=simple
User=rdpuser
WorkingDirectory=/home/rdpuser/claude-memory
ExecStart=/home/rdpuser/.nvm/versions/node/v22.22.1/bin/node dist/index.js
Restart=on-failure
RestartSec=5
EnvironmentFile=/home/rdpuser/claude-memory/.env
[Install]
WantedBy=multi-user.target
The EnvironmentFile directive loads Supabase credentials, the Voyage AI key, and the bearer token from .env without hardcoding them in the service file.
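Because every secret arrives through the environment, it helps to validate the whole set once at startup. The project does this with a Zod schema in `config.ts`; the sketch below uses plain TypeScript as a stand-in so it runs without dependencies. The variable names (`SUPABASE_URL`, `SUPABASE_SERVICE_KEY`, `VOYAGE_API_KEY`, `MCP_BEARER_TOKEN`, `PORT`) are assumptions, since the real `.env` keys are not shown.

```typescript
// Hypothetical variable names; the real .env keys are not shown in this post.
interface Config {
  supabaseUrl: string;
  supabaseKey: string;
  voyageApiKey: string;
  bearerToken: string;
  port: number;
}

type Env = Record<string, string | undefined>;

function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Fail fast at startup instead of at the first database or embedding call.
function loadConfig(env: Env): Config {
  const port = Number(env["PORT"] ?? "3002"); // 3002 is the port used in this post
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env["PORT"]}`);
  }
  return {
    supabaseUrl: requireVar(env, "SUPABASE_URL"),
    supabaseKey: requireVar(env, "SUPABASE_SERVICE_KEY"),
    voyageApiKey: requireVar(env, "VOYAGE_API_KEY"),
    bearerToken: requireVar(env, "MCP_BEARER_TOKEN"),
    port,
  };
}
```

In the server this would be called as `loadConfig(process.env)`. With `Restart=on-failure`, a missing variable then shows up as a clear error in `journalctl` rather than as a failed request later.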
Remote Client Configuration
MCP servers in Claude Code are registered via the CLI, not through settings.json. We learned this the hard way — adding mcpServers to the settings file had no effect. The correct approach:
claude mcp add claude-memory https://fragrag.gitdiot.com/mcp -t http -s user \
-H "Authorization:Bearer cm-..."
Key flags:
- -t http: Streamable HTTP transport (not sse)
- -s user: stores in ~/.claude.json, persists across projects
- -H: bearer token for auth
One gotcha: the CLI sometimes inserts a newline in the Authorization header value. If claude mcp list shows "Failed to connect", check ~/.claude.json and ensure the header is a single line.
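Checking for that stray newline by eye is easy to get wrong, so a small script can do it. This is a sketch that assumes the file keeps servers under a top-level `mcpServers` object with a `headers` map per server; that layout is an assumption and should be verified against your own `~/.claude.json`. The `findBrokenHeaders` name is illustrative.

```typescript
// Assumed shape of the relevant part of ~/.claude.json; verify against your own file.
interface McpServerEntry {
  headers?: Record<string, string>;
}
interface ClaudeConfig {
  mcpServers?: Record<string, McpServerEntry>;
}

// Returns "server: Header" labels for every header value containing a line break.
function findBrokenHeaders(config: ClaudeConfig): string[] {
  const broken: string[] = [];
  for (const [server, entry] of Object.entries(config.mcpServers ?? {})) {
    for (const [name, value] of Object.entries(entry.headers ?? {})) {
      if (/[\r\n]/.test(value)) broken.push(`${server}: ${name}`);
    }
  }
  return broken;
}

// Example input with the newline problem described in the text.
const report = findBrokenHeaders({
  mcpServers: {
    "claude-memory": { headers: { Authorization: "Bearer\ncm-..." } },
  },
});
console.log(report); // prints [ 'claude-memory: Authorization' ]
```

To run it against the real file, parse it first with `JSON.parse` on the file contents and pass the result in.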
The portable config repo (gitDiot-Org/claude-cli-config) contains settings, skills, and memories deployed via git pull && ./install.sh.
Project Structure
claude-memory/
├─ .env                       # secrets (gitignored)
├─ .gitignore
├─ package.json
├─ tsconfig.json
├─ sql/
│  └─ 001_claude_memory.sql   # Supabase migration
├─ scripts/
│  └─ migrate.ts              # Migration runner
└─ src/
   ├─ index.ts                # Dual-transport MCP server
   ├─ config.ts               # Zod-validated env config
   ├─ auth.ts                 # Bearer token middleware
   ├─ embeddings.ts           # Voyage AI integration
   └─ tools/
      ├─ memory.ts            # store, search, list, forget, stats
      └─ sessions.ts          # summarize, save_link, find
How It Works in Practice
At the start of a session, Claude Code can call search_memory with the current task context to load relevant prior knowledge:
search_memory("CRM bot architecture")
→ AgentCRM Architecture (score: 0.710) — full architecture details
During work, important discoveries get stored:
store_memory("SSH key works with gitDiot-Org", type="fact", importance=6)
The vector search returns results ranked by semantic similarity, not keyword matching. Searching for "how to authenticate with GitHub" returns the SSH key fact (score: 0.598) even though the memory never mentions "authenticate."
Stale memories can be soft-deleted and replaced with updated versions, keeping the knowledge base current.
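The soft-delete-and-replace cycle can be sketched with an in-memory list. The field name `deletedAt` is a stand-in for whatever column the real table uses to mark a forgotten memory, and `forget`, `activeOnly`, and `replaceMemory` are hypothetical helpers rather than the server's tool implementations.

```typescript
// Hypothetical record shape; `deletedAt` stands in for the real soft-delete marker.
interface MemoryRecord {
  id: string;
  content: string;
  deletedAt: Date | null;
}

// Soft delete: mark the record instead of removing it, so history stays auditable.
function forget(records: MemoryRecord[], id: string, now: Date = new Date()): MemoryRecord[] {
  return records.map((r) =>
    r.id === id && r.deletedAt === null ? { ...r, deletedAt: now } : r,
  );
}

// Searches and listings only see records that are still active.
function activeOnly(records: MemoryRecord[]): MemoryRecord[] {
  return records.filter((r) => r.deletedAt === null);
}

// Replace a stale memory: forget the old one, then append the corrected version.
function replaceMemory(
  records: MemoryRecord[],
  staleId: string,
  fresh: MemoryRecord,
  now: Date = new Date(),
): MemoryRecord[] {
  return [...forget(records, staleId, now), fresh];
}

const before: MemoryRecord[] = [{ id: "a", content: "server listens on 3000", deletedAt: null }];
const after = replaceMemory(before, "a", { id: "b", content: "server listens on 3002", deletedAt: null });
console.log(activeOnly(after).map((r) => r.content)); // prints [ 'server listens on 3002' ]
```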
Security Considerations
- Bearer token auth on the /mcp endpoint
- Health check is the only unauthenticated endpoint
- Secrets (Supabase keys, Voyage AI key, bearer token) stored in .env, excluded from git
- The bearer token is stored in the portable config repo, which is private; an acceptable tradeoff for a personal tool
- Cloudflare proxy hides the origin server IP and provides DDoS protection
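- TLS on both hops: Cloudflare's public certificate between client and Cloudflare, Caddy's internal cert between Cloudflare and origin (Cloudflare terminates and re-encrypts, so this is not end-to-end in the strict sense)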
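For reference, the core of a bearer token check fits in a few lines. The sketch below is not the project's `auth.ts`, which is not shown here; `isAuthorized` and `constantTimeEqual` are illustrative names. The hand-rolled comparison keeps the example dependency-free, whereas in Node the usual choice would be `crypto.timingSafeEqual`.

```typescript
// Compare two strings without returning early on the first mismatch.
function constantTimeEqual(a: string, b: string): boolean {
  const len = Math.max(a.length, b.length);
  let diff = a.length ^ b.length;
  for (let i = 0; i < len; i++) {
    // charCodeAt returns NaN past the end of a string; treat that as 0.
    diff |= (a.charCodeAt(i) || 0) ^ (b.charCodeAt(i) || 0);
  }
  return diff === 0;
}

// Returns true only for a header of the form "Bearer <expected token>".
function isAuthorized(authorizationHeader: string | undefined, expectedToken: string): boolean {
  if (!authorizationHeader) return false;
  const match = /^Bearer +(\S+)$/i.exec(authorizationHeader.trim());
  if (!match) return false;
  return constantTimeEqual(match[1], expectedToken);
}

console.log(isAuthorized("Bearer s3cret", "s3cret")); // prints true
console.log(isAuthorized("Bearer wrong", "s3cret"));  // prints false
```

An Express middleware would call this with `req.headers.authorization` and respond with 401 when it returns false.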
Bonus: Two Claudes, One Git Repo
The most unexpected outcome was using the git repos as a communication channel between two Claude Code instances — one on the remote server, one on the local PC.
Both instances poll the repos every 60 seconds using cron jobs. When one pushes a commit, the other pulls it, reads the changes, and acts on them. This created a feedback loop:
- Desktop Claude diagnosed the SSE transport failure and identified that express.json() middleware was breaking StreamableHTTPServerTransport
- Desktop Claude pushed the fix to gitDiot-Org/fragRag
- Server Claude's polling loop detected the new commit within a minute
- Server Claude pulled, rebuilt (npx tsc), restarted the systemd service, verified the health check, and pushed a status commit back
- Desktop Claude saw the confirmation and tested the connection
Two AI agents collaborating asynchronously through version control, each with access to different parts of the infrastructure. The server Claude could restart services and check logs; the desktop Claude could test the client connection and iterate on fixes. Git provided the audit trail.
Desktop Claude                      Server Claude
│                                   │
├── push fix ──────────────────────>│
│                                   ├── pull
│                                   ├── rebuild
│                                   ├── restart service
│                                   ├── verify health
│<─────────────────── push status ──┤
├── test connection                 │
├── confirm working                 │
│                                   │
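The server side of that loop amounts to "detect a new commit, then run a fixed sequence and stop at the first failure". Here is a sketch of that control flow with the shell commands stubbed out; `hasNewCommit`, `runPipeline`, and the step names are hypothetical, and real steps would shell out to git, `npx tsc`, and `systemctl`.

```typescript
// A deploy step returns true on success. Real steps would run shell commands.
interface Step {
  name: string;
  run: () => boolean;
}

interface PipelineResult {
  completed: string[];
  failedAt: string | null;
}

// Only act when the remote branch has moved past the last commit already handled.
function hasNewCommit(lastSeen: string | null, remoteHead: string): boolean {
  return lastSeen !== remoteHead;
}

// Run steps in order and stop at the first failure, so a broken build never reaches a restart.
function runPipeline(steps: Step[]): PipelineResult {
  const completed: string[] = [];
  for (const step of steps) {
    let ok = false;
    try {
      ok = step.run();
    } catch {
      ok = false; // a thrown error counts as a failed step
    }
    if (!ok) return { completed, failedAt: step.name };
    completed.push(step.name);
  }
  return { completed, failedAt: null };
}

// Stubbed example: the rebuild fails, so restart and verify never run.
const result = runPipeline([
  { name: "pull", run: () => true },
  { name: "rebuild", run: () => false },
  { name: "restart service", run: () => true },
  { name: "verify health", run: () => true },
]);
console.log(result.failedAt); // prints "rebuild"
```

The `PipelineResult` is also a natural payload for the status commit, since it records exactly how far the deploy got.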
What We Built
A personal knowledge base for an AI coding assistant. Memories persist across sessions and machines. Semantic search means you don't need to remember exact keywords: describe what you're looking for and vector similarity finds it. The dual-transport architecture means no network overhead when working locally, with full remote access when needed.
Total infrastructure: one Supabase table, one Node.js process, one Caddy route, one Cloudflare DNS record, one systemd service, one cron poll. No Kubernetes, no Lambda, no orchestration layer. Simple enough to debug with curl and journalctl — and apparently, simple enough for two AI agents to debug collaboratively through git commits.
Related: fragRag: Building Persistent Memory for Claude Code Across Machines — the journey narrative of what failed and what we learned. And Debugging Across Machines — how two Claude instances collaborated through Git to fix this server.