Show HN: Lore – LLM proxy for coding agent context and memory management

loading… Lore. The memory that compounds.

Stop re-explaining
your project to
your AI.

Your team's memory, in every session. Lore gives AI agents persistent shared context — capturing decisions, file paths, and patterns across sessions lasting days and hundreds of turns. No context files to maintain. No workflow changes.

$ curl -fsSL https://withlore.ai/install | bash Copied!

or npx @loreai/gateway

Gradient Context

Lore Distillation

Any Provider*

Recall Tool

.lore.md Sync

On-Device Vector Search

Import History

Cost-Aware Caching

Sessions Lasting Days

+67%

vs Compaction at 2.3M Tokens

2.6x

Total Recall vs Compaction

2.3M+

Token Sessions Tested

2.3M tokens, 5 days, 2.6x total recall ◆ Compaction: 2.4/5. Lore: 4.0/5. 68 min/day re-explaining ◆ Lore remembers for you Your tools change. Your memory doesn't. ◆ Lore is your constant Total amnesia on new sessions ◆ Lore persists across sessions 49 manual learnings ◆ Lore curates automatically 5 feedback loops ◆ Your agent improves every session 2.3M tokens, 5 days, 2.6x total recall ◆ Compaction: 2.4/5. Lore: 4.0/5. 68 min/day re-explaining ◆ Lore remembers for you Your tools change. Your memory doesn't. ◆ Lore is your constant Total amnesia on new sessions ◆ Lore persists across sessions 49 manual learnings ◆ Lore curates automatically 5 feedback loops ◆ Your agent improves every session

The Problem

Context loss is invisible.

There's no error message when your AI forgets. Just worse answers, undone decisions, and hours spent re-explaining.

Compaction destroys details

When the context window fills up, your AI tool compacts the conversation. In a real 5-day coding session, compaction reduces 2.3 million tokens to an 11K summary — a 200x compression that loses which issues were picked, what alternatives were rejected, and why. It scores 2.4/5 on recall. Lore scores 4.0/5.

Starting fresh is starting from zero

Most developers see "Compacting conversation" and start a new session. That trades compaction for total amnesia. The new session produces output that looks fine — but it's working from incomplete information, and you can't tell.

Manual context files don't scale

The alternative is maintaining context files, key technical learnings, and decision rationales — by hand. It works, but it's a second full-time job. One team tracked 49 technical learnings manually. Every decision needs the "why" or the AI will refactor it away.

The Solution

How Lore replaces all of that

Intercept

Lore sits between your AI client and the upstream API. It captures every message — no client changes needed, just change the base URL. Works with Claude Code, OpenCode, Pi, Codex, and any Anthropic/OpenAI-compatible tool.

Distill

Lore replaces compaction entirely. Instead of lossy summaries that forget your file paths and decisions, it distills conversations into timestamped observation logs — the operational details your AI actually needs to keep working. Your manual "Key Technical Learnings"? Lore extracts and maintains them automatically.

Recall

Details from every session are searchable — even hundreds of turns later. When the distilled context isn't enough, your agent's recall tool retrieves the exact file path, error message, or decision rationale it needs. In our 2.3M-token benchmark: 2.6x total recall over compaction — 13 perfect scores vs 5.

Why not both?

Context management and memory are the same problem.

Other tools force you to solve them separately. Lore treats them as one continuous pipeline. See how Lore compares →

Memory alone isn't enough

Storing past conversations and searching them later is only half the problem. If your AI still gets compacted mid-session and loses track of what it's doing right now, a memory layer can't help — it doesn't know what's missing until you ask. Memory is only useful if it reaches the AI at the right time.

Context management alone doesn't learn

Compressing conversation history keeps the current session alive, but nothing is extracted from the compression. Start a new session and you're back to zero. Switch tools and the knowledge stays behind. Nothing transfers to other projects, team members, or even other models.

Lore connects them into one pipeline

In Lore, context compression is the memory pipeline. Distillation feeds the gradient context manager, which feeds the knowledge curator, which feeds .lore.md — and with Folk Lore, your team. Every conversation makes every future session smarter, across any provider, any tool, any team member. Every new session starts with the relevant facts and gets a fresh injection after the first turn. Read the docs →

Persistence

Decisions stick

Your AI won't refactor away deliberate decisions. Lore preserves the "why" behind every choice — the exact thing that prevents a future session from "helpfully" replacing your workaround with the broken approach it was working around.

Learning

Your AI learns from experience

Five feedback loops — behavioral pattern detection, semantic clustering, instruction capture, LLM-mediated curation, and adaptive calibration — compound across sessions. Patterns, gotchas, and decisions are automatically curated and exported to .lore.md, git-portable and model-agnostic. What researchers call "harness self-improvement" — Lore does it automatically.

Simplicity

Zero workflow changes

No context files to write. No instructions to maintain. No sessions to carefully manage. Install Lore, point your client at it, and keep working exactly as you do now. It extracts decisions, gotchas, and patterns — and preserves the "why" behind every choice — so you don't have to.

# Start Lore + your AI agent $ lore run

# That's it. Lore auto-detects your # agent and configures everything.

Cost

Sessions as long as you want

Work for days, hundreds of turns, millions of tokens — memory stays sharp. Tested on a real 5-day, 2.3M-token session. Background distillation and curation run at 50% off via batch APIs on cheaper models. Predictive cache warming avoids the expensive cache rebuilds that compaction triggers on every turn.

Free on-device vector search

Nomic Embed v1.5 runs locally — zero API cost, zero latency. Hybrid architecture fuses vector similarity with BM25 keyword search for best-of-both-worlds recall. LLM-powered query expansion finds what you mean, not just what you typed.

Knowledge

Project-specific and global memory

Lore maintains both project-level knowledge (architecture, gotchas, conventions) and global preferences that follow you across all projects. Confidence-ranked entries are injected into every session at the right priority — your AI always knows how you like to work.

Compatibility

Your constant

Claude Code, OpenCode, Pi, Codex — any tool that speaks Anthropic or OpenAI protocols. No new terminal, IDE, or cloud platform to adopt — just a one-line base-URL change. The engine runs locally and is fair source; your data stays on your machine. Switch providers, switch tools, switch machines — your memory travels with you. No vendor lock-in, no walled gardens. * Any provider accessible via an OpenAI or Anthropic-compatible API.

Migration

Import your history, start smart

Lore imports conversations from Claude Code, Codex, Aider, Cline, Continue, OpenCode, and Pi — extracting knowledge from your existing sessions so your AI starts with context from day one. No blank slate.

# Import existing conversations $ lore import

# Auto-detects Claude Code, Codex, # Aider, Cline, Continue & more

What the industry and research are discovering

The problem is real. The manual fix is exhausting.

O'Reilly Radar

"Your context files end up being the project documentation you should have been writing all along, except now there's something on the other end that will actually go wrong if you skip it."

Andrew Stellman — on why context management is the most important undiscussed skill in AI-driven development.

The Pragmatic Engineer

"I keep making this error when I use agentic mode... even after solving a problem with the tool, the tool 'forgets' all this detail. Unlike humans who learn from previous actions, this is not currently true of AI agents."

Gergely Orosz — on Microsoft's internal experience with AI coding tools and the gap between demos and real codebases.

The Research

"The performance of LLM systems depends not only on model weights, but also on their harness: the code that determines what information to store, retrieve, and present to the model."

Meta-Harness (Stanford, 2026) — on why optimizing what context the model sees matters as much as the model itself. Lore is this optimization layer for coding agents.

The future of AI agents

Stop managing context. Start building.

Lore is free and local today. Folk Lore brings your team's memory together — shared, searchable, always current. Early access is rolling out. Be part of the myth.

Models will commoditize. The context harness will differentiate. Lore is that harness — the context engine every agent needs to build the future.

No spam. Unsubscribe anytime.

✓ You're on the list. We'll be in touch.

Lore.AI — Shared Context for AI Agents

Stop re-explaining your project to your AI.