AI Engineer Melbourne
Knowledge Base
Keynote Insights · Intermediate · 11 min

Why Your Coding Agent Forgets Everything

Context windows are not memory. Building real persistence is an architecture problem.

Introduction

Every coding agent session feels like Groundhog Day: you re-explain the codebase conventions, re-link the relevant tickets, re-paste the schema. The agent isn't stupid; its memory is, by design, ephemeral. The fix isn't a bigger context window but an external memory architecture that survives across sessions, projects, and model swaps.

Why this matters

  • Coding agents that forget cost you the same setup tax on every interaction.
  • Bigger context windows make the problem cheaper but don't solve it; irrelevant context is worse than no context.
  • Real productivity gains come from agents that accumulate knowledge of your codebase, your team's conventions, and prior decisions.
  • Memory architecture is the difference between a clever autocomplete and a genuine collaborator.

Core concepts

1. Three memory tiers

Short-term (current task context), working (this session's scratchpad and tool outputs), long-term (cross-session knowledge: codebase facts, team conventions, prior decisions). Each tier has different storage, retrieval, and decay characteristics.
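The three tiers can be sketched as a single structure with different lifetimes per tier. This is a minimal illustration, not a library API; all names are invented:

```python
from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    # Short-term: context for the current task, cleared when the task ends.
    short_term: list[str] = field(default_factory=list)
    # Working: this session's scratchpad and tool outputs, cleared at session end.
    working: list[str] = field(default_factory=list)
    # Long-term: cross-session knowledge, keyed for retrieval. Must live in
    # external storage, not the context window, to survive a new session.
    long_term: dict[str, str] = field(default_factory=dict)

    def end_task(self) -> None:
        self.short_term.clear()  # fastest decay

    def end_session(self) -> None:
        self.short_term.clear()
        self.working.clear()     # dies with the session; long_term survives

mem = AgentMemory()
mem.short_term.append("fix flaky test in auth module")
mem.long_term["naming"] = "files are snake_case"
mem.end_session()
```

The design point is that each tier needs its own storage and decay policy; collapsing them into one bag of text is what makes agents forget.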

2. Episodic vs. semantic memory

Episodic = "what happened" (this PR, this incident, this conversation). Semantic = "what's true" (this is how we name files, this is our auth pattern). Promote episodic to semantic only when the lesson generalises.
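One way to make "promote only when the lesson generalises" concrete is a recurrence threshold: an episodic lesson becomes semantic only after it shows up in enough independent episodes. A toy sketch with an invented threshold:

```python
from collections import Counter

# Assumption for illustration: three independent recurrences count as "generalises".
PROMOTION_THRESHOLD = 3

def promote(episodes: list[str], semantic: set[str]) -> set[str]:
    """Promote any episodic lesson seen PROMOTION_THRESHOLD or more times."""
    for lesson, n in Counter(episodes).items():
        if n >= PROMOTION_THRESHOLD:
            semantic.add(lesson)
    return semantic

episodes = ["use snake_case filenames"] * 3 + ["one-off: hotfix skipped CI"]
semantic = promote(episodes, set())
# The recurring lesson is promoted; the one-off stays episodic.
```

In practice you would also deduplicate near-identical lessons before counting, but the threshold idea is the core of deliberate promotion.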

3. Retrieval over recall

Don't try to stuff everything into context. Build retrieval (vector + keyword + structural) so the agent fetches what it needs when it needs it.
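A hybrid retriever can be as simple as summing per-signal scores. The sketch below combines keyword overlap with a structural boost (memories attached to the directory being edited); vector similarity is stubbed out, since in a real system an embedding score would be the third term. Weights and names are illustrative:

```python
def score(query_words: set[str], doc: dict, current_dir: str) -> float:
    # Keyword signal: word overlap between query and memory text.
    keyword = len(query_words & set(doc["text"].lower().split()))
    # Structural signal: boost memories tied to the code area being edited.
    structural = 2.0 if doc["path"].startswith(current_dir) else 0.0
    # Vector signal would be added here (cosine similarity of embeddings).
    return keyword + structural

def retrieve(query: str, docs: list[dict], current_dir: str, k: int = 2) -> list[dict]:
    words = set(query.lower().split())
    return sorted(docs, key=lambda d: score(words, d, current_dir), reverse=True)[:k]

docs = [
    {"path": "src/auth/session.py", "text": "session tokens rotate hourly"},
    {"path": "src/billing/invoice.py", "text": "invoices are idempotent"},
    {"path": "docs/auth.md", "text": "auth uses short lived tokens"},
]
hits = retrieve("auth tokens", docs, current_dir="src/auth/")
```

Only the top-k hits enter the context window; everything else stays in storage until it is actually needed.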

4. Memory decay and contradiction

Old memories go stale. Newer signals should override older ones, contradictions should surface for human review, and confidence should drop with age.
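Both rules are easy to state in code: confidence decays with a half-life, and a contradicting newer memory wins but gets flagged rather than silently merged. The half-life and record shape are invented for illustration:

```python
HALF_LIFE_DAYS = 90  # assumption: confidence halves every 90 days

def confidence(base: float, age_days: float) -> float:
    """Exponential age decay: base * 0.5 ** (age / half_life)."""
    return base * 0.5 ** (age_days / HALF_LIFE_DAYS)

def resolve(old: dict, new: dict) -> dict:
    """Newer signal overrides, but a contradiction is surfaced for review."""
    if old["claim"] != new["claim"]:
        return {**new, "needs_review": True}
    return new

old = {"claim": "we deploy on Fridays", "age_days": 180, "base": 0.9}
new = {"claim": "we never deploy on Fridays", "age_days": 1, "base": 0.9}
winner = resolve(old, new)
```

The `needs_review` flag is the important part: an agent that silently rewrites its own beliefs is harder to trust than one that shows you the conflict.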

Practical patterns

AGENTS.md / CLAUDE.md

A versioned, human-edited file at the repo root capturing conventions and "do this, not that" rules: the agent's onboarding doc.
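A sketch of what such a file might contain. The rules below are invented examples, not conventions from any particular repo:

```markdown
# AGENTS.md

## Conventions
- Files are snake_case; test files mirror source paths under tests/.
- Use the repo logger, never print().

## Do this, not that
- Do: add a migration for any schema change. Not: edit models in place.
- Do: run the linter before proposing a diff. Not: rely on CI to catch it.
```

Because the file is versioned and human-edited, it doubles as provenance: the agent can cite the rule it followed, and humans can review rule changes in normal code review.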

Decision log retrieval

Capture ADRs (Architecture Decision Records) and retrieve relevant ones when the agent enters a related area of code.
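A lightweight way to do this is to tag each ADR with the code areas it governs and fetch matching records when the agent starts editing a path. The ADR entries below are hypothetical:

```python
# Illustrative ADR index: each record lists the code areas it governs.
ADRS = [
    {"id": "ADR-007", "title": "Use JWT for service auth", "areas": ["src/auth/"]},
    {"id": "ADR-012", "title": "Invoices are append-only", "areas": ["src/billing/"]},
]

def adrs_for(path: str) -> list[str]:
    """Return the IDs of ADRs governing the file being edited."""
    return [a["id"] for a in ADRS
            if any(path.startswith(area) for area in a["areas"])]

relevant = adrs_for("src/auth/token.py")
```

Path-prefix matching is crude; a richer version would also match on keywords in the ADR body, but even this prevents the agent from re-litigating settled decisions.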

Lesson capture

When you correct the agent, capture the correction as a structured lesson with confidence, scope, and expiry.
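The structured lesson record might look like the sketch below; field names and the TTL are illustrative. The expiry forces stale lessons to be re-validated rather than trusted forever:

```python
from datetime import date, timedelta

def capture_lesson(correction: str, scope: str,
                   confidence: float, ttl_days: int) -> dict:
    """Turn a human correction into a structured, expiring lesson."""
    return {
        "lesson": correction,
        "scope": scope,            # where the rule applies
        "confidence": confidence,  # how sure we are it generalises
        "expires": date.today() + timedelta(days=ttl_days),
    }

def is_active(lesson: dict, today: date) -> bool:
    return today < lesson["expires"]

lesson = capture_lesson(
    "prefer repository fixtures over mocking the DB",
    scope="tests/",
    confidence=0.7,
    ttl_days=180,
)
```

Scoping matters as much as expiry: a correction made in `tests/` should not silently change how the agent behaves in production code.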

Memory eval suite

Run a memory eval: same questions across sessions, with and without memory enabled, to measure whether memory is actually helping.
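A toy version of that eval: ask the same questions with memory enabled and disabled, and compare success rates. The "agent" here is a stub that can only answer from provided memory; questions and answers are invented:

```python
# (question, expected answer) pairs asked identically across sessions.
QUESTIONS = [
    ("how do we name files?", "snake_case"),
    ("what auth pattern do we use?", "jwt"),
]

MEMORY = {
    "how do we name files?": "snake_case",
    "what auth pattern do we use?": "jwt",
}

def agent_answer(question, memory):
    """Stub agent: answers only from memory; otherwise it doesn't know."""
    if memory and question in memory:
        return memory[question]
    return "unknown"

def success_rate(memory) -> float:
    hits = sum(agent_answer(q, memory) == expected for q, expected in QUESTIONS)
    return hits / len(QUESTIONS)

# Memory is helping iff the lift is positive.
lift = success_rate(MEMORY) - success_rate(None)
```

With a real agent the answers are fuzzier, so you would grade with an LLM judge or exact-match on extracted facts, but the with/without comparison is the core of the eval.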

Pitfalls to avoid

  • Treating memory as a vector-DB problem only; graph and relational structure matter too.
  • Letting memory grow unbounded; old, wrong context is worse than no context.
  • Mixing user-provided memory with model-generated memory without provenance.
  • Optimising for recall metrics that don't correlate with task success.

Key takeaways

  1. Memory is an architecture, not a feature flag.
  2. Separate episodic from semantic, and promote deliberately.
  3. Retrieve, don't recall; keep the context window clean.
  4. Measure whether memory is helping; if it isn't, remove it.
