
AI & Technology

Three Layers of External Memory for AI-First Development (What Actually Ships)

Chat context is not memory. A three-layer file system—session, operational, evergreen—plus hooks and git automation is how I keep production codebases coherent across hundreds of agent sessions.

Tags: AI memory · agentic development · Obsidian · developer-tools · context engineering · External Memory Series

External Memory Series (1 of 4) · Series hub · Start here for builders. Then 2 Productivity · 3 vs the diagram · 4 Governance
Background on petralian.com: The AI Memory Problem · Your Brain Was Not Built for This · Directing AI as Primary Engineer · Publishing Obsidian Drafts

Every AI coding session starts with a blank context window. The model does not remember last Tuesday's deploy lock rule, the webhook edge case you fixed in March, or the framework constraint that broke production once. If your operating model depends on the model "picking it up from the repo," you will re-pay discovery tax on every session.

I run production software this way—open-source tools like Gravio (AI quality scoring) and the stack behind petralian.com—with frontier models (Claude, GPT, and IDE agents) as the primary implementers. The system that makes that survivable is not a bigger context window. It is a three-layer external memory stack—files the human owns, the agent reads, and automation keeps fresh.

This article is the development-focused view: what the layers are, how they map to the popular short-term memory (STM) / long-term memory (LTM) / feedback mental model, what I automated in May 2026, and where the design is deliberately stronger than the diagram suggests.


The problem: three different failures, one word "memory"

When developers say "the AI forgot," they usually mean one of these:

| Failure | What breaks | Example |
| --- | --- | --- |
| Session | No thread from yesterday's work | Agent re-implements a fix you already shipped |
| Operational | No record of open loops, deploy state, incidents | Two sessions deploy concurrently |
| Conceptual | No durable product/architecture truth | Feature rules live only in chat, not in docs |

Calling all three "memory" and hoping a larger context window fixes them is a category error. Context windows are short-term buffers. Production work needs durable stores with explicit promotion rules: what stays in chat, what gets written to disk, and what becomes a permanent rule.

That is the problem this system solves.


Why it matters for AI-first development

AI-first development here means: the agent writes most code; the human sets direction, reviews risk, and owns decisions. That model fails without external memory because:

  • Local optimization — The agent makes a correct change in isolation that violates a constraint from two sessions ago.
  • Handoff fragility — You close the laptop mid-refactor; the next session has no structured resume point.
  • Audit gaps — You cannot explain why a deploy happened or which feature note was current at commit time.

McKinsey and GitHub have both published studies reporting large productivity gains for AI-assisted development, but those results come from settings where a human team still carries the tacit knowledge. Solo or small-team AI-first builds have to replace that tacit knowledge with written continuity. The memory system is that replacement.

If you skip it, you still ship—until complexity crosses the threshold where every session feels like onboarding a new contractor.


What is actually happening: four tiers, not three

I describe the system as three layers for clarity. In practice there are four tiers, which is intentional.

flowchart TB
  subgraph L1["Layer 1 — Short term"]
    CHAT["Agent chat + todos"]
    LIVE["Live workspace / git state"]
  end

  subgraph L2["Layer 2 — Operational"]
    SNAP["last-session-bootstrap snapshot"]
    HAND[".claude/NEXT_SESSION.md"]
    MEM["memories/repo/open-loops.md"]
    OPS["Obsidian Operations/*"]
  end

  subgraph L3["Layer 3 — Evergreen"]
    FEAT["Features/*.md"]
    ARCH["Architecture/*.md"]
    BRAIN["00_Brain/Conventions/*"]
  end

  subgraph L4["Layer 4 — Feedback hardened"]
    RULES["Agent instructions + rules"]
    AGENTS["Custom agents + skills"]
    HOOKS["Git + IDE session hooks"]
  end

  L1 -->|"session end protocol"| L2
  L2 -->|"promote durable facts"| L3
  L1 -->|"lessons → rules"| L4
  L4 --> L1
  L3 --> L1

Layer 1 — Short term (in tool and session)

Holds: Current conversation (Claude, ChatGPT, or your IDE agent), task lists, files open in the editor, uncommitted diffs.

Lifetime: Until the chat ends.

Does not hold: Anything you need next week.

This matches the infographic's "Short-Term Memory": current inputs, limited capacity, overwritten by default.

Layer 2 — Operational (session start / close / summaries)

Holds: What we are doing now and what blocks the next session.

| Store | Role |
| --- | --- |
| Operations/Session Summaries.md | One-line trail per session |
| Operations/Sessions/YYYY-MM-DD Topic.md | Full session narrative |
| Operations/AI Session Bridge.md | Current priority + paste-in bootstrap |
| .claude/NEXT_SESSION.md | Repo-local handoff (fast for agents) |
| memories/repo/*.md | Machine-readable mirror of handoff + gotchas |
| last-session-bootstrap.txt (or equivalent) | Git status + health checks (hook- or script-generated) |

Lifetime: Days to weeks; dated notes age out of "active" use but stay searchable.

This is the layer most diagrams skip. It is the shipping layer—the difference between "interesting chat" and "resume Monday without archaeology."

Layer 3 — Long term (human-readable context)

Holds: What the product is and how it must behave.

Examples from a live codebase:

  • Features/Scoring Engine.md — invariants, API rules, quality thresholds (Gravio-style)
  • Architecture/Database Schema.md — models and migration posture
  • 00_Brain/AI Agent Methodology.md — universal session loop (all projects)

Lifetime: Evergreen; updated in place, linked in a graph (Zettelkasten-style), not buried in chat logs.

This maps to "Long-Term Memory" in the diagram—but implemented as files you edit, not weights inside the model.

Layer 4 — Feedback hardened (rules automation)

Holds: Lessons that must never be re-learned.

| Mechanism | What it does |
| --- | --- |
| Session End footer | Contract: every work reply lists deploy state, files changed, next priority |
| Self-improvements field | Must cite the exact file path where a rule was written, or it did not happen |
| Agent instructions (AGENTS.md, etc.) | Grows when bugs reveal missing guardrails |
| Custom agents | Handoff-writer, release-manager: role-specific memory writers |
| IDE sessionStart hook (e.g. Cursor) | Runs session-start.ps1, writes bootstrap snapshot |
| Git post-commit hook | Appends commit hash to matching Features/*.md via feature-note-map.json |

This is the infographic's Feedback Loop, implemented more strictly than "user corrects the model." The system requires artifacts.
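
The "cite the file path or it did not happen" rule can be checked mechanically. Here is a minimal Python sketch, assuming each self-improvement entry is one line of text that wraps cited paths in backticks. That convention and the name `unproven_improvements` are assumptions for illustration, not part of the real footer contract:

```python
import re
from pathlib import Path

# Assumed convention: a cited path appears in backticks inside the entry.
PATH_IN_BACKTICKS = re.compile(r"`([^`]+)`")

def unproven_improvements(entries, root: Path):
    """Return the entries that do not cite at least one existing file."""
    failures = []
    for entry in entries:
        cited = PATH_IN_BACKTICKS.findall(entry)
        # An entry counts as proven only if a cited path exists under root.
        if not any((root / p).is_file() for p in cited):
            failures.append(entry)
    return failures
```

An entry such as "Will be more careful next time" fails the check because it names no file. Whether the cited file actually contains the new rule still needs a human or agent review.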


How the solution works (May 2026 implementation)

Bootstrap order (start of session)

Non-trivial work follows a fixed read order, documented in the agent instructions file (AGENTS.md or equivalent) and in the Obsidian note Operations/Workflow.md:

  1. .claude/NOTES.md + .claude/NEXT_SESSION.md
  2. memories/repo/index.md, open-loops.md, known-gotchas.md
  3. 00_Brain/AI Agent Methodology.md + Conventions/Deploy Playbook.md
  4. Project Operations/AI Session Bridge.md + Session Summaries.md
  5. Relevant Features/* and Architecture/* for the task
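
The read order above can be verified before the agent starts work. This is a minimal Python sketch, assuming every store is reachable under one root directory. In the setup described here the Obsidian notes live in separate vaults, so the flattened layout and the names `BOOTSTRAP_ORDER` and `bootstrap_report` are simplifications for illustration:

```python
from pathlib import Path

# Hypothetical flattening of the read order into paths under a single root.
BOOTSTRAP_ORDER = [
    ".claude/NOTES.md",
    ".claude/NEXT_SESSION.md",
    "memories/repo/index.md",
    "memories/repo/open-loops.md",
    "memories/repo/known-gotchas.md",
    "00_Brain/AI Agent Methodology.md",
    "00_Brain/Conventions/Deploy Playbook.md",
    "Operations/AI Session Bridge.md",
    "Operations/Session Summaries.md",
]

def bootstrap_report(root: Path, order=BOOTSTRAP_ORDER):
    """Split the read order into (present, missing), both kept in order."""
    present, missing = [], []
    for rel in order:
        (present if (root / rel).is_file() else missing).append(rel)
    return present, missing
```

Step 5 (task-specific Features and Architecture notes) is left out on purpose, since it depends on the work at hand. A missing file early in the list is a signal to fix the handoff before asking the agent for a plan.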

Automation added:

# Manual or hook-triggered
.\scripts\session-start.ps1
# git pull, git status, latest sentry/inbox/, worker health check

IDE session hook (e.g. Cursor .cursor/hooks.json, sessionStart):

  • Runs the script above
  • Writes a bootstrap snapshot file the agent can read on open
  • May return additional_context pointing agents at the snapshot

Known constraint: Some IDE builds drop injected context on session start due to timing. Treat the snapshot file on disk plus an always-on project rule as the fallback. Do not rely on injection alone.
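
The on-disk fallback is easy to reproduce. The real script is PowerShell. What follows is a Python stand-in, a minimal sketch assuming `git` is on the PATH and that a plain-text file is enough for the agent to read. The helper names `run`, `render_snapshot`, and `write_snapshot` are hypothetical, and the worker health check is omitted because it is project-specific:

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path

def run(cmd):
    """Run a command and return its output, or a note if it cannot run."""
    try:
        done = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"(could not run {' '.join(cmd)}: {exc})"
    return (done.stdout or done.stderr).strip()

def render_snapshot(sections, now=None):
    """Format named sections into one plain-text snapshot."""
    now = now or datetime.now(timezone.utc)
    lines = [f"Bootstrap snapshot {now.isoformat(timespec='seconds')}"]
    for title, body in sections.items():
        lines += ["", f"== {title} ==", body or "(empty)"]
    return "\n".join(lines) + "\n"

def write_snapshot(path=Path("last-session-bootstrap.txt")):
    """Collect git state and write it where the agent can read it."""
    sections = {
        "git status": run(["git", "status", "--short", "--branch"]),
        "last commit": run(["git", "log", "-1", "--oneline"]),
    }
    path.write_text(render_snapshot(sections), encoding="utf-8")
    return path
```

Because the snapshot is a file, an always-on project rule such as "read last-session-bootstrap.txt first" works even when the IDE drops the injected context.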

Session end (promotion rules)

At session end, the agent (or you) should:

  1. Finalize the dated session note
  2. Append one line to Session Summaries.md
  3. Update AI Session Bridge / NEXT_SESSION.md if priority changed
  4. Promote durable facts to Feature or Architecture notes—not session notes
  5. Append the Session End footer (deploy tag, test plan, next priority)
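
Step 2 is the easiest to automate. This is a minimal Python sketch, assuming one summary line per session in the form `- YYYY-MM-DD Topic: outcome`. That line format and the name `append_summary` are assumptions, since the exact layout of Session Summaries.md is not specified here:

```python
from datetime import date
from pathlib import Path
from typing import Optional

def append_summary(summaries: Path, topic: str, outcome: str,
                   day: Optional[date] = None) -> str:
    """Append one summary line, skipping it if the same line already exists."""
    day = day or date.today()
    line = f"- {day.isoformat()} {topic}: {outcome}"
    existing = summaries.read_text(encoding="utf-8") if summaries.exists() else ""
    if line in existing.splitlines():
        return line  # already logged, keep the trail free of duplicates
    # Start on a fresh line if the file does not end with a newline.
    sep = "" if existing == "" or existing.endswith("\n") else "\n"
    with summaries.open("a", encoding="utf-8") as fh:
        fh.write(f"{sep}{line}\n")
    return line
```

The duplicate check matters because both the agent and a hook may try to log the same session.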

Rule of thumb:

| If it matters… | Write to… |
| --- | --- |
| Only next session | NEXT_SESSION.md, Bridge |
| This week | Session note + Summaries |
| This product forever | Features/*.md, Architecture/*.md |
| Every project forever | 00_Brain/Conventions/* |
| Never again | Agent instructions or known-gotchas.md |

Commit → Feature note (automation)

npm run hooks:install   # once per clone: core.hooksPath = githooks/

After each commit, scripts/update-feature-notes.ps1 maps changed paths to feature names and appends:

## Commits
- `abc1234` (2026-05-26) — fix(feed): description from git subject

Mapping lives in scripts/feature-note-map.json (extend when you add modules).

This closes the loop between git truth and product memory without asking the agent to remember to update Obsidian.
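
The real updater is a PowerShell script. The mapping logic can be sketched in Python under two assumptions: that feature-note-map.json maps a feature name to a list of glob patterns, and that "## Commits" is the last section of each feature note. Both are guesses about the file layout, and `features_for` and `append_commit` are hypothetical names:

```python
from fnmatch import fnmatch
from pathlib import Path

def features_for(changed_paths, feature_map):
    """Return feature names whose patterns match any changed path."""
    return sorted(
        name
        for name, patterns in feature_map.items()
        if any(fnmatch(p, pat) for p in changed_paths for pat in patterns)
    )

def append_commit(note: Path, short_hash: str, day: str, subject: str) -> bool:
    """Append a commit line to the note; return False if already recorded."""
    text = note.read_text(encoding="utf-8") if note.exists() else ""
    if short_hash in text:
        return False
    # Entry format copied from the example shown above.
    entry = f"- `{short_hash}` ({day}) — {subject}"
    if "## Commits" not in text:
        text = (text.rstrip("\n") + "\n\n## Commits\n") if text else "## Commits\n"
    elif not text.endswith("\n"):
        text += "\n"
    note.write_text(text + entry + "\n", encoding="utf-8")
    return True
```

Note that `fnmatch` lets `*` match path separators, so a pattern like `src/feed/*` also covers nested files. The hash check keeps the hook safe to re-run, for example after an amended commit is replayed.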

Dual vault MCP

The agent reads:

  • Your project vault (e.g. 40_VSCode/<Project>/) — product truth for that repo
  • A shared brain vault (00_Brain/) — universal methodology and conventions

Without brain access, every project reinvents deploy footers and session loops. Brain-vault access is therefore a hard requirement in the retrofit checklist.


Comparison: my system vs the three-layer infographic

The popular diagram stacks Short-Term Memory → Long-Term Memory → Feedback Loops. Here is an honest scorecard.

| Diagram concept | My implementation | Match |
| --- | --- | --- |
| STM (conversation, attention) | Chat + todos + live repo | High |
| LTM (persists across sessions) | Obsidian evergreen + repo rules | High |
| Feedback adjusts memory | Footer + instructions + hooks | Higher than diagram |
| Automatic STM → LTM transfer | Session-end protocol + git hook | Medium (discipline + partial automation) |
| In-model attention | Not used; files are the attention system | Different by design |

Overall resemblance: ~70% on concepts, ~40% on structure, because I added an operational layer and a file-based LTM that the diagram does not show.

flowchart LR
  subgraph INFO["Infographic model"]
    A[STM] --> B[LTM]
    C[Feedback] --> A
    C --> B
  end

  subgraph MINE["This system"]
    D[Chat] --> E[Operational files]
    E --> F[Evergreen notes]
    D --> G[Rules + hooks]
    G --> D
    F --> D
  end

What I did differently (on purpose):

  1. Operational memory as first-class: Summaries and Bridge are not "LTM"; they are the resume tape.
  2. Two speeds in the repo: NEXT_SESSION.md for speed, Obsidian for humans and links.
  3. Feedback is contractual: Footer fields with file-path proof, not implicit learning.
  4. Automation at boundaries: Session start hook, post-commit feature updater; not inside the model.
  5. Cross-project brain: 00_Brain survives when you switch repos.

What breaks if you skip a layer

| Skip | Symptom |
| --- | --- |
| Layer 1 only (chat) | Repeated mistakes, no deploy coordination |
| No Layer 2 | Cannot resume; open loops live in your head |
| No Layer 3 | Agents re-derive architecture from code every time |
| No Layer 4 | Same bug class returns; no deploy lock discipline |

Limitations (honest)

  • Duplication risk: NOTES.md, memories/repo/, and Obsidian can drift unless session end updates all relevant stores.
  • Agent discipline: Hooks reduce but do not eliminate skipped session notes.
  • Feature map maintenance: New routes need patterns in feature-note-map.json.
  • IDE hook quirks: The bootstrap snapshot file exists because injection is not 100% reliable.

What you can do next

Minimal (one afternoon):

  1. Add .claude/NEXT_SESSION.md with Current Priority + Open Loops + Next 3 steps.
  2. Add Operations/Session Summaries.md (one line per session).
  3. Paste the Session End footer template into your agent instructions.
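
For step 1, the handoff file can be generated so its shape never varies. This is a minimal Python sketch, assuming Markdown headings for the three sections named above. The checkbox style and the name `render_next_session` are illustrative choices, not a required format:

```python
def render_next_session(priority, open_loops, next_steps):
    """Render a short handoff note with the three sections named above."""
    if not priority.strip():
        raise ValueError("a handoff needs a current priority")
    if len(next_steps) > 3:
        raise ValueError("keep it to the next 3 steps")
    lines = ["# Next session", "", "## Current Priority", priority.strip(),
             "", "## Open Loops"]
    # Fall back to an explicit marker so an empty section is still visible.
    lines += [f"- [ ] {loop}" for loop in open_loops] or ["- (none)"]
    lines += ["", "## Next 3 steps"]
    lines += [f"{i}. {step}" for i, step in enumerate(next_steps, start=1)]
    return "\n".join(lines) + "\n"
```

Writing the result to .claude/NEXT_SESSION.md at session end gives the next session a fixed place to start. The three-step cap is deliberate: a longer list turns the handoff back into a backlog.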

Standard (one project weekend):

  1. Create Features/*.md for your top three modules.
  2. Add scripts/session-start.ps1 (git pull + status + any health checks you need).
  3. Document bootstrap order in AGENTS.md or your agent instructions file.

Full (reference implementation):

  1. Dual-vault Obsidian + MCP paths.
  2. npm run hooks:install + feature-note-map.json.
  3. IDE sessionStart hook (Cursor and others support this pattern).

The goal is not perfection on day one. The goal is never starting a session cold on a codebase that already hurt you once.


Reader action

Pick one production codebase you touch weekly. Before the next agent session, write a 10-line NEXT_SESSION.md and one Feature note for the module you are about to change. Run the agent once with an explicit instruction: "Read NEXT_SESSION first, then the Feature note, then propose a plan."

If that session feels faster than the last, the external brain is working. Automate the boundaries next—not the thinking.


Related reading

This series: 2 — Personal productivity · 3 — Why files beat the diagram · 4 — Audit and governance

Published on Petralian: The AI Memory Problem · Your Brain Was Not Built for This · What I Learned Directing AI as My Primary Engineer · Why I Rebuilt Petralian on Next.js · Why AI Agent Output Quality Drifts