External Memory Series (1 of 4) — Series hub · Start here for builders. Then 2 Productivity · 3 vs the diagram · 4 Governance
Background on petralian.com: The AI Memory Problem · Your Brain Was Not Built for This · Directing AI as Primary Engineer · Publishing Obsidian Drafts
Every AI coding session starts with a blank context window. The model does not remember last Tuesday's deploy lock rule, the webhook edge case you fixed in March, or the framework constraint that broke production once. If your operating model depends on the model "picking it up from the repo," you will re-pay discovery tax on every session.
I run production software this way—open-source tools like Gravio (AI quality scoring) and the stack behind petralian.com—with frontier models (Claude, GPT, and IDE agents) as the primary implementers. The system that makes that survivable is not a bigger context window. It is a three-layer external memory stack—files the human owns, the agent reads, and automation keeps fresh.
This article is the development-focused view: what the layers are, how they map to the popular "STM / LTM / feedback" mental model, what we automated in May 2026, and where the design is deliberately stronger than the diagram suggests.
The problem: three different failures, one word "memory"
When developers say "the AI forgot," they usually mean one of these:
| Failure | What breaks | Example |
|---|---|---|
| Session | No thread from yesterday's work | Agent re-implements a fix you already shipped |
| Operational | No record of open loops, deploy state, incidents | Two sessions deploy concurrently |
| Conceptual | No durable product/architecture truth | Feature rules live only in chat, not in docs |
Calling all three "memory" and hoping a larger context window fixes them is a category error. Context windows are short-term buffers. Production work needs durable stores with explicit promotion rules: what stays in chat, what gets written to disk, and what becomes a permanent rule.
That is the problem this system solves.
Why it matters for AI-first development
AI-first development here means: the agent writes most code; the human sets direction, reviews risk, and owns decisions. That model fails without external memory because:
- Local optimization — The agent makes a correct change in isolation that violates a constraint from two sessions ago.
- Handoff fragility — You close the laptop mid-refactor; the next session has no structured resume point.
- Audit gaps — You cannot explain why a deploy happened or which feature note was current at commit time.
McKinsey and GitHub have published studies reporting large productivity gains for AI-assisted development, but those studies assume a human team that carries tacit knowledge. Solo or small-team AI-first builds replace tacit knowledge with written continuity. The memory system is that replacement.
If you skip it, you still ship—until complexity crosses the threshold where every session feels like onboarding a new contractor.
What is actually happening: four tiers, not three
I describe the system as three layers for clarity. In practice there are four tiers, which is intentional.
```mermaid
flowchart TB
    subgraph L1["Layer 1: Short term"]
        CHAT["Agent chat + todos"]
        LIVE["Live workspace / git state"]
    end
    subgraph L2["Layer 2: Operational"]
        SNAP["last-session-bootstrap snapshot"]
        HAND[".claude/NEXT_SESSION.md"]
        MEM["memories/repo/open-loops.md"]
        OPS["Obsidian Operations/*"]
    end
    subgraph L3["Layer 3: Evergreen"]
        FEAT["Features/*.md"]
        ARCH["Architecture/*.md"]
        BRAIN["00_Brain/Conventions/*"]
    end
    subgraph L4["Layer 4: Feedback hardened"]
        RULES["Agent instructions + rules"]
        AGENTS["Custom agents + skills"]
        HOOKS["Git + IDE session hooks"]
    end
    L1 -->|"session end protocol"| L2
    L2 -->|"promote durable facts"| L3
    L1 -->|"lessons → rules"| L4
    L4 --> L1
    L3 --> L1
```
Layer 1 — Short term (in tool and session)
Holds: Current conversation (Claude, ChatGPT, or your IDE agent), task lists, files open in the editor, uncommitted diffs.
Lifetime: Until the chat ends.
Does not hold: Anything you need next week.
This matches the infographic's "Short-Term Memory": current inputs, limited capacity, overwritten by default.
Layer 2 — Operational (session start / close / summaries)
Holds: What we are doing now and what blocks the next session.
| Store | Role |
|---|---|
| `Operations/Session Summaries.md` | One-line trail per session |
| `Operations/Sessions/YYYY-MM-DD Topic.md` | Full session narrative |
| `Operations/AI Session Bridge.md` | Current priority + paste-in bootstrap |
| `.claude/NEXT_SESSION.md` | Repo-local handoff (fast for agents) |
| `memories/repo/*.md` | Machine-readable mirror of handoff + gotchas |
| `last-session-bootstrap.txt` (or equivalent) | Git status + health checks (hook- or script-generated) |
Lifetime: Days to weeks; dated notes age out of "active" use but stay searchable.
This is the layer most diagrams skip. It is the shipping layer—the difference between "interesting chat" and "resume Monday without archaeology."
Layer 3 — Long term (human-readable context)
Holds: What the product is and how it must behave.
Examples from a live codebase:
- `Features/Scoring Engine.md`: invariants, API rules, quality thresholds (Gravio-style)
- `Architecture/Database Schema.md`: models and migration posture
- `00_Brain/AI Agent Methodology.md`: universal session loop (all projects)
Lifetime: Evergreen; updated in place, linked in a graph (Zettelkasten-style), not buried in chat logs.
This maps to "Long-Term Memory" in the diagram—but implemented as files you edit, not weights inside the model.
Layer 4 — Feedback hardened (rules automation)
Holds: Lessons that must never be re-learned.
| Mechanism | What it does |
|---|---|
| Session End footer | Contract: every work reply lists deploy state, files changed, next priority |
| Self-improvements field | Must cite exact file path where a rule was written—or it did not happen |
| Agent instructions (`AGENTS.md`, etc.) | Grows when bugs reveal missing guardrails |
| Custom agents | Handoff-writer, release-manager: role-specific memory writers |
| IDE `sessionStart` hook (e.g. Cursor) | Runs `session-start.ps1`, writes bootstrap snapshot |
| Git post-commit hook | Appends commit hash to matching `Features/*.md` via `feature-note-map.json` |
This is the infographic's Feedback Loop, implemented more strictly than "user corrects the model." The system requires artifacts.
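A contract like this can be checked mechanically instead of by eye. The sketch below is one way to do it, written in Python purely for illustration (the scripts in this article are PowerShell). It assumes a hypothetical footer made of `Key: value` lines; the field names and the regular expression are assumptions modeled on the table, not the actual footer template.

```python
import re

# Hypothetical field names modeled on the table; the real footer template may differ.
REQUIRED_FIELDS = ("Deploy state", "Files changed", "Next priority", "Self-improvements")

# A self-improvement only counts if it names something that looks like a file path.
PATH_PATTERN = re.compile(r"[\w./\\-]+\.\w+")


def parse_footer(text: str) -> dict[str, str]:
    """Parse 'Key: value' lines into a dict, ignoring unrelated lines."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in REQUIRED_FIELDS:
            fields[key.strip()] = value.strip()
    return fields


def footer_problems(text: str) -> list[str]:
    """Return contract violations; an empty list means the footer passes."""
    fields = parse_footer(text)
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not fields.get(name)]
    improvement = fields.get("Self-improvements", "")
    # Assumption: a literal "none" is an acceptable way to say nothing was learned.
    if improvement and improvement.lower() != "none" and not PATH_PATTERN.search(improvement):
        problems.append("Self-improvements must cite a file path")
    return problems
```

A footer that claims a lesson without naming the file it was written to fails the check, which is the "or it did not happen" rule expressed as code.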
How the solution works (May 2026 implementation)
Bootstrap order (start of session)
Non-trivial work follows a fixed read order—documented in your agent instructions file, AGENTS.md, and Obsidian Operations/Workflow.md:
1. `.claude/NOTES.md` + `.claude/NEXT_SESSION.md`
2. `memories/repo/index.md`, `open-loops.md`, `known-gotchas.md`
3. `00_Brain/AI Agent Methodology.md` + `Conventions/Deploy Playbook.md`
4. Project `Operations/AI Session Bridge.md` + `Session Summaries.md`
5. Relevant `Features/*` and `Architecture/*` for the task
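A fixed read order is easy to enforce in a small loader. The following is a minimal sketch in Python (used here for illustration only; the article's scripts are PowerShell). It covers only the repo-local files from the first two steps, and it assumes `open-loops.md` and `known-gotchas.md` sit next to `index.md` under `memories/repo/`. The function name and the `===` section markers are hypothetical.

```python
from pathlib import Path

# Repo-local part of the read order. Vault files (00_Brain, Operations, Features)
# could be appended once their root directories are known.
BOOTSTRAP_ORDER = (
    ".claude/NOTES.md",
    ".claude/NEXT_SESSION.md",
    "memories/repo/index.md",
    "memories/repo/open-loops.md",
    "memories/repo/known-gotchas.md",
)


def load_bootstrap(root: Path, order: tuple[str, ...] = BOOTSTRAP_ORDER) -> tuple[str, list[str]]:
    """Concatenate the files that exist, in order, and list the ones that are missing."""
    sections, missing = [], []
    for relative in order:
        path = root / relative
        if path.is_file():
            body = path.read_text(encoding="utf-8").strip()
            sections.append(f"=== {relative} ===\n{body}")
        else:
            missing.append(relative)
    return "\n\n".join(sections), missing
```

Returning the missing files separately makes a gap in the handoff visible at session start rather than silently skipping it.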
Automation added:
```powershell
# Manual or hook-triggered
.\scripts\session-start.ps1
# git pull, git status, latest sentry/inbox/, worker health check
```
IDE session hook (e.g. Cursor `.cursor/hooks.json` → `sessionStart`):
- Runs the script above
- Writes a bootstrap snapshot file the agent can read on open
- May return `additional_context` pointing agents at the snapshot
Known constraint: Some IDE builds drop injected context on session start due to timing. Treat the snapshot file on disk plus an always-on project rule as the fallback. Do not rely on injection alone.
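The file-on-disk fallback is the part worth sketching. Below is a minimal Python illustration (the actual script is PowerShell and also runs `git pull`, inbox checks, and health checks, all omitted here). The function names and the snapshot layout are assumptions. The git runner is passed in as a parameter so the formatting can be exercised without a repository, and `run_git` assumes `git` is on the PATH.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable


def run_git(args: list[str]) -> str:
    """Run a git command and return its output, or the error text if it fails."""
    result = subprocess.run(["git", *args], capture_output=True, text=True)
    return (result.stdout if result.returncode == 0 else result.stderr).strip()


def build_snapshot(run: Callable[[list[str]], str] = run_git) -> str:
    """Assemble the snapshot text from the current branch, status, and last commit."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    return "\n".join([
        f"# Session bootstrap ({stamp})",
        "## Branch",
        run(["rev-parse", "--abbrev-ref", "HEAD"]),
        "## Status",
        run(["status", "--short"]) or "(clean)",
        "## Last commit",
        run(["log", "-1", "--oneline"]),
    ])


def write_snapshot(path: Path, run: Callable[[list[str]], str] = run_git) -> Path:
    """Write the snapshot to disk so the agent can read it even if injection fails."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(build_snapshot(run) + "\n", encoding="utf-8")
    return path
```

Because the snapshot is a plain file, an always-on project rule can simply tell the agent to read it first, independent of whether the IDE injected anything.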
Session end (promotion rules)
At session end, the agent (or you) should:
- Finalize the dated session note
- Append one line to `Session Summaries.md`
- Update `AI Session Bridge` / `NEXT_SESSION.md` if priority changed
- Promote durable facts to Feature or Architecture notes, not session notes
- Append the Session End footer (deploy tag, test plan, next priority)
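The two mechanical steps in this list, naming the dated note and appending the one-line trail, are small enough to script. Here is a hedged Python sketch (illustration only; the article's own tooling is PowerShell). The `date | topic | outcome` line format and both function names are assumptions; only the `YYYY-MM-DD Topic.md` naming pattern and the `Session Summaries.md` filename come from the stores described earlier.

```python
from datetime import date
from pathlib import Path


def session_note_path(operations: Path, day: date, topic: str) -> Path:
    """Path of the dated session note, following the 'YYYY-MM-DD Topic.md' pattern."""
    return operations / "Sessions" / f"{day.isoformat()} {topic}.md"


def append_summary(operations: Path, day: date, topic: str, outcome: str) -> str:
    """Append one line to Session Summaries.md and return the line that was written."""
    line = f"- {day.isoformat()} | {topic} | {outcome}"
    summaries = operations / "Session Summaries.md"
    summaries.parent.mkdir(parents=True, exist_ok=True)
    # Append mode keeps the trail chronological and never rewrites earlier sessions.
    with summaries.open("a", encoding="utf-8") as handle:
        handle.write(line + "\n")
    return line
```

Promoting durable facts and writing the footer still need judgment; only the bookkeeping is automated here.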
Rule of thumb:
| If it matters… | Write to… |
|---|---|
| Only next session | NEXT_SESSION.md, Bridge |
| This week | Session note + Summaries |
| This product forever | Features/*.md, Architecture/*.md |
| Every project forever | 00_Brain/Conventions/* |
| Never again | Agent instructions or known-gotchas.md |
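The same table can be expressed as a lookup that turns a list of facts into a write plan. This is a small Python sketch for illustration; the horizon keys (`next_session`, `this_week`, and so on) are hypothetical labels, and the store names are copied from the table rather than resolved to real paths.

```python
# Destinations copied from the table; the horizon keys are hypothetical labels.
# The last row says "or" in the table; listing both stores is a simplification.
DESTINATIONS = {
    "next_session": ["NEXT_SESSION.md", "AI Session Bridge"],
    "this_week": ["Session note", "Session Summaries.md"],
    "product": ["Features/*.md", "Architecture/*.md"],
    "all_projects": ["00_Brain/Conventions/*"],
    "never_again": ["Agent instructions", "known-gotchas.md"],
}


def plan_promotions(facts: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Group (fact, horizon) pairs by the stores each fact should be written to."""
    plan: dict[str, list[str]] = {}
    for fact, horizon in facts:
        if horizon not in DESTINATIONS:
            raise ValueError(f"unknown horizon: {horizon}")
        for store in DESTINATIONS[horizon]:
            plan.setdefault(store, []).append(fact)
    return plan
```

For example, `plan_promotions([("deploy lock rule", "never_again")])` files the rule under both the agent instructions and `known-gotchas.md`, leaving the human to pick the better home.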
Commit → Feature note (automation)
```shell
npm run hooks:install   # once per clone: core.hooksPath = githooks/
```

After each commit, `scripts/update-feature-notes.ps1` maps changed paths to feature names and appends:

```markdown
## Commits
- `abc1234` (2026-05-26) — fix(feed): description from git subject
```

Mapping lives in `scripts/feature-note-map.json` (extend when you add modules).
This closes the loop between git truth and product memory without asking the agent to remember to update Obsidian.
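To make the mechanism concrete, here is a Python sketch of the two steps a post-commit updater has to perform: match changed paths to notes, then append a commit line. It is an illustration, not the PowerShell script itself. The map format (glob pattern to note path) is an assumption about `feature-note-map.json`, the function names are hypothetical, and the sketch assumes `## Commits` is the last section of each note.

```python
import json
from fnmatch import fnmatchcase
from pathlib import Path


def load_mapping(path: Path) -> dict[str, str]:
    """Load the map; assumed shape is {"src/feed/*": "Features/Feed.md", ...}."""
    return json.loads(path.read_text(encoding="utf-8"))


def notes_for_paths(changed: list[str], mapping: dict[str, str]) -> list[str]:
    """Return the feature notes whose glob pattern matches at least one changed path."""
    notes = []
    for pattern, note in mapping.items():
        if note not in notes and any(fnmatchcase(path, pattern) for path in changed):
            notes.append(note)
    return notes


def append_commit(note: Path, short_hash: str, day: str, subject: str) -> None:
    """Append a commit line under '## Commits', creating the heading if it is absent."""
    text = note.read_text(encoding="utf-8") if note.exists() else ""
    # \u2014 is the dash used between the date and the subject in the entry format.
    entry = f"- `{short_hash}` ({day}) \u2014 {subject}"
    if entry in text:
        return  # already recorded, so re-running the hook stays idempotent
    if "## Commits" not in text:
        prefix = text.rstrip("\n") + "\n\n" if text.strip() else ""
        text = prefix + "## Commits\n"
    note.write_text(text.rstrip("\n") + "\n" + entry + "\n", encoding="utf-8")
```

Paths that match no pattern are simply ignored, which is why the map needs extending whenever a new module appears.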
Dual vault MCP
The agent reads:
- Your project vault (e.g. `40_VSCode/<Project>/`): product truth for that repo
- A shared brain vault (`00_Brain/`): universal methodology and conventions
Without brain access, every project reinvents deploy footers and session loops, so access to the brain vault is a hard requirement in the retrofit checklist.
Comparison: my system vs the three-layer infographic
The popular diagram stacks Short-Term Memory → Long-Term Memory → Feedback Loops. Here is an honest scorecard.
| Diagram concept | My implementation | Match |
|---|---|---|
| STM (conversation, attention) | Chat + todos + live repo | High |
| LTM (persists across sessions) | Obsidian evergreen + repo rules | High |
| Feedback adjusts memory | Footer + instructions + hooks | Higher than diagram |
| Automatic STM → LTM transfer | Session-end protocol + git hook | Medium (discipline + partial automation) |
| In-model attention | Not used; files are the attention system | Different by design |
Overall resemblance: ~70% on concepts, ~40% on structure, because I added an operational layer and a file-based LTM that the diagram does not show.
```mermaid
flowchart LR
    subgraph INFO["Infographic model"]
        A[STM] --> B[LTM]
        C[Feedback] --> A
        C --> B
    end
    subgraph MINE["This system"]
        D[Chat] --> E[Operational files]
        E --> F[Evergreen notes]
        D --> G[Rules + hooks]
        G --> D
        F --> D
    end
```
What I did differently (on purpose):
- Operational memory as first-class: Summaries and Bridge are not "LTM"; they are the resume tape.
- Two speeds in the repo: `NEXT_SESSION.md` for speed, Obsidian for humans and links.
- Feedback is contractual: footer fields with file-path proof, not implicit learning.
- Automation at boundaries: session start hook and post-commit feature updater, not inside the model.
- Cross-project brain: `00_Brain` survives when you switch repos.
What breaks if you skip a layer
| Skip | Symptom |
|---|---|
| Layer 1 only (chat) | Repeated mistakes, no deploy coordination |
| No Layer 2 | Cannot resume; open loops live in your head |
| No Layer 3 | Agents re-derive architecture from code every time |
| No Layer 4 | Same bug class returns; no deploy lock discipline |
Limitations (honest)
- Duplication risk: `NOTES.md`, `memories/repo/`, and Obsidian can drift unless session end updates all relevant stores.
- Agent discipline: Hooks reduce but do not eliminate skipped session notes.
- Feature map maintenance: New routes need patterns in `feature-note-map.json`.
- IDE hook quirks: The bootstrap snapshot file exists because injection is not 100% reliable.
What you can do next
Minimal (one afternoon):
- Add `.claude/NEXT_SESSION.md` with Current Priority + Open Loops + Next 3 steps.
- Add `Operations/Session Summaries.md` (one line per session).
- Paste the Session End footer template into your agent instructions.
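If a blank file is the obstacle, a tiny generator can produce the first handoff. The sketch below is in Python for illustration only; the three section names come from the first step above, while the heading level, the function name, and the strict three-step check are assumptions.

```python
def render_next_session(priority: str, open_loops: list[str], next_steps: list[str]) -> str:
    """Render a short handoff: Current Priority, Open Loops, and the next three steps."""
    if len(next_steps) != 3:
        raise ValueError("expected exactly three next steps")
    loops = [f"- {item}" for item in open_loops] or ["- (none)"]
    steps = [f"{number}. {step}" for number, step in enumerate(next_steps, start=1)]
    lines = [
        "## Current Priority",
        priority,
        "",
        "## Open Loops",
        *loops,
        "",
        "## Next 3 steps",
        *steps,
    ]
    return "\n".join(lines) + "\n"
```

Writing the result to `.claude/NEXT_SESSION.md` gives the next session a resume point of roughly ten lines, short enough that it actually gets updated.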
Standard (one project weekend):
- Create `Features/*.md` for your top three modules.
- Add `scripts/session-start.ps1` (git pull + status + any health checks you need).
- Document bootstrap order in `AGENTS.md` or your agent instructions file.
Full (reference implementation):
- Dual-vault Obsidian + MCP paths.
- `npm run hooks:install` + `feature-note-map.json`.
- IDE `sessionStart` hook (Cursor and others support this pattern).
The goal is not perfection on day one. The goal is never starting a session cold on a codebase that already hurt you once.
Reader action
Pick one production codebase you touch weekly. Before the next agent session, write a 10-line NEXT_SESSION.md and one Feature note for the module you are about to change. Run the agent once with an explicit instruction: "Read NEXT_SESSION first, then the Feature note, then propose a plan."
If that session feels faster than the last, the external brain is working. Automate the boundaries next—not the thinking.
Related reading
This series: 2 — Personal productivity · 3 — Why files beat the diagram · 4 — Audit and governance
Published on Petralian: The AI Memory Problem · Your Brain Was Not Built for This · What I Learned Directing AI as My Primary Engineer · Why I Rebuilt Petralian on Next.js · Why AI Agent Output Quality Drifts





