Translated by AI
Optimizing AI Agent Long-term Memory with session-brief
Introduction
Giving Codex or Claude Code long-term memory makes it easier to inherit previous decisions and work policies. I have been using an Obsidian-style Markdown vault to record design decisions, strategy notes, handoffs, and current goals.
"Codex" here refers to OpenAI's coding agent, which can read and write local repositories. As with Claude Code, I treat the context I leave in files as something the next task is expected to use.
However, increasing the memory caused a different problem. The files to be read at startup became larger, and the agent started using more context to "read past logs every time" than for the actual work.
In this article, I describe how I went beyond simply adding long-term memory for AI agents and split the entry point out into a session-brief to keep startup light. The target audience is individual developers who want to give AI coding agents continuous context but are struggling with context bloat.
Practical examples of related article management are available at harness17/zenn-articles. This article covers only the operational patterns that are safe to share, not the contents of my private vault itself.
Startup Became Heavier as Memory Increased
At first, I thought that simply increasing the long-term memory would be enough.
- Writing current goals in goals.md
- Leaving design decisions in decisions/
- Leaving strategy notes in strategies/
- Leaving handoffs between Codex and Claude Code in handoffs/
This structure itself was useful. I became able to track why I made a design decision, which articles were awaiting publication, and which tasks were handed off to other agents, without being confined to conversation logs.
On the other hand, reading too many entry points at startup creates a different burden. If completed history stays in goals.md, a full index stays in decisions/index.md, and old handoffs sit directly in the directory, getting oriented at the start of every session becomes heavier.
When I was actually using this, goals.md grew to about 36KB and decisions/index.md to about 11KB. Even if the content is useful, it is too large to read in its entirety at the beginning of every session. I had intended to provide long-term memory, but I ended up reducing the free space available for immediate tasks.
Using session-brief as a Startup Entry Point
Therefore, I separated what is read at startup into its own file, session-brief.
session-brief is a thin brief that indicates "what to read as an entry point" rather than a detailed history. I put only the current focus, the reading policy, and the maintenance criteria for compression there, and the agent moves on to detailed notes only when necessary.
# session-brief
## Reading Policy
- The primary entry point at startup is this file, reminders, and only the beginning of goals if necessary.
- Do not expand goals, decisions/index, or handoffs in full.
- Read only the relevant links or recent files.
- For session log investigations, extract them in a structured manner and do not stream massive search results into the conversation.
## Current Focus
- Article and post-publication verification
- Distribution flow for individual development apps
- Lightweight maintenance of Skill Graph
What was important here was not to make session-brief a "dumping ground for summaries." If you make the summaries too thick, the startup entry point becomes bloated again.
I do not write details in the brief itself; the link targets are read only for tasks that need them. By sticking to this, I have been able to keep the amount of reading at startup small without deleting old decisions.
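As a rough illustration, the reading policy could be sketched in Python like this. It is a minimal sketch, not my actual tooling: the location of reminders.md, the 20-line cutoff for "the beginning of goals," and the function names are all assumptions made for the example.

```python
from pathlib import Path

# Assumed cutoff for "only the beginning of goals"; tune to your own vault.
GOALS_HEAD_LINES = 20


def read_head(path: Path, max_lines: int) -> str:
    """Return at most max_lines lines from the top of a file."""
    lines = []
    with path.open(encoding="utf-8") as f:
        for i, line in enumerate(f):
            if i >= max_lines:
                break
            lines.append(line)
    return "".join(lines)


def build_startup_context(vault: Path, include_goals: bool = False) -> str:
    """Concatenate only the thin entry points, never the full history."""
    parts = []
    # self/reminders.md is a hypothetical path; the brief itself lives in self/.
    for rel in ("self/session-brief.md", "self/reminders.md"):
        path = vault / rel
        if path.exists():
            parts.append(f"<!-- {rel} -->\n{path.read_text(encoding='utf-8')}")
    if include_goals:
        goals = vault / "self/goals.md"
        if goals.exists():
            head = read_head(goals, GOALS_HEAD_LINES)
            parts.append(f"<!-- self/goals.md (head) -->\n{head}")
    return "\n".join(parts)
```

Nothing under decisions/, handoffs/, or the archive is touched here; those are reached only by following a link or running a search during the task.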
Moving Old Data to Archive Instead of Deleting
If the only goal was to lighten the context, one could simply delete old history. However, my objective is not to "forget." I want to be in a state where I can explain later why a decision was made or which technology area an article belongs to.
Therefore, I adopted a policy of moving data to an archive instead of deleting it.
memory/
├── self/
│   ├── session-brief.md  # Startup entry point
│   └── goals.md          # Focused on active items
├── decisions/
│   └── index.md          # Lightweight map
├── ops/
│   └── handoffs/         # Only recent/ongoing items
└── archive/
    ├── goals/            # Completed history
    ├── decisions/        # Full index or old details
    └── handoffs/         # Completed handoffs
With this organization, goals.md shrank from approximately 36KB to about 2KB, and decisions/index.md from approximately 11KB to about 2.5KB.
More important than the reduction in file size itself is the clarification of the entry point's role. At startup, I read only "what I need to see now," and I fetch past details through searches or links. I was able to decouple long-term memory from the working context of each session.
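The "move, don't delete" step for goals.md might look like the following sketch. It assumes completed items are written as checked task-list lines (`- [x] ...`) and that archives are named by date; both are conventions invented for this example, so adapt them to however your own goals file marks completion.

```python
from datetime import date
from pathlib import Path
from typing import Optional, Tuple


def split_goals(text: str) -> Tuple[str, str]:
    """Separate checked items (assumed to start with "- [x]") from the rest."""
    active, done = [], []
    for line in text.splitlines(keepends=True):
        if line.lstrip().lower().startswith("- [x]"):
            done.append(line)
        else:
            active.append(line)
    return "".join(active), "".join(done)


def archive_completed_goals(vault: Path, today: Optional[date] = None) -> int:
    """Move completed lines from self/goals.md to archive/goals/; return the count."""
    goals = vault / "self" / "goals.md"
    active, done = split_goals(goals.read_text(encoding="utf-8"))
    if not done:
        return 0
    if not done.endswith("\n"):
        done += "\n"
    stamp = (today or date.today()).isoformat()
    archive_dir = vault / "archive" / "goals"
    archive_dir.mkdir(parents=True, exist_ok=True)
    # Write the archive first and append to it, so history is never lost
    # even if the script runs twice in one day or stops halfway.
    with (archive_dir / f"{stamp}.md").open("a", encoding="utf-8") as f:
        f.write(done)
    goals.write_text(active, encoding="utf-8")
    return len(done.splitlines())
```

The archive is written before goals.md is rewritten, which keeps the operation on the safe side of "never forget."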
Deciding What Not to Read at Startup
Another thing I decided after the compression was "what not to read at startup."
When managing memory for AI agents, deciding what they should not read every time turned out to matter more than deciding what they should read. I sort the files as follows:
| Type | Handling at Startup | When Needed |
|---|---|---|
| session-brief | Read | Always used as an entry point |
| reminders | Read | Used for checking deadlines |
| goals | Read only the necessary beginning | Read details when relevant to the current task |
| decisions/index | Do not expand fully | Search and read relevant decisions |
| handoff | Look at recent ones only | Read relevant targets during ongoing work or reviews |
| session logs | Do not read | Extract structured data only when investigating |
Since I started using this method, the current task gets dragged off course by old information far less often.
Long-term memory is not meant to be read entirely every time. It is sufficient if it can be searched and retrieved when needed. If you do not distinguish this, the more memory you add, the heavier your work becomes.
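One way to make this classification explicit is to hold it as data. The sketch below mirrors the table; the enum names, the dictionary keys, and the choice to treat unknown kinds as on-demand are assumptions for illustration rather than a fixed schema.

```python
from enum import Enum


class Startup(Enum):
    READ = "read"
    HEAD_ONLY = "read only the beginning"
    RECENT_ONLY = "look at recent ones only"
    ON_DEMAND = "do not load; search when needed"


# Mirrors the table above; the keys are illustrative labels.
STARTUP_POLICY = {
    "session-brief": Startup.READ,
    "reminders": Startup.READ,
    "goals": Startup.HEAD_ONLY,
    "decisions/index": Startup.ON_DEMAND,
    "handoff": Startup.RECENT_ONLY,
    "session logs": Startup.ON_DEMAND,
}


def startup_policy(kind: str) -> Startup:
    """Unknown kinds default to ON_DEMAND so new memory never loads by accident."""
    return STARTUP_POLICY.get(kind, Startup.ON_DEMAND)


def read_at_startup() -> list:
    """Kinds of memory that are touched at all when a session starts."""
    return sorted(k for k, v in STARTUP_POLICY.items() if v is not Startup.ON_DEMAND)
```

Defaulting to on-demand means a new category of notes has to be added to the startup set deliberately, which matches the idea that the "do not read" list is the one that needs deciding.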
Establishing Rules for Compression
Even after making it lightweight once, it will grow again with use. Therefore, I also established rules for when to perform compression.
## Maintenance Criteria for Compression
- When goals.md exceeds 8KB, move completed history to archive/goals/.
- When decisions/index.md exceeds 8KB, move the full index to archive/decisions/.
- Keep only recent or ongoing items directly under handoffs/.
- Do not append detailed logs to goals; instead, separate them into decision, strategy, or archive.
These criteria are not for strict performance tuning but act as operational warning lines. It does not mean things will break if the file size exceeds 8KB. However, in my operations, that is around the point where I start feeling that it is "too heavy to read at startup."
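A small check for these warning lines could be sketched as follows. The paths follow the directory layout shown earlier; treating 8KB as 8 × 1024 bytes and the function name are assumptions for the example.

```python
from pathlib import Path

# The 8KB warning line, taken here as 8 * 1024 bytes (an assumption).
WARN_BYTES = 8 * 1024

# Entry-point files to watch, relative to the vault root.
WATCHED = ("self/goals.md", "decisions/index.md")


def files_over_limit(vault: Path, limit: int = WARN_BYTES) -> list:
    """Return (relative path, size in bytes) for each watched file above the limit."""
    over = []
    for rel in WATCHED:
        path = vault / rel
        if path.is_file():
            size = path.stat().st_size
            if size > limit:
                over.append((rel, size))
    return over
```

Calling `files_over_limit(Path("memory"))` at the end of a session returns an empty list while everything is under the line, and otherwise names the files whose history is due to be moved to the archive. It only reports and moves nothing, in keeping with the criteria being warning lines rather than hard limits.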
With explicit criteria, compression stops being cleanup done on a whim. After a run of article publications, after finishing an implementation that spanned multiple sessions, or when handoffs pile up, it becomes easier to decide to review just the entry points.
Conclusion
Long-term memory for AI agents becomes difficult to handle if you only keep adding to it.
In my case, I separated the startup entry point into session-brief and moved completed histories and full indices to an archive. By separating what to read at startup, what not to read, and what to fetch only when necessary, I found it easier to achieve both searchability of memory and lightweight operation during work.
When building long-term memory, it is effective to first decide not just "what to keep," but "how to make the entry point thin enough to avoid reading everything every time."
Reference Links
- harness17/zenn-articles: A repository for managing articles and pre-publication review workflows.
- Model Context Protocol: A specification for connecting local knowledge and external tools to AI agents.
- Related Article: Leaving Design Decisions with AI in My-Skill-Graph for Reuse
- Related Article: Practical Notes on Managing Zenn Articles in a Repository and Running Pre-publication Reviews (Scheduled to be published simultaneously)