Addressing the 'Lost-in-the-Middle' Phenomenon in ClaudeCode Workflows
Record of Countermeasures for Lost in the Middle and ClaudeCode Workflows
⚠️ Note
The "Lost in the Middle" phenomenon discussed in this article is observed in LLMs, including Claude. However, its universality and severity vary depending on the situation.
"Long context length does not necessarily mean Lost in the Middle will occur."
These countermeasures are merely "temporary responses to Lost in the Middle occurring in personal development" and do not provide a fundamental solution for LLMs in general.
A real solution requires deliberate prompt design, together with mechanisms for task splitting and for controlling which documents are referenced.
Introduction
CLAUDE.md Specifications
When introducing Claude into a development workflow, using CLAUDE.md to have rules loaded at the beginning offers the following benefits:
- ✅ Automatically reference project-specific rules and workflows
- ✅ Explicitly state granular policies and approval flows in advance
- ✅ Encourage "consistent judgment" immediately after startup
📚 Reference: CLAUDE.md Specifications and Memory Management
Furthermore, by linking detailed documentation from CLAUDE.md, it is possible to load granular rules as well.
The Problem of Bloated Context
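However, if you include too much, you may find that Claude forgets settings or premises you gave it.
This is attributed to a problem characteristic of LLMs known as Lost in the Middle: information placed in the middle of a long context tends to be used less reliably than information near the beginning or the end.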
In small-scale projects or early stages, the volume of information is low and "the middle section is thin," so the impact of Lost in the Middle is not very visible.
However, as CLAUDE.md and the linked documentation grow, important information becomes buried in the middle, is easily ignored, and leads to incorrect behavior.
Chapter 1: The Reality of the Problem
In my personal development environment, the following "runaway behaviors" were repeatedly observed:
- Unexpected Execution: Executing scripts that had been marked as deprecated
- Forgotten PR Request: Ignoring the request for a Pull Request and only pushing
- Misunderstanding Instructions: Arbitrarily changing the specified "Strategy B" to a different approach
- Discrepancy in Completion Reports: Reporting "100% complete" when it is actually incomplete
- Unauthorized Execution: Only permitted to create Google Sheets entries → performed cleanup on its own
- Repeated Forbidden Actions: Repeatedly running the forbidden `git add -A`
👉 The common essence is that "Claude's judgment of efficiency takes precedence over user instructions."
Chapter 2: Quantitative Evidence
Semantic Overlap Analysis
To confirm the cause of the problem, I investigated the semantic overlap of CLAUDE.md and the linked documentation.
- Semantic Overlap = "The degree to which the same meaning or concept is redundantly described in multiple files."
- The higher the degree of overlap, the harder it is for the LLM to identify the correct source of information, and inference accuracy decreases.
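The article does not state how the overlap figures were computed. As one possible way to approximate the idea, the sketch below scores each pair of documents by the Jaccard similarity of their vocabulary. The term extraction, the function names, and the sample texts are all assumptions made for illustration, not the method actually used.

```python
import re
from itertools import combinations


def concept_terms(text: str) -> set[str]:
    """Lowercased words of 4+ letters, a crude stand-in for 'concepts'."""
    return set(re.findall(r"[a-z]{4,}", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Shared terms divided by all terms that appear in either set."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def overlap_matrix(docs: dict[str, str]) -> dict[tuple[str, str], float]:
    """Pairwise Jaccard similarity for every pair of documents."""
    terms = {name: concept_terms(body) for name, body in docs.items()}
    return {
        (x, y): jaccard(terms[x], terms[y])
        for x, y in combinations(sorted(terms), 2)
    }


# Hypothetical miniature inputs, not the real project files.
docs = {
    "a.md": "Plan approval is mandatory before starting work",
    "b.md": "Plan approval is required before implementation",
    "c.md": "Release notes for the tracker",
}
for pair, score in overlap_matrix(docs).items():
    print(pair, round(score, 2))
```

Word-level Jaccard only captures shared vocabulary. Detecting the same rule phrased differently would likely need embeddings or manual review.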
| Category | Number of Files | Overlap | Impact |
|---|---|---|---|
| Approval Related | 21 | 43% | Direct cause of policy violations |
| Version Control | 33 | 68% | Induces indecision |
| Workflow Related | 48 | 99% | Overlap in almost all files |
- Total 97 files / Estimated over 100,000 tokens
- For many models, accuracy tends to decrease at this scale, and important information tends to be overlooked.
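How the token total was estimated is not stated. A rough estimate of this kind can be produced with a characters-per-token heuristic, as in the sketch below. The ratio of 4 characters per token and the function names are assumptions; actual counts depend on the model's tokenizer and on the language of the text.

```python
from pathlib import Path

# Assumed heuristic; real tokenizers vary by model and by language.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate token count via ceiling division of the character count."""
    return -(-len(text) // CHARS_PER_TOKEN)


def estimate_tree(root: str, pattern: str = "*.md") -> tuple[int, int]:
    """Return (file count, estimated token total) for files under root."""
    files = [p for p in Path(root).rglob(pattern) if p.is_file()]
    total = sum(
        estimate_tokens(p.read_text(encoding="utf-8", errors="ignore"))
        for p in files
    )
    return len(files), total
```

Calling `estimate_tree("docs")` on a documentation folder would give a file count and a ballpark token total to compare against the model's context window.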
📸 Figure: Overlap Matrix (Example)

📸 Figure: Overlap Distribution Chart (Example)

For example, the types of overlaps are as follows:
📋 Approval-Related Overlap (43% Overlap, 21 Files)
Overlap Example: "Approval"
【docs/workflows/checklists/tracker_workflow_checklist.md】
SOW Approval Acquisition: SOW content approval is mandatory before starting work
Approval 1: SOW/Detailed Plan Approval
【docs/workflows/enhanced_approval_workflow.md】
Approval 1: Plan Approval
Approval 2: Implementation Policy Approval
Approval 3: Test Result Approval
Approval 1: Plan Approval (Duplicate Description)
Approval 2: Implementation Policy Approval (Duplicate Description)
Approval 3: Test Result Approval (Duplicate Description)
Reasons for decreased inference accuracy:
- The same concept of "Approval" is described with different expressions in multiple places
- Claude cannot determine "which approval rule takes precedence"
- It is indistinguishable whether it is approval during Phase transition, SOW approval, or plan approval
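Conflicts like the one between the two "Approval 1" definitions above could be surfaced mechanically. The following is a minimal sketch that assumes the rules are written as lines of the form `Approval N: label`. The regular expression, the function names, and the in-memory sample are illustrative assumptions; the sample text only mirrors the excerpts quoted above.

```python
import re
from collections import defaultdict

# Matches lines such as "Approval 1: Plan Approval".
APPROVAL_LINE = re.compile(r"^\s*Approval\s+(\d+)\s*:\s*(.+?)\s*$", re.MULTILINE)


def approval_labels(docs: dict[str, str]) -> dict[int, dict[str, set[str]]]:
    """Map approval number -> label -> set of files using that label."""
    found: dict[int, dict[str, set[str]]] = defaultdict(lambda: defaultdict(set))
    for name, body in docs.items():
        for number, label in APPROVAL_LINE.findall(body):
            found[int(number)][label].add(name)
    return found


def conflicting_approvals(docs: dict[str, str]) -> dict[int, dict[str, set[str]]]:
    """Keep only approval numbers that carry more than one distinct label."""
    return {n: labels for n, labels in approval_labels(docs).items() if len(labels) > 1}


# Sample text mirroring the excerpts quoted above.
docs = {
    "tracker_workflow_checklist.md": "Approval 1: SOW/Detailed Plan Approval\n",
    "enhanced_approval_workflow.md": (
        "Approval 1: Plan Approval\n"
        "Approval 2: Implementation Policy Approval\n"
        "Approval 3: Test Result Approval\n"
    ),
}
for number, labels in conflicting_approvals(docs).items():
    print(f"Approval {number} has {len(labels)} different definitions")
```

Each reported number is a candidate for consolidation into a single authoritative definition.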
🔄 Version Control Overlap (68% Overlap, 33 Files)
Overlap Example: "Version"
【CLAUDE.md】
✅ Update only minor versions: v0.9.1 → v0.9.2 → v0.9.3
❌ Prohibition of middle version updates: v0.9.x → v0.10.0
【README.md】
Update only minor versions (v0.9.1 → v0.9.2)
【CHANGELOG.md】
v0.9.35, v0.9.34, v0.9.32 (Actual version history)
Reasons for decreased inference accuracy:
- Identical versioning rules are scattered across multiple files
- Claude is confused about which information source to trust
- Confusion occurs due to the mix of actual CHANGELOG records and CLAUDE.md rules
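One way to reduce this ambiguity would be to encode the versioning rule once, in executable form, rather than restating it in several documents. The sketch below is an assumption about how the rule quoted from CLAUDE.md could be checked: only the last component may increase, and the first two must stay fixed. The function names are hypothetical.

```python
import re

VERSION = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def parse_version(version: str) -> tuple[int, int, int]:
    """Parse a string such as 'v0.9.35' into (0, 9, 35)."""
    match = VERSION.match(version)
    if match is None:
        raise ValueError(f"not a version string: {version!r}")
    first, second, third = (int(part) for part in match.groups())
    return first, second, third


def is_allowed_bump(current: str, proposed: str) -> bool:
    """Allow only an increase of the last component (v0.9.1 -> v0.9.2).

    A change to either of the first two components (v0.9.x -> v0.10.0)
    is rejected, as is an unchanged or lower version.
    """
    cur = parse_version(current)
    new = parse_version(proposed)
    return new[:2] == cur[:2] and new[2] > cur[2]


print(is_allowed_bump("v0.9.1", "v0.9.2"))    # True
print(is_allowed_bump("v0.9.35", "v0.10.0"))  # False
```

Gaps in the last component are permitted here because the CHANGELOG excerpt itself skips a number (v0.9.32 to v0.9.34).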
🔧 Workflow-Related Overlap (99% Overlap, 48 Files)
Overlap Example: "Workflow"
【docs/workflows/checklists/tracker_workflow_checklist.md】
Improved 13-step, 4-phase workflow
Phase 0.5: Branch validation phase (Required/Independent execution)
Phase 1: Planning/Preparation phase (Steps 0-4)
13-step, 4-phase workflow (duplicate description at the end)
【docs/workflows/enhanced_approval_workflow.md】
Phase 1: Filing/Planning phase
Phase 2: Implementation/Testing phase
Phase 3: CI/Quality workflow phase
【docs_backup_20250903/workflows/enhanced_approval_workflow.md】
(The exact same content exists in the backup)
Reasons for decreased inference accuracy:
- Almost all workflow files describe "13 steps" and "4 phases"
- The same content is duplicated in the backup folder (docs_backup_20250903/)
- Phase numbering is inconsistent (Phase 0.5, Phase 1, Phase 2...)
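Exact copies such as the backup folder are the easiest kind of overlap to detect. As a minimal sketch, the code below groups files by the SHA-256 hash of their contents. The function names are hypothetical, and only byte-identical duplicates are found, not near-duplicates.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 hex digest of the file's raw bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_duplicates(root: str, pattern: str = "*.md") -> list[list[Path]]:
    """Return groups of files under root whose contents are byte-identical."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(Path(root).rglob(pattern)):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Running `find_duplicates(".")` from the repository root would list any group in which a file under `docs/` has an identical twin under a backup folder, which then becomes a candidate for removal from the context Claude can see.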
Chapter 3: Solution Plan
Based on the analysis so far, ideally, several countermeasures are needed, such as "integrating extraction commands," "mandatory PRs," and "introducing execution permission levels." However, the response this time (PR #86, PR #89) was limited to a subset.
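The article does not describe what "execution permission levels" would look like in practice. Purely as an illustration, the sketch below assigns each command a required level and rejects anything above the level granted for the current task. The level names, the command table, and the `check` function are all hypothetical and are not part of PR #86 or PR #89.

```python
import shlex
from enum import IntEnum


class Permission(IntEnum):
    READ_ONLY = 0
    CREATE = 1
    MODIFY = 2
    DESTRUCTIVE = 3


# Hypothetical policy tables for illustration only.
FORBIDDEN_PREFIXES = [("git", "add", "-A")]
REQUIRED_LEVEL = {
    ("git", "status"): Permission.READ_ONLY,
    ("git", "add"): Permission.MODIFY,
    ("git", "push"): Permission.MODIFY,
    ("rm",): Permission.DESTRUCTIVE,
}


def starts_with(tokens: list[str], prefix: tuple[str, ...]) -> bool:
    return tuple(tokens[: len(prefix)]) == prefix


def check(command: str, granted: Permission) -> tuple[bool, str]:
    """Return (allowed, reason) for a shell command under a granted level."""
    tokens = shlex.split(command)
    if any(starts_with(tokens, prefix) for prefix in FORBIDDEN_PREFIXES):
        return False, "forbidden command"
    # The longest matching prefix wins; unknown commands need the top level.
    matches = [p for p in REQUIRED_LEVEL if starts_with(tokens, p)]
    needed = REQUIRED_LEVEL[max(matches, key=len)] if matches else Permission.DESTRUCTIVE
    if granted < needed:
        return False, f"requires {needed.name}, granted {granted.name}"
    return True, "ok"


print(check("git add -A", Permission.DESTRUCTIVE))
print(check("git add README.md", Permission.MODIFY))
print(check("rm -rf build", Permission.CREATE))
```

A prefix match is deliberately simple and would miss equivalents such as `git add --all`, so a real guard would need a more careful parser. The point of the design is that the rule is enforced outside the model's context, so it cannot be lost in the middle of it.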
Primary Response (PR #86, #89)
- Refreshing and integrating CLAUDE.md and related documentation
- Explicitly indicating "what to do next" in conjunction with UI/UX (to reduce procedure skipping and incorrect reporting)
Fundamental Response (Future Tasks)
- RAG: Retrieve necessary information on demand
- Few-shot: Present good examples to encourage learning
- Summarization: Compress and organize important information
- Context Engineering: Introduce information design philosophy
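To make the RAG item concrete, the sketch below shows the retrieval half of the idea: instead of loading every rule up front, only the rules relevant to the current request are selected. Real RAG systems usually rank by embedding similarity; plain keyword overlap is swapped in here to keep the example self-contained. The rule texts and function names are hypothetical.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercased alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, snippets: list[str], k: int = 2) -> list[str]:
    """Return up to k snippets sharing the most words with the query.

    Snippets with no shared words are dropped; ties keep the original order.
    """
    query_terms = tokenize(query)
    scored = [
        (len(query_terms & tokenize(snippet)), index, snippet)
        for index, snippet in enumerate(snippets)
    ]
    ranked = sorted(
        (item for item in scored if item[0] > 0),
        key=lambda item: (-item[0], item[1]),
    )
    return [snippet for _, _, snippet in ranked[:k]]


# Hypothetical rule snippets, not the actual contents of CLAUDE.md.
rules = [
    "Never run git add -A; stage files explicitly.",
    "Open a pull request after every push.",
    "Update only the last version component.",
]
print(retrieve("stage files with git add", rules))
```

Only the retrieved snippets would then be placed in the prompt, which keeps the context short regardless of how large the full rule set grows.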
Furthermore, the research community is exploring the following approaches:
| Technology Name | Overview | Merits/Challenges |
|---|---|---|
| Infini-attention | Combines compressed memory + local attention + linear attention (arXiv) | Retains long-range dependencies while improving computational efficiency. However, there is a risk of information loss during compression. |
| Sparse / Graph-based Attention | Limits attention to local areas or graph structures instead of all tokens (arXiv) | Reduces computational volume. However, there is a possibility of losing long-range dependencies. |
| Squeezed Attention | Skips attention by clustering inputs (ACL Anthology) | Effective for specific use cases. Unsuitable for general purposes. |
| MInference | Accelerates pre-filling with dynamic sparse attention (OpenReview) | Improves speed for large inputs. However, there is a risk of decreased accuracy. |
| SampleAttention | Dynamically selects sparse patterns during execution (ResearchGate) | Reduces processing while maintaining accuracy. However, additional computational costs arise. |
Conclusion and Notes
- Lost-in-the-Middle is not a universal phenomenon
  → Context length does not necessarily cause issues; the impact varies depending on the model and design.
- PR #86 / #89 are emergency measures
  → Accuracy can be improved, but fundamental solutions require information design and prompt engineering.
Things to keep in mind:
- It cannot be concluded that "context length = cause of Lost-in-the-Middle."
- This response is a temporary strengthening measure.
Reference Links
- Pull Request #86 – KIRO-006 Phase 2 Implementation
- Pull Request #89 – CLAUDE.md Integration Response
- List of diff files for Pull Request #86
- Zenn Article: Reflections on Lost in the Middle (External)
- Lost in the Middle Paper (Liu et al., 2023)
- Found in the Middle (Zhang et al., 2024)
- Lost in the Middle in Long-Text Generation (2025)