A Guide to Claude Code Cost Management for Those Shocked by Their Bill
Introduction
My bill from Anthropic for February: $323.63 (approx. 48,500 JPY).

I was on the Max 20x plan, but I still hit the limit. Because I had enabled "Additional Usage" in the settings and kept working on topped-up credits, the charges kept accumulating. The usage came from repeatedly verifying articles and experimenting with Claude Code, but I still turned pale when I first saw the total.

According to official Anthropic statistics, the average daily cost per developer is about $6 (approx. 900 JPY). 90% of users stay under $12 per day. This means the monthly guideline is $100–$200 (approx. 15,000–30,000 JPY).
$323 falls into the "overuse" category. That is precisely why I seriously researched cost management methods. In this article, I will summarize cost management techniques based on official documentation.
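To put these figures side by side, here is a small sketch that converts between daily and monthly costs. The day counts (20 working days, 30 calendar days, a 28-day February) are my own assumptions for illustration, not values from Anthropic, and the function names are hypothetical.

```python
# Rough conversion between daily and monthly cost figures.
# The day counts used below are assumptions, not official values.

AVERAGE_DAILY_USD = 6.0   # average per developer, as quoted in this article
P90_DAILY_USD = 12.0      # 90% of users stay under this, as quoted in this article


def monthly_cost(daily_usd: float, active_days: int) -> float:
    """Project a monthly cost from a daily cost and the number of active days."""
    return daily_usd * active_days


def daily_cost(monthly_usd: float, days_in_month: int) -> float:
    """Spread a monthly bill evenly over the days of the month."""
    return monthly_usd / days_in_month


if __name__ == "__main__":
    for days in (20, 30):  # assumed: weekdays only vs. every day
        projected = monthly_cost(AVERAGE_DAILY_USD, days)
        print(f"{days} days at the average rate: ${projected:.2f}")
    # Assumes a 28-day February with usage spread evenly across it.
    print(f"$323.63 per month as a daily rate: ${daily_cost(323.63, 28):.2f}")
```

At the average rate, both assumed day counts land inside the $100–$200 monthly range, which is why a $323 month stands out.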
Understanding Costs
What are you paying for?
Claude Code's pricing is determined by token consumption. Tokens are units of text that the AI processes.
Factors affecting cost:
| Factor | Impact |
|---|---|
| Number of files read | Consumes tokens every time a file is read |
| Length of conversation history | Longer conversations accumulate tokens with every request |
| Query complexity | More complex questions use more tokens for inference |
| Compaction frequency | You can suppress consumption by compacting regularly |
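To make the link between tokens and dollars concrete, here is a minimal sketch of a per-request cost estimate. The per-million-token prices are placeholder assumptions for illustration (check Anthropic's pricing page for the model you actually use), and the sketch ignores prompt caching, which can lower the real cost of repeated input.

```python
# Illustrative per-million-token prices in USD.
# These are assumptions for this sketch, not authoritative values.
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its input and output token counts."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


if __name__ == "__main__":
    # Same answer length, very different context sizes.
    small = request_cost(input_tokens=5_000, output_tokens=500)
    large = request_cost(input_tokens=100_000, output_tokens=500)
    print(f"small context: ${small:.4f}")
    print(f"large context: ${large:.4f}")
```

Under these assumed prices, the size of the context dominates the cost of a request, which is why files read and conversation length appear at the top of the table.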
Consumption even in the background
It is surprisingly little-known, but Claude Code consumes tokens even when you are doing nothing.
When I first found out, I double-checked, thinking, "Wait, I'm not doing anything?"
| Background process | Content |
|---|---|
| Conversation summarization | A job runs automatically to summarize previous conversations for the --resume feature |
However, the consumption is negligible, at less than $0.04 (approx. 6 JPY) per session. There is no need to worry about this.
Grasping the situation with /cost
First, it is important to know how much you are using.
How to use
Enter this in a Claude Code session:
/cost
Information displayed
| Item | Content |
|---|---|
| Total cost | Total cost of the current session |
| Total duration (API) | Time taken for API processing |
| Total duration (wall) | Actual elapsed time |
| Total code changes | Number of lines added/deleted in code |
Checking past usage
You can check your past usage by logging in to the Anthropic Console (https://console.anthropic.com). You can also set spending limits for your workspace.
7 Cost Reduction Techniques
1. Compress conversations with /compact
This is the most effective way to reduce costs. As a conversation grows longer, every request sends more tokens. /compact summarizes the conversation so far and keeps that growth in check.
/compact
Compaction also runs automatically when the context exceeds 95% of capacity.
You can also specify information to keep by using custom instructions:
/compact Make sure to keep the test results and the list of changed files
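The effect of compaction can be sketched with a toy simulation. It assumes that every request resends the whole history, that each turn adds a fixed number of tokens, and that compaction replaces the history with a fixed-size summary. All of the numbers and names are hypothetical, and the cost of the compaction request itself and prompt caching are ignored.

```python
from typing import Optional


def total_tokens_sent(
    turns: int,
    tokens_per_turn: int,
    compact_every: Optional[int] = None,
    summary_tokens: int = 2_000,
) -> int:
    """Sum the input tokens sent over a conversation.

    Each request resends the full history. If compact_every is set,
    the history is replaced by a summary after every N-th turn.
    """
    total = 0
    history = 0
    for turn in range(1, turns + 1):
        history += tokens_per_turn      # the new turn joins the history
        total += history                # the whole history is sent again
        if compact_every and turn % compact_every == 0:
            history = summary_tokens    # history shrinks to the summary
    return total


if __name__ == "__main__":
    without = total_tokens_sent(turns=40, tokens_per_turn=1_000)
    with_compact = total_tokens_sent(turns=40, tokens_per_turn=1_000, compact_every=10)
    print(f"no compaction:        {without:,} tokens")
    print(f"compact every 10:     {with_compact:,} tokens")
```

Without compaction the total grows roughly with the square of the number of turns. Under these assumptions, periodic compaction cuts the total to about a third.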
2. Reset context with /clear
When moving to an unrelated task, reset the conversation instead of continuing it.
/clear
You might think, "I want to carry over the information from reading the previous file," but leaving unnecessary information in the context worsens both cost and performance.
3. Provide specific instructions
Vague instructions trigger unnecessary file scans, wasting tokens.
❌ "Fix this bug"
✅ "An error occurs in the refreshToken function in src/auth/session.ts. Please fix it."
Specifying the file path alone saves a significant number of tokens that Claude would otherwise spend on exploration.
4. Delegate research to sub-agents
"Research" that involves reading a large number of files is costly. Since sub-agents run in a separate context, they suppress token consumption in the main conversation.
Use a sub-agent to research how the authentication system works
For more details, see Part 5: Leveraging Sub-agents.
5. Prevent wasted implementation with Plan Mode
If you ask for implementation immediately, you may end up heading in the wrong direction and needing to redo it. If you plan ahead in Plan Mode, you can prevent the generation of useless code.
Switch to Plan Mode.
First, read src/auth/ and understand the session management mechanism.
For more details, see Part 4: Prompting Techniques.
6. Control Extended Thinking
When Claude Code faces difficult problems, it may "think" internally. This is Extended Thinking, which can increase costs by up to $0.80 per request.
You can change thinking settings via /config:
| Setting | Behavior | Best for |
|---|---|---|
| Automatic (Default) | Claude judges as needed | Normal use |
| Off | Always disabled | When cost is the top priority |
For simple file edits or question-answering tasks, turning off Extended Thinking significantly lowers costs.
7. Don't bloat CLAUDE.md
CLAUDE.md is loaded in every single request. This means that for every line you add, the cost of every request increases.
Ask yourself about every line: "If I delete this, will Claude make a mistake?"
If the answer is No, that line is unnecessary.
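To get a feel for how much a long CLAUDE.md adds up to, here is a rough sketch. It uses the common rule of thumb of about 4 characters per token for English text, which is only an approximation and not how Claude's tokenizer actually counts. The function names are hypothetical, and prompt caching is ignored.

```python
CHARS_PER_TOKEN = 4  # rough heuristic for English text, an assumption


def estimate_tokens(text: str) -> int:
    """Approximate the token count of a text from its length in characters."""
    return len(text) // CHARS_PER_TOKEN


def claude_md_overhead(text: str, requests: int) -> int:
    """Approximate the tokens spent on resending the file over many requests."""
    return estimate_tokens(text) * requests


if __name__ == "__main__":
    # Stand-in for a 200-line CLAUDE.md with 80 characters per line.
    sample = ("x" * 79 + "\n") * 200
    per_request = estimate_tokens(sample)
    print(f"about {per_request:,} tokens per request")
    print(f"about {claude_md_overhead(sample, 100):,} tokens over 100 requests")
```

To try this on a real file, pass in the contents of your own CLAUDE.md, for example with `pathlib.Path("CLAUDE.md").read_text()`.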
Max Plan vs API: Which to Choose?
| | Claude Max 5x | Claude Max 20x | API (Pay-as-you-go) |
|---|---|---|---|
| Pricing Model | $100/month | $200/month | Pay for what you use |
| Usage Limit | 5x base | 20x base | Set your own spend limit |
| Upon Reaching Limit | Pause or additional usage | Pause or additional usage | Stop upon hitting limit |
| Cost Control | Watch for overages | Watch for overages | Monitor via /cost |
| Best for | Daily users | Power users | Occasional users, teams |
Decision Criteria:
- Use it for a few hours every day → Max Plan (but be mindful of overages)
- Use it a few times a week, for short periods → API might be cheaper
- Use it with a team → API (workspace management and spend limits are available)
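The first two criteria come down to a break-even comparison, which can be sketched as follows. This compares an estimated pay-as-you-go spend against the flat Max 5x fee only. It does not check whether the plan's usage limits would actually cover your workload, and the function names and sample inputs are hypothetical.

```python
MAX_5X_USD = 100.0  # flat monthly fee, from the table above


def api_monthly_cost(cost_per_day_usd: float, days_per_month: int) -> float:
    """Estimate monthly pay-as-you-go spend from a typical cost per active day."""
    return cost_per_day_usd * days_per_month


def cheaper_option(cost_per_day_usd: float, days_per_month: int) -> str:
    """Return which option looks cheaper under this simple break-even model."""
    if api_monthly_cost(cost_per_day_usd, days_per_month) < MAX_5X_USD:
        return "API"
    return "Max plan"


if __name__ == "__main__":
    # Assumed usage patterns at the average daily cost quoted in this article.
    print("8 days/month: ", cheaper_option(6.0, 8))
    print("22 days/month:", cheaper_option(6.0, 22))
```

Your own /cost readings per session are a better input than the average, since daily cost varies widely between users.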
Habits for Reducing Costs
| Habit | Effect |
|---|---|
| /clear when task changes | Prevents accumulation of unnecessary context |
| /compact during long conversations | Suppresses increase in token consumption |
| Explicitly specify file paths | Reduces exploration costs |
| Use "sub-agents" for research | Protects main context |
| Keep CLAUDE.md short | Lowers base cost for all requests |
| /clear after 2 failures | Prevents accumulation of failure context |
| Turn off Extended Thinking for simple tasks | Saves up to $0.80 per request |
Summary
| Point | Content |
|---|---|
| Average cost | Approx. $6/day ($100–$200/month) |
| Most important techniques | /compact, /clear, and Extended Thinking control |
| Fundamental principle | Context management = Cost management |
| Monitoring method | /cost command |
| Plan selection | Even with Max, watch for overages. Cost management is necessary for all plans |
Keep in mind:
Context management is directly linked not only to performance but also to cost.
/clear and /compact are the most cost-effective commands.
For those wanting further optimization: I explain advanced techniques such as hidden costs of MCP servers, 98% token reduction using Skills, and log filtering via Hooks in the Advanced Edition.