A Guide to Claude Code Cost Management for Those Shocked by Their Bill

Introduction

My bill from Anthropic for February: $323.63 (approx. 48,500 JPY).

[Screenshot: list of Anthropic receipts for February]

I was on the Max 20x plan and still hit the limit. I had enabled "Additional Usage" in the settings and kept working on recharged credits, and this bill was the result. It came from repeatedly verifying articles and running Claude Code experiments, but I still turned pale when I first saw it.

[Screenshot: Additional Usage settings screen, showing $321.02 used and an $89.85 balance]

According to Anthropic's official figures, the average cost per developer is about $6 per day (approx. 900 JPY), and 90% of users stay under $12 per day. The resulting monthly guideline is $100–$200 (approx. 15,000–30,000 JPY).
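
As a rough check on how the daily figures relate to the monthly guideline, the sketch below multiplies a daily cost by a number of working days. The 22-day month is an assumption chosen for illustration, not a number from the official documentation.

```python
# Rough monthly estimate from a daily cost figure.
# WORKING_DAYS is an assumption for illustration, not an official number.
WORKING_DAYS = 22


def monthly_cost(daily_usd: float, days: int = WORKING_DAYS) -> float:
    """Return the cost of `days` days at `daily_usd` per day."""
    return daily_usd * days


average = monthly_cost(6.0)   # the average developer
heavy = monthly_cost(12.0)    # the 90th-percentile ceiling
print(f"average: ${average:.0f}/month, heavy: ${heavy:.0f}/month")
```

Under this assumption the average developer lands inside the $100–$200 range, and my $323 bill is well over double that average.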

$323 falls into the "overuse" category. That is precisely why I seriously researched cost management methods. In this article, I will summarize cost management techniques based on official documentation.


Understanding Costs

What are you paying for?

Claude Code's pricing is determined by token consumption. Tokens are units of text that the AI processes.

Factors affecting cost:

| Factor | Impact |
| --- | --- |
| Number of files read | Consumes tokens every time a file is read |
| Length of conversation history | Longer conversations accumulate tokens with every request |
| Query complexity | More complex questions use more tokens for inference |
| Compaction frequency | You can suppress consumption by compacting regularly |
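
To see why conversation length matters so much, here is a minimal sketch. It assumes that every request resends the whole history and that each turn adds a fixed number of tokens. Both are simplifications: real turns vary in size, and the sketch ignores prompt caching and any other discounts.

```python
# Sketch: why long conversations get expensive.
# Assumption: every request resends the whole history, and each turn
# adds a fixed number of tokens. Real turns vary widely in size.

def cumulative_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens sent over `turns` requests."""
    total = 0
    history = 0
    for _ in range(turns):
        history += tokens_per_turn   # the new turn joins the history
        total += history             # the full history is sent again
    return total


for turns in (10, 20, 40):
    print(turns, cumulative_input_tokens(turns, tokens_per_turn=1_000))
```

In this model, doubling the number of turns roughly quadruples the total tokens sent, which is why the history length appears in the list of factors.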

Consumption even in the background

It is surprisingly little-known, but Claude Code consumes tokens even when you are doing nothing.

When I first found out, I double-checked, thinking, "Wait, I'm not doing anything?"

| Background process | Content |
| --- | --- |
| Conversation summarization | A job runs automatically to summarize previous conversations for the --resume feature |

However, the consumption is negligible, at less than $0.04 (approx. 6 JPY) per session. There is no need to worry about this.


Grasping the situation with /cost

First, it is important to know how much you are using.

How to use

Enter this in a Claude Code session:

/cost

Information displayed

| Item | Content |
| --- | --- |
| Total cost | Total cost of the current session |
| Total duration (API) | Time taken for API processing |
| Total duration (wall) | Actual elapsed time |
| Total code changes | Number of lines added/deleted in code |

Checking past usage

You can check your past usage by logging in to the Anthropic Console (https://console.anthropic.com). You can also set spending limits for your workspace.


7 Cost Reduction Techniques

1. Compress conversations with /compact

This is the most effective way to reduce costs. As a conversation grows longer, the number of tokens sent with each request increases. Use /compact to summarize the conversation and reduce token consumption.

/compact

Compaction also runs automatically when the context exceeds 95% of its capacity.

You can also specify information to keep by using custom instructions:

/compact Make sure to keep the test results and the list of changed files
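
The effect of compaction can be illustrated with a toy model. The turn size, the threshold, and the summary size below are all hypothetical, and the model ignores the cost of the summarization request itself, so treat it as an intuition aid and not as a measurement.

```python
# Sketch: effect of compacting the history once it passes a threshold.
# All numbers are hypothetical; real summaries vary in size, and the
# summarization request itself is not counted here.

def total_tokens_sent(turns, tokens_per_turn, compact_at=None, summary_size=2_000):
    """Sum of history sizes sent across all requests.

    If `compact_at` is set, the history is replaced by a summary of
    `summary_size` tokens whenever it exceeds `compact_at`.
    """
    total = 0
    history = 0
    for _ in range(turns):
        history += tokens_per_turn
        total += history
        if compact_at is not None and history > compact_at:
            history = summary_size   # the summary replaces the full history
    return total


without = total_tokens_sent(40, 1_000)
with_compaction = total_tokens_sent(40, 1_000, compact_at=10_000)
print(without, with_compaction)
```

With these made-up numbers, the compacted conversation sends roughly a third of the tokens of the uncompacted one.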

2. Reset context with /clear

When moving to an unrelated task, reset the conversation instead of continuing it.

/clear

You might think, "I want to carry over the information from reading the previous file," but leaving unnecessary information in the context worsens both cost and performance.

3. Provide specific instructions

Vague instructions trigger unnecessary file scans, wasting tokens.

❌ "Fix this bug"
✅ "An error occurs in the refreshToken function in src/auth/session.ts. Please fix it."

Specifying the file path alone saves many of the tokens Claude would otherwise spend on exploration.

4. Delegate research to sub-agents

"Research" that involves reading a large number of files is costly. Since sub-agents run in a separate context, they suppress token consumption in the main conversation.

Use a sub-agent to research how the authentication system works

For more details, see Part 5: Leveraging Sub-agents.

5. Prevent wasted implementation with Plan Mode

If you ask for implementation immediately, you may end up heading in the wrong direction and needing to redo it. If you plan ahead in Plan Mode, you can prevent the generation of useless code.

Switch to Plan Mode.
First, read src/auth/ and understand the session management mechanism.

For more details, see Part 4: Prompting Techniques.

6. Control Extended Thinking

When Claude Code faces difficult problems, it may "think" internally. This is Extended Thinking, which can increase costs by up to $0.80 per request.

You can change thinking settings via /config:

| Setting | Behavior | Best for |
| --- | --- | --- |
| Automatic (Default) | Claude judges as needed | Normal use |
| Off | Always disabled | When cost is the top priority |

For simple file edits or question-answering tasks, turning off Extended Thinking significantly lowers costs.

7. Don't bloat CLAUDE.md

CLAUDE.md is loaded in every single request. This means that for every line you add, the cost of every request increases.

Ask yourself about every line: "If I delete this, will Claude make a mistake?"

If the answer is No, that line is unnecessary.
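
To put a rough number on a CLAUDE.md file, the sketch below estimates its token count and its cumulative cost. It assumes about 4 characters per token (a common rule of thumb for English text, not an exact tokenizer) and a hypothetical input price. It also ignores prompt caching, so the real figure may be lower.

```python
# Sketch: estimate the per-request overhead of a CLAUDE.md file.
# Assumptions: ~4 characters per token (a rough heuristic, not a real
# tokenizer) and a hypothetical input price. Check current pricing.
from pathlib import Path

CHARS_PER_TOKEN = 4            # rough heuristic
USD_PER_MILLION_INPUT = 3.0    # hypothetical price


def estimate_tokens(text: str) -> int:
    """Approximate token count from character length."""
    return len(text) // CHARS_PER_TOKEN


def overhead_usd(text: str, requests: int) -> float:
    """Approximate cost of sending `text` with each of `requests` requests."""
    return estimate_tokens(text) * requests * USD_PER_MILLION_INPUT / 1_000_000


if __name__ == "__main__":
    path = Path("CLAUDE.md")
    if path.exists():
        content = path.read_text(encoding="utf-8")
        print(f"~{estimate_tokens(content)} tokens per request")
        print(f"~${overhead_usd(content, requests=1_000):.2f} per 1,000 requests")
```

Run it in the project root before and after trimming the file to see how much each deleted section saves.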


Max Plan vs API: Which to Choose?

| | Claude Max 5x | Claude Max 20x | API (Pay-as-you-go) |
| --- | --- | --- | --- |
| Pricing Model | $100/month | $200/month | Pay for what you use |
| Usage Limit | 5x base | 20x base | Set your own spend limit |
| Upon Reaching Limit | Pause or additional usage | Pause or additional usage | Stop upon hitting limit |
| Cost Control | Watch for overages | Watch for overages | Monitor via /cost |
| Best for | Daily users | Power users | Occasional users, teams |

Decision Criteria:

  • Use it for a few hours every day → Max Plan (but be mindful of overages)
  • Use it a few times a week, for short periods → API might be cheaper
  • Use it with a team → API (workspace management and spend limits are available)
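
The first two criteria can be turned into a simple break-even check. The plan prices come from the table above; the daily API cost and the number of days used are hypothetical inputs, and overage charges are ignored.

```python
# Sketch: compare a flat plan with pay-as-you-go API usage.
# Plan prices come from the table above; the usage figures are
# hypothetical, and overage charges are ignored.

def cheaper_option(daily_api_usd: float, days_used: int, plan_usd: float) -> str:
    """Return which option costs less for one month."""
    api_total = daily_api_usd * days_used
    return "API" if api_total < plan_usd else "Max plan"


print(cheaper_option(6.0, days_used=8, plan_usd=100.0))    # light use
print(cheaper_option(12.0, days_used=22, plan_usd=200.0))  # heavy daily use
```

Substitute your own daily figure from /cost to see which side of the break-even point you are on.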

Habits for Reducing Costs

| Habit | Effect |
| --- | --- |
| /clear when task changes | Prevents accumulation of unnecessary context |
| /compact during long conversations | Suppresses increase in token consumption |
| Explicitly specify file paths | Reduces exploration costs |
| Use "sub-agents" for research | Protects main context |
| Keep CLAUDE.md short | Lowers base cost for all requests |
| /clear after 2 failures | Prevents accumulation of failure context |
| Turn off Extended Thinking for simple tasks | Saves up to $0.80 per request |

Summary

| Point | Content |
| --- | --- |
| Average cost | Approx. $6/day ($100–$200/month) |
| Most important techniques | /compact, /clear, and Extended Thinking control |
| Fundamental principle | Context management = Cost management |
| Monitoring method | /cost command |
| Plan selection | Even with Max, watch for overages. Cost management is necessary for all plans |

Keep in mind:

Context management is directly linked not only to performance but also to cost.
/clear and /compact are the most cost-effective commands.

For further optimization, the Advanced Edition covers techniques such as the hidden costs of MCP servers, a 98% token reduction using Skills, and log filtering via Hooks.

