Claude Code now defaults to a 1 million token context window on Opus 4.6 and Sonnet 4.6, with no opt-in required and no pricing surcharge. This gives developers 5-10x more working memory than Cursor, Copilot, or Windsurf, and it changes how long coding sessions work.
TL;DR
- What changed: On March 13, 2026, Anthropic made the 1M context window generally available (GA) for Opus 4.6 and Sonnet 4.6. Previously it was 200K tokens by default and required beta headers or extra usage credits to go beyond that.
- Pricing: The long-context surcharge (2x input, 1.5x output for tokens beyond 200K) is gone. Flat rate pricing now applies regardless of context length.
- Who gets it: Max, Team, and Enterprise plan users on Claude Code. No extra purchase needed.
- Practical impact: 15% fewer compaction events, ~75,000 lines of code in a single session, 600 images/PDFs per request (up from 100).
- The catch: A bigger window does not mean you should ignore context management. Use /compact at 40% usage, and always maintain a solid CLAUDE.md file.
What Actually Changed on March 13?
Anthropic moved the 1 million token context window from beta to general availability across three layers simultaneously. The API no longer requires the context-1m-2025-08-07 beta header or Tier 4 status. The pricing surcharge for tokens beyond 200K is eliminated entirely. And Claude Code now defaults to the full 1M window for Max, Team, and Enterprise users without needing "extra usage" credits.
This was the final step in a gradual rollout. Claude Code v2.1.50 first gave Opus 4.6 fast mode access to 1M tokens. Version 2.1.73 made Opus 4.6 the default model on Bedrock, Vertex, and Microsoft Foundry. And v2.1.75 (March 13) removed the last gate: the extra usage requirement.
If you want to opt out and stick with the 200K window, set the environment variable CLAUDE_CODE_DISABLE_1M_CONTEXT=true.
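Since this is an ordinary environment variable, opting out is a one-line export in your shell (the variable name comes from the changelog; everything else here is standard shell):

```shell
# Revert Claude Code to the 200K context window for this shell session.
export CLAUDE_CODE_DISABLE_1M_CONTEXT=true

# Verify it is set before launching claude.
echo "$CLAUDE_CODE_DISABLE_1M_CONTEXT"

# To return to the 1M default later:
# unset CLAUDE_CODE_DISABLE_1M_CONTEXT
```

You can also set it inline for a single invocation (`CLAUDE_CODE_DISABLE_1M_CONTEXT=true claude`) without affecting other sessions.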
Sources: Anthropic 1M Context GA Blog Post | Claude Code Changelog
How Does This Compare to Other AI Coding Tools?
The 1 million token context window is the largest default among dedicated coding assistants. Here is how it stacks up against the competition.
| Tool | Effective Context Window | Notes |
|---|---|---|
| Claude Code (Opus 4.6) | 1,000,000 tokens | Default on Max/Team/Enterprise. No surcharge. |
| Cursor | ~120K-200K tokens | Supports models up to 200K but effective usable context is lower |
| GitHub Copilot | ~128K tokens | Draws from open files, recent files, repo structure |
| Windsurf | ~100K tokens | Session-level context tracking with codebase awareness |
| Gemini (via API) | 1M-2M tokens | Has had 1M+ for longer, but variable quality at range |
| GPT-5.4 (OpenAI) | 1,000,000 tokens | Recently launched with 1M, but loses 54% retrieval accuracy at scale |
Raw numbers only tell part of the story. Opus 4.6 scores 78.3% on MRCR v2 at 1M tokens, which Anthropic says is the highest among frontier models. GPT-5.4 reportedly loses 54% of its retrieval accuracy scaling from 256K to 1M. In other words: it is not just about how much you can fit in the window. It is about whether the model can actually use what is in there.
IDE-based tools like Cursor and Copilot compensate with semantic indexing and vector search over repositories. That approach can be more efficient for targeted lookups. But for long autonomous sessions (multi-hour CI loops, large refactors, complex debugging chains), raw context capacity matters. Having the model remember your decisions from 30 minutes ago without needing to re-explain them is a real workflow improvement.
What Does 1 Million Tokens Actually Look Like in Practice?
One million tokens translates to roughly 75,000 lines of code, or hundreds of documents loaded into a single session. Anthropic's own data shows a 15% decrease in compaction events across real Claude Code usage since the change.
That means fewer moments where the model suddenly "forgets" the architecture decision you made at the start of the session. Fewer times where you need to re-explain what you are building. And for media-heavy workflows, the limit jumped from 100 to 600 images or PDF pages per request.
Practically, this benefits three types of workflows the most:
- Long autonomous agent runs where Claude Code iterates on CI failures, runs tests, and fixes issues across multiple files over extended periods
- Large codebase refactors where you need the model to hold awareness of how changes in one file affect dozens of others
- Multi-system debugging where you are jumping between logs, config files, test output, and source code in a single session
For most daily tasks (writing a function, fixing a bug, generating a component), you will not come close to 1M tokens. Many developers report staying under 100K in typical sessions. The value is in removing the ceiling so you never hit it during the sessions that matter most.
Related reading: How to build a 99% SEO website in 12 minutes with Claude Code | Claude Code memory for marketing and SEO
Why You Still Need to Manage Your Context Window
A 1 million token context window does not mean you should treat it like an unlimited buffer. This is the part most people will get wrong.
Here is the reality: even with 1M tokens available, model performance can degrade as the context fills up. More tokens mean more noise for the model to sort through. Important instructions from early in the session get diluted by thousands of lines of tool output, file reads, and intermediate steps. The model is not losing that information; it is just competing for attention with everything else in the window.
The 40% rule: When your context window reaches approximately 40% usage, run the /compact command. This is not a panic button. It is routine maintenance. When you run /compact, give it clear instructions on what to preserve. Tell it what decisions were made, what files were modified, and what the next steps are. A good compact instruction looks like this:
/compact Preserve: we are refactoring the auth middleware in src/auth/. Files modified so far: middleware.ts, session.ts, types.ts. Decision: using JWT with refresh tokens instead of session cookies. Next: update the route handlers in src/routes/ to use the new middleware.
Without those instructions, compaction will summarize generically, and you will lose the specifics that matter.
The CLAUDE.md advantage: No matter how large your context window is, a well-maintained CLAUDE.md file will always outperform relying on context alone. Your CLAUDE.md loads at the start of every session and every compaction. It is the one thing that persists no matter what.
A good CLAUDE.md contains:
- Project structure and key file locations
- Coding conventions and patterns used in the project
- Common commands (test, build, deploy)
- Architecture decisions and their rationale
- Things the model should never do (destructive commands, specific patterns to avoid)
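As a sketch, a minimal CLAUDE.md covering those five areas might look like this. Every project name, path, and command below is invented for illustration; adapt it to your own repository:

```markdown
# CLAUDE.md

## Project structure
- src/server/ — API entry point and route handlers
- src/lib/ — shared utilities; most helpers live here

## Conventions
- TypeScript strict mode; no `any` without a comment explaining why

## Commands
- Test: npm test
- Build: npm run build

## Architecture decisions
- JWT with refresh tokens for auth (chosen over session cookies for the mobile client)

## Never do
- Never run destructive migrations without asking first
- Never edit generated files in src/generated/
```

Short and specific beats long and exhaustive: the file loads at the start of every session, so every line should earn its place.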
Think of it this way: the 1M context window is your short-term memory. CLAUDE.md is your long-term memory. You need both. The context window handles the current session. CLAUDE.md handles everything that should survive across sessions.
Related reading: Build the perfect SEO copywriter with Claude Skills | Replace Zapier and n8n with Claude Code cron jobs
What Does the Pricing Change Mean for Your Budget?
The pricing change might be more significant than the context window itself. During the beta period, using more than 200K tokens meant paying 2x on input tokens and 1.5x on output tokens. For heavy users running long agent sessions, this added up fast.
Now Opus 4.6 stays at $5 per million input tokens and $25 per million output tokens, regardless of whether you use 50K or 950K of the window. Sonnet 4.6 stays at $3/$15. A 900K-token request costs the same per-token rate as a 9K-token request.
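To make the saving concrete, here is the arithmetic for a single 900K-token input request at the rates quoted above. The old-surcharge figure assumes the 2x multiplier applied only to input tokens beyond 200K, as described earlier:

```shell
# Flat GA pricing: 900K input tokens at $5 per million.
flat=$(awk 'BEGIN { printf "%.2f", 900000 / 1e6 * 5 }')

# Old beta pricing: first 200K at $5/M, the remaining 700K at 2x ($10/M).
old=$(awk 'BEGIN { printf "%.2f", 200000 / 1e6 * 5 + 700000 / 1e6 * 10 }')

echo "flat: \$${flat}, old surcharge: \$${old}"
# prints: flat: $4.50, old surcharge: $8.00
```

Input cost for that request drops from $8.00 to $4.50, and output tokens see a similar reduction from the removed 1.5x multiplier.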
For Claude Code Max/Team/Enterprise subscribers, this is even simpler: the 1M window is included in your plan. No extra usage credits needed. You pay your subscription and use the full window.
The Cursor community immediately noticed this change and asked when it would cascade to their pricing, since Cursor pays Anthropic's API rates on behalf of users.
Who Should Care About This Update?
This update matters most if you fall into one of these categories. If you use Claude Code for anything beyond quick one-off tasks, the larger context window removes friction you may not have even noticed.
- Solo developers working on full-stack projects where you jump between frontend, backend, database, and deployment config in a single session
- Teams using Claude Code for code review where the model needs to hold the full PR diff plus surrounding context
- Anyone running autonomous agent workflows (Claude Code with cron jobs, CI pipelines, or monitoring scripts)
- Content creators and marketers who use Claude Code for SEO automation, batch content generation, or Google Workspace integrations
- API developers who previously had to manage the beta header and Tier 4 requirement
If you are already on Max, Team, or Enterprise, you do not need to do anything. The 1M window is already active. Check it by looking at the model identifier in your Claude Code status line. It should show "Opus 4.6 (1M context)."
Related reading: How to connect Claude to SEO data | What is Model Context Protocol
FAQ
Does the 1M context window cost extra on Claude Code?
No. As of March 13, 2026, the 1 million token context window is included by default for Max, Team, and Enterprise plan users. The previous "extra usage" requirement and the long-context pricing surcharge have both been removed. You pay your normal subscription rate.
Should I let my context window fill up to 1M tokens before compacting?
No. Run /compact when you reach approximately 40% of your context window. Larger context means more noise for the model to sort through, which can reduce the quality of responses. When compacting, provide specific instructions about what to preserve: files modified, decisions made, and next steps. This keeps your session focused and productive.
What is a CLAUDE.md file and why does it matter with a larger context window?
A CLAUDE.md file is a markdown file in your project root that Claude Code reads at the start of every session. It contains project structure, coding conventions, key commands, and architecture decisions. Even with 1M tokens of context, CLAUDE.md acts as persistent long-term memory that survives compaction and session restarts. The context window is short-term memory. CLAUDE.md is long-term memory. You need both.
How does Claude Code's 1M context compare to Cursor or GitHub Copilot?
Claude Code's 1M token context window is roughly 5-10x larger than Cursor (~120-200K), GitHub Copilot (~128K), and Windsurf (~100K). More importantly, Opus 4.6 maintains 78.3% retrieval accuracy at 1M tokens, which is the highest among current frontier models. Competing tools compensate with semantic indexing and vector search, which works well for targeted lookups but not for maintaining session-long awareness during complex multi-file tasks.
Can I opt out of the 1M context window and use the old 200K default?
Yes. Set the environment variable CLAUDE_CODE_DISABLE_1M_CONTEXT=true to revert to the 200K context window. Some developers prefer this for faster response times or tighter cost control on API usage. The opt-out is per-session, so you can switch between them as needed.
Bottom Line
The 1M context window going default is a meaningful upgrade, not because most sessions need a million tokens, but because it removes the ceiling for the sessions that do. Combined with the pricing surcharge elimination, it makes Claude Code significantly more practical for long, complex coding sessions.
But the real takeaway is this: context window size is a tool, not a strategy. The developers who get the most out of Claude Code are not the ones with the biggest context windows. They are the ones who maintain clean CLAUDE.md files, compact proactively at 40%, and structure their sessions with clear intent.
A well-organized 200K session will outperform a chaotic 1M session every time. The 1M window just means that when you do need the space, it is there.
Want to learn how to use AI tools like Claude Code for SEO and content? Join the AI Ranking community where we teach practical AI-powered SEO workflows every week.
Sources: Anthropic 1M Context GA Announcement | Claude Code Changelog v2.1.75 | Anthropic API Docs: Context Windows | The Decoder: Anthropic Drops Surcharge | Simon Willison's Coverage