Anthropic released Claude Opus 4.7 on April 16, 2026, and the headline number is the 1M-token context window — five times the 200k that Opus 4.6 shipped with (Anthropic, What's new in Claude Opus 4.7). It is generally available through the Claude API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry (AWS, Introducing Claude Opus 4.7 in Amazon Bedrock).
Pricing stayed put: $5 per million input tokens, $25 per million output tokens, with no long-context premium charged at the 200k+ threshold (Anthropic platform docs). That is the part most freelance developers underrate. Long-context windows on competitor models have historically come with 2x or 3x token-price multipliers at the upper tier. Opus 4.7 holds the 4.6 price across the full 1M.
The freelance engineering question is not "is 1M tokens cool" — it obviously is. The question is which workflows actually change as a result, and which ones do not.
What 1M tokens fits in practice
A useful mental conversion: 1M tokens is roughly 750,000 words, or 1,500 single-spaced pages of plain text. In code terms, that is a mid-size codebase. Most consulting engagements — a 50-file Next.js app, a 300-file Django monolith, a 200-file React Native client — fit inside Opus 4.7's context window in their entirety, including dependencies' type definitions and the relevant chunks of generated Prisma client code.
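To sanity-check whether a given client codebase fits, a quick estimate from file sizes is usually enough. This sketch uses the common ~4-characters-per-token rule of thumb, which is an assumption, not the Opus 4.7 tokenizer; measure with a real tokenizer before committing to a quote. Both helper names are illustrative:

```python
import os

def tokens_from_chars(total_chars: int, chars_per_token: float = 4.0) -> int:
    """Convert a character count to a rough token estimate."""
    return int(total_chars / chars_per_token)

def estimate_repo_tokens(path: str) -> int:
    """Sum source-file sizes under a directory and convert to tokens."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            if name.endswith((".py", ".ts", ".tsx", ".js", ".prisma")):
                total += os.path.getsize(os.path.join(root, name))
    return tokens_from_chars(total)

# A 100k-LOC codebase at ~40 chars/line is ~4M chars, roughly 1M tokens:
assert tokens_from_chars(100_000 * 40) == 1_000_000
```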
That fundamentally changes the agent loop. With a 200k window, an agent working on a 100k-LOC codebase has to swap context constantly — read a few files, summarize, drop, read more. The summarization step is where bugs come from. With a 1M window, the agent can hold the whole thing and reason over it.
The Anthropic platform docs note that SWE-bench Pro performance jumped from 53.4% on Opus 4.6 to 64.3% on Opus 4.7, a 10.9-point gain in a single version bump (Anthropic). Some of that is the larger context. Some is the new tokenizer (which produces 1x to 1.35x as many tokens per character but apparently buys real capability). Some is the model itself getting better. The point is that the gain is real and shows up in benchmarks, not just in marketing copy.
What changes in a freelance dev workflow
Five concrete shifts we are seeing in May 2026, two months in:
1. The "summarize and lose context" failure mode is mostly dead. A common pattern in 2025 agentic workflows was: agent reads files, agent summarizes, agent works from the summary, agent forgets the exact function signature it was supposed to call. With 1M tokens, the summarize step is optional for codebases under the threshold. The agent works against the source directly.
2. The xhigh effort level changes pricing math at the top. Opus 4.7 adds a new xhigh effort level between high and max, and it is the new default for Claude Code (Anthropic). For a freelance engineer running long-horizon agentic loops, the choice of effort level is now a deliberate trade-off across four tiers, not three. The honest practical guidance: stay on xhigh for client work unless you are running cheap exploratory loops.
3. Task budgets give you a per-loop estimate. Opus 4.7 introduces task budgets, an advisory token allowance that the model sees as a countdown during the agentic loop (Anthropic). This is not a hard cap (that is still max_tokens) — it is a soft signal that lets the model self-moderate. For freelance work where client billing is tied to AI cost, this is the first time you can credibly say "this task gets X budget" and have the model actually pace itself.
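If you want the same pacing behavior client-side, say for billing reconciliation rather than model self-moderation, you can track cumulative usage per loop and surface the remaining allowance yourself. This is a sketch of that approximation, not the API feature: `budget_note` and `run_loop` are illustrative names, and `call_model` stands in for a real API call returning a reply plus tokens used:

```python
def budget_note(budget: int, used: int) -> str:
    """Render the countdown the agent sees at each turn."""
    remaining = max(budget - used, 0)
    return f"[task budget: {remaining:,} of {budget:,} tokens remaining]"

def run_loop(call_model, budget: int = 200_000) -> int:
    """Run turns until the task reports done or the budget is exhausted."""
    used = 0
    while used < budget:
        reply, turn_tokens = call_model(budget_note(budget, used))
        used += turn_tokens
        if reply == "done":
            break
    return used
```

Because the budget is advisory in the API as well, a client-side tracker like this is also the natural place to cut a runaway loop off for real.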
4. The tokenizer change is a hidden cost. Opus 4.7 uses a new tokenizer that runs 1x to 1.35x as many tokens per character as 4.6 (Anthropic). Same prompt, sometimes 35% more tokens, same price per token. That is a meaningful cost increase you should bake into estimates before quoting a client a fixed-price AI-assisted engagement. If you were running 4.6 at $10 per build, 4.7 might be $12-13.50.
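The re-quote math is simple enough to script. This scales a known 4.6 cost by the 1.0x-1.35x band from the release notes; `requote` is a hypothetical helper for estimation only:

```python
def requote(cost_on_4_6: float, multiplier: float) -> float:
    """Scale a 4.6-era cost estimate by the tokenizer inflation factor."""
    assert 1.0 <= multiplier <= 1.35, "band from the release notes"
    return round(cost_on_4_6 * multiplier, 2)

assert requote(10.00, 1.0) == 10.00    # best case: no inflation
assert requote(10.00, 1.35) == 13.50   # worst case: top of the $12-13.50 range
```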
5. Vision actually works now. Opus 4.7 supports 2576px / 3.75MP images at 1:1 coordinate mapping (Anthropic; Caylent, Claude Opus 4.7 deep dive). For freelancers doing UI work — taking Figma screenshots, reading client-supplied wireframes, scrolling generated screenshots back to the model for verification — the resolution upgrade is meaningful. Pixel-level pointing and bounding-box localization improved noticeably.
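When a client screenshot exceeds the supported resolution and gets downscaled before the call, model-reported coordinates need scaling back to original pixels. A minimal sketch, assuming the client resizes so the longest side fits 2576px; that resize policy is an assumption for illustration, not documented preprocessing behavior:

```python
def to_original(x: float, y: float, orig_w: int, orig_h: int,
                max_side: int = 2576) -> tuple:
    """Scale a point from the resized image back to original pixels."""
    longest = max(orig_w, orig_h)
    # Images that already fit are assumed untouched, so coordinates are 1:1.
    scale = longest / max_side if longest > max_side else 1.0
    return (x * scale, y * scale)

assert to_original(100, 50, 2576, 1448) == (100.0, 50.0)   # fits: true 1:1
assert to_original(100, 50, 5152, 2896) == (200.0, 100.0)  # 2x downscale
```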
What does not change
A few realities the 1M context window does not fix:
- Latency. A 1M-token prompt is slower than a 200k-token prompt. For interactive workflows where the developer is waiting on the model, you still want to be careful about how much you stuff into a prompt. The win is not "always send everything"; it is "send everything when you need to."
- Output limits. Max output is still 128k tokens per turn (Anthropic). A 1M-token context with a 128k output ceiling means you cannot do a one-shot "rewrite this entire 500k-LOC codebase" — you still need an agentic loop with multiple turns.
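The ceiling arithmetic is worth making explicit when scoping a rewrite. This is a back-of-envelope turn count, not an API call:

```python
import math

MAX_OUTPUT = 128_000  # per-turn output ceiling from the platform docs

def turns_needed(total_output_tokens: int) -> int:
    """Minimum agentic turns for a rewrite emitting this many tokens."""
    return math.ceil(total_output_tokens / MAX_OUTPUT)

# A rewrite emitting 500k tokens takes at least four turns, never one:
assert turns_needed(500_000) == 4
assert turns_needed(128_000) == 1
```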
- Cost on very large codebases. A million input tokens is $5. That is fine for a single run. A consulting engagement that runs 50 such loops per week is $1,000/month in input alone before output. The freelance economics still demand sensible prompt design.
- Effective reasoning over long contexts. The model can hold 1M tokens. Whether it reasons equally well across all 1M is a separate question. Real-world benchmarks (including "needle in a haystack" recall) show Opus 4.7 holds up well, but performance gradients across context length are real. Putting the critical files at the start and end of the context, with the noise in the middle, still helps.
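The cost claim above checks out on an average month of 52/12 weeks:

```python
# 50 full-context loops a week at $5 per million input tokens lands
# just above $1,000/month before any output tokens.
PRICE_PER_M_INPUT_USD = 5.00
LOOPS_PER_WEEK = 50
WEEKS_PER_MONTH = 52 / 12

monthly_input_cost = LOOPS_PER_WEEK * WEEKS_PER_MONTH * PRICE_PER_M_INPUT_USD
assert round(monthly_input_cost) == 1083  # ~$1,000/month in input alone
```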
The breaking changes you will hit
Three Opus 4.7 API changes will bite teams migrating from 4.6:
- Extended thinking budgets are removed. Setting `thinking: {"type": "enabled", "budget_tokens": N}` now returns a 400 error. Adaptive thinking is the only thinking-on mode (Anthropic).
- Sampling parameters are removed. Setting `temperature`, `top_p`, or `top_k` returns 400. The recommended migration is to omit those parameters entirely and use prompting to guide behavior.
- Thinking content is omitted by default. If your product was streaming Claude's reasoning to users, set `"display": "summarized"` to restore the old behavior.
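One defensive way to migrate many call sites at once is to strip the removed parameters in a single wrapper rather than hunting down every caller. The parameter names below come from the migration notes; `build_request` and the model id are hypothetical, not SDK names:

```python
REMOVED_PARAMS = {"temperature", "top_p", "top_k"}  # now 400 on 4.7

def build_request(**kwargs) -> dict:
    """Drop 4.6-era sampling params and explicit thinking budgets."""
    cleaned = {k: v for k, v in kwargs.items() if k not in REMOVED_PARAMS}
    if isinstance(cleaned.get("thinking"), dict):
        # Explicit budget_tokens also returns 400; adaptive thinking only.
        cleaned["thinking"] = {k: v for k, v in cleaned["thinking"].items()
                               if k != "budget_tokens"}
    return cleaned

req = build_request(model="claude-opus-4-7", temperature=0.2,
                    thinking={"type": "enabled", "budget_tokens": 8000})
assert "temperature" not in req
assert req["thinking"] == {"type": "enabled"}
```

The trade-off of silent stripping is that callers never learn their settings were ignored; logging a warning per dropped key is a reasonable middle ground.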
For freelancers maintaining client integrations, these are not catastrophic but they are mandatory. The Anthropic migration guide is the authoritative reference.
Where Opus 4.7 sits in the 2026 model menu
A rough ranking, with caveats:
- Opus 4.7 is the best general-purpose coding agent in May 2026: SWE-bench Pro 64.3%, and the strongest option for long-horizon agentic work and knowledge tasks. Expensive but consistently capable.
- GPT-5 Codex / o3-pro are competitive on raw single-shot coding benchmarks but trail on agentic loops where the model has to maintain state across many turns.
- Gemini 2.5 Pro has a long context window of its own and is cheaper, but its agentic tool-use loop is still rougher than Anthropic's at the time of writing.
- Claude Sonnet 4.5 is the price-performance sweet spot for most freelance work where you do not need the full Opus reasoning depth.
For freelance engineers, the practical question is rarely "which model is best in isolation" — it is "which model wins in my actual harness, on my actual codebase, against my actual client SLAs." Opus 4.7 wins more of those tests than any other model in May 2026 for complex multi-file agentic work. It loses some of them to Sonnet 4.5 on cost.
The takeaway
The 1M context window is the right kind of unsexy upgrade — same price, more capability, fewer awkward workarounds. For freelance engineers running agentic loops on client codebases, the practical changes are: less summarization scaffolding, more direct source reading, a new effort tier to think about, and an advisory task-budget mechanism that finally gives clients a credible cost-cap conversation.
Opus 4.7 is not the inflection point. It is the model that quietly removes the last few reasons to keep your agent loops short.
Delivvo gives freelance engineers a single branded portal for proposals, contracts, file delivery, and invoices — so when the client asks how many Opus 4.7 agentic loops fit in the retainer, the scope, deliverables, and per-loop budget are already documented in one client URL. See how it works →
Written by The Delivvo team · May 12, 2026