One of the three features launching with Claude Opus 4.8 is effort control — a new setting on claude.ai and Cowork (and xhigh/max in Claude Code) that lets you decide how much effort Claude puts into a response. It sits right next to the model selector, and it's available on all plans. On higher effort settings, Claude thinks more frequently and more deeply for better responses. On lower effort settings, Claude responds faster and uses your rate limits more slowly. It's a simple control with real consequences for quality, speed, and cost — and most users won't know which setting to pick.
This guide explains exactly when to use each effort level, how it affects your token usage and rate limits, and which tasks justify spending more effort versus which are fine on the default. Get this right and you'll get dramatically better results on hard problems while conserving your usage on easy ones.
Key Takeaway
Opus 4.8 defaults to "high" effort — the best balance of quality and experience, spending similar tokens to Opus 4.7's default but with better results. Use "extra" (xhigh in Claude Code) for difficult tasks and long-running asynchronous workflows. Use "max" for the hardest problems where you want maximum quality regardless of token cost. Lower effort responds faster and conserves rate limits for simple tasks. Higher effort uses more tokens, so Anthropic raised Claude Code rate limits to accommodate it.
What Effort Control Actually Does
Effort control adjusts how much "thinking" Claude does before and during its response. At higher effort, Claude reasons more frequently and more deeply — exploring more of the problem, considering more angles, and checking its own work more thoroughly. This produces better answers on complex tasks but consumes more tokens and time. At lower effort, Claude responds more directly and quickly, which is ideal for straightforward tasks where deep reasoning would be overkill and would just waste tokens and rate limits.
Opus 4.8 defaults to "high" effort, which Anthropic judges to be the best overall balance of quality and user experience. Importantly, on coding tasks, this default high effort spends a similar number of tokens as Opus 4.7's default — but delivers better performance. So the default isn't more expensive than what you were already using; it's just better. Above the default, you have "extra" (called xhigh in Claude Code) and "max," which spend progressively more tokens for progressively better results on hard problems.
When to Use Each Effort Level
Default (high): Leave it here for most work. It's the balanced setting that handles the majority of tasks well — general questions, standard coding, writing, analysis, and everyday use. You don't need to touch the control for routine work; the default is tuned to be the right choice most of the time.
Extra (xhigh in Claude Code): Anthropic specifically recommends this for difficult tasks and long-running asynchronous workflows. If you're handing Claude a complex coding problem, a multi-step analysis, or an agentic task that will run unattended for a while, bump it to extra. The additional thinking pays off on problems where the first answer isn't likely to be the best answer. This is the setting for "this is hard and I want it done right."
Max: Reserve this for the hardest problems where quality matters more than token cost — complex architectural decisions, intricate debugging, high-stakes analysis, or any task where you'd rather spend more tokens than risk a suboptimal answer. Max effort uses the most tokens, so it's not the setting for routine work, but for genuinely difficult problems it extracts the most from the model.
Lower effort: Drop below default for simple, high-volume tasks where speed and rate-limit conservation matter more than depth — quick lookups, simple rewrites, routine formatting, or when you're working through many small tasks and want to preserve your usage. Lower effort responds faster and uses your rate limits more slowly.
📬 Getting value from this?
One actionable AI insight per week. Plus a free prompt pack when you subscribe.
Subscribe free →Effort Level Quick Reference
| Effort Level | Best For | Token Use |
|---|---|---|
| Lower | Simple, high-volume tasks; quick lookups | Lowest |
| High (default) | Most everyday tasks — balanced | Moderate |
| Extra (xhigh) | Difficult tasks, long-running async work | High |
| Max | Hardest problems, quality over cost | Highest |
One practical note: effort level and prompt quality work together. A high-effort setting can't fully compensate for a vague prompt, and a great prompt on default effort often beats a mediocre prompt on max effort. The free Prompt Optimizer sharpens your prompt so you get the best result at whatever effort level you choose, and TresPrompt brings that optimization into your Claude sidebar. For the full picture of what's new in this release, see our Opus 4.8 overview.
📬 Want more like this?
One actionable AI insight per week. Plus a free prompt pack when you subscribe.
Subscribe free →Effort Control vs Prompt Quality: Which Matters More?
A common misconception is that cranking effort to max is a substitute for writing a good prompt. It isn't. Effort control adjusts how much the model thinks, but it can't compensate for instructions that are vague, ambiguous, or missing key context. If you ask a poorly-specified question at max effort, you'll get a thoroughly-reasoned answer to the wrong question. The model will think hard — about the wrong thing. Effort and prompt quality are complementary, not interchangeable: prompt quality determines whether the model understands what you want, while effort determines how thoroughly it pursues it.
In practice, the highest-leverage move is usually improving your prompt before touching the effort control. A clear, specific, well-structured prompt at default effort frequently beats a vague prompt at max effort — and costs far fewer tokens. Only after you've nailed the prompt does bumping the effort level pay off, by giving the model room to thoroughly work through a well-understood problem. Think of it as a sequence: first make sure the model knows exactly what you want (prompt quality), then decide how hard it should work on it (effort level).
Effort Control in Long-Running and Async Workflows
Effort control becomes especially valuable in long-running and asynchronous workflows, which is exactly where Anthropic recommends the "extra" setting. When you hand Claude a task that will run unattended — an agentic workflow, a complex multi-step analysis, a long coding task — you're not sitting there waiting for each token, so the speed penalty of higher effort doesn't hurt your experience. Meanwhile, the quality benefit is amplified because the task is complex enough that thorough reasoning meaningfully improves the outcome. Async work is the ideal case for higher effort: you get the quality gain without feeling the speed cost.
The inverse applies to interactive, real-time work. When you're in a back-and-forth conversation iterating quickly, lower or default effort keeps the experience snappy, and you can always bump effort up for the one hard question in the middle of an otherwise simple session. The skill is matching effort to the interaction pattern: high effort for unattended complex work, default for interactive work, lower for rapid simple iterations. Combined with choosing the right model tier, this gives you fine-grained control over the quality-speed-cost tradeoff for every task.
Frequently Asked Questions
What is effort control in Claude Opus 4.8?
Effort control is a new setting (next to the model selector on claude.ai and Cowork, and as xhigh/max in Claude Code) that lets you choose how much Claude thinks before responding. Higher effort means deeper reasoning and better answers but more tokens and time. Lower effort means faster responses that conserve your rate limits. It's available on all plans.
What's the difference between extra and max effort?
Both spend more tokens than the default for better results. "Extra" (xhigh in Claude Code) is recommended for difficult tasks and long-running asynchronous workflows — a strong step up without going to the maximum. "Max" spends the most tokens and is reserved for the hardest problems where you want maximum quality regardless of cost. For most hard tasks, extra is sufficient; max is for the genuinely difficult cases.
Does higher effort cost more?
Higher effort uses more tokens, which means higher cost per response and faster rate-limit consumption. However, Opus 4.8's default high effort spends similar tokens to Opus 4.7's default on coding tasks while delivering better results, so the default isn't more expensive than before. Anthropic raised Claude Code rate limits to accommodate higher effort levels.
Which effort level should I use by default?
Leave it on the default (high) for most work — it's tuned to be the best balance for the majority of tasks. Only bump it up for genuinely difficult problems or long-running work, and only drop it down for simple, high-volume tasks where you want speed and rate-limit conservation.
Is effort control available on all plans?
Yes — Anthropic made the effort control available on all plans for claude.ai and Cowork. In Claude Code, the equivalent settings are xhigh and max. This is one of the few Opus 4.8 launch features available across all tiers (unlike dynamic workflows, which is limited to Max, Team, and Enterprise).
Disclosure: Some links in this article are affiliate links. We only recommend tools we've personally tested and use regularly. See our full disclosure policy.