Unified Reasoning Configuration

A new unified reasoning object gives you precise control over reasoning models. Specify an exact token budget with max_tokens or choose an effort level, all in one consistent API.

We've added a reasoning configuration object for reasoning-capable models. A single reasoning field in the request now controls reasoning behavior, either as an effort level (reasoning.effort) or as an exact token budget (reasoning.max_tokens).

Option 1: Reasoning Effort

Use reasoning.effort to control reasoning intensity:

curl -X POST https://api.llmgateway.io/v1/chat/completions \
  -H "Authorization: Bearer $LLM_GATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4-20250514",
    "messages": [{"role": "user", "content": "Explain quantum entanglement"}],
    "reasoning": {
      "effort": "high"
    }
  }'

Supported effort levels: none, minimal, low, medium, high, xhigh
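
If you build requests from a script, a small client-side check can catch a misspelled effort level before the request is sent. The sketch below is only an illustration: is_supported_effort is a hypothetical helper, not part of the gateway or any CLI, and its list simply mirrors the levels named above.

```shell
# Hypothetical client-side check; the accepted values mirror the
# effort levels listed in this announcement.
is_supported_effort() {
  case "$1" in
    none|minimal|low|medium|high|xhigh) return 0 ;;
    *) return 1 ;;
  esac
}

if is_supported_effort "high"; then
  echo "high: ok"
fi

if ! is_supported_effort "extreme"; then
  echo "extreme: not a documented effort level"
fi
```

The gateway remains the source of truth; this check only saves a round trip for obvious typos.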

Option 2: Exact Token Budget

Use reasoning.max_tokens for precise control over reasoning token allocation:

curl -X POST https://api.llmgateway.io/v1/chat/completions \
  -H "Authorization: Bearer $LLM_GATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4-20250514",
    "messages": [{"role": "user", "content": "Explain quantum entanglement"}],
    "reasoning": {
      "max_tokens": 8000
    }
  }'

When max_tokens is specified, it takes precedence over effort.
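
To illustrate the precedence rule, here is a sketch of a request body that sets both fields. It assumes the gateway accepts effort and max_tokens together in one request, which the rule above implies but does not state outright. build_payload is a hypothetical helper used only to keep the example readable.

```shell
# Hypothetical helper: prints a request body with both reasoning fields set.
# Per the rule above, max_tokens (8000) should win over effort ("low").
build_payload() {
  cat <<'EOF'
{
  "model": "anthropic/claude-sonnet-4-20250514",
  "messages": [{"role": "user", "content": "Explain quantum entanglement"}],
  "reasoning": {
    "effort": "low",
    "max_tokens": 8000
  }
}
EOF
}

# Send the request only when an API key is configured.
if [ -n "${LLM_GATEWAY_API_KEY:-}" ]; then
  build_payload | curl -X POST https://api.llmgateway.io/v1/chat/completions \
    -H "Authorization: Bearer $LLM_GATEWAY_API_KEY" \
    -H "Content-Type: application/json" \
    --data-binary @-
fi
```

In practice it is simpler to send only one of the two fields, so the intent of the request is unambiguous.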


Supported Models

The reasoning.max_tokens parameter works with:

  • Anthropic Claude — Claude 3.7 Sonnet, Claude Sonnet 4, Claude Opus 4, Claude Opus 4.5
  • Google Gemini — Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 3 Pro Preview

Auto-Routing Support

When you send reasoning.max_tokens with auto-routing (e.g., claude-sonnet-4 without a provider prefix) or with a root model, the gateway routes only to providers that support explicit reasoning token budgets.
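
As a sketch of what an auto-routed request might look like, the example below reuses the request shape from the earlier examples but drops the provider prefix from the model name. budget_payload is a hypothetical helper, and the 4000-token budget is an arbitrary illustrative value.

```shell
# Hypothetical helper: prints a request body for a given model and
# reasoning token budget.
budget_payload() {
  model="$1"
  budget="$2"
  printf '{"model": "%s", "messages": [{"role": "user", "content": "Explain quantum entanglement"}], "reasoning": {"max_tokens": %d}}\n' "$model" "$budget"
}

# No provider prefix: the gateway picks a provider that supports
# explicit reasoning token budgets.
if [ -n "${LLM_GATEWAY_API_KEY:-}" ]; then
  budget_payload "claude-sonnet-4" 4000 | curl -X POST https://api.llmgateway.io/v1/chat/completions \
    -H "Authorization: Bearer $LLM_GATEWAY_API_KEY" \
    -H "Content-Type: application/json" \
    --data-binary @-
fi
```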


Provider Constraints

  • Anthropic: Budget must be between 1,024 and 128,000 tokens (values outside this range are automatically clamped)
  • Google: No specific constraints
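
The Anthropic clamping described above can be pictured with a short client-side sketch. This is not the gateway's implementation, only an illustration of what clamping to the documented range means. clamp_anthropic_budget is a hypothetical function name.

```shell
# Illustrative only: clamps a requested budget to the documented
# Anthropic range of 1,024 to 128,000 tokens.
clamp_anthropic_budget() {
  budget="$1"
  min=1024
  max=128000
  if [ "$budget" -lt "$min" ]; then
    echo "$min"
  elif [ "$budget" -gt "$max" ]; then
    echo "$max"
  else
    echo "$budget"
  fi
}

clamp_anthropic_budget 500      # prints 1024
clamp_anthropic_budget 8000     # prints 8000
clamp_anthropic_budget 200000   # prints 128000
```

Because the gateway clamps automatically, a request with an out-of-range budget is adjusted rather than rejected. A check like this is mainly useful for predicting the budget that will actually apply.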

Read the docs for more details.