Integrating with Claude Code¶
Claude Code is Anthropic's official CLI. It speaks the Anthropic Messages API, so it
can be pointed at any Anthropic-compatible endpoint, including
Continuum Router, via the ANTHROPIC_BASE_URL environment variable.
This guide walks through making Claude Code fully functional against a self-hosted LLM (vLLM, Ollama, llama.cpp, LM Studio, MLxcel, or any other OpenAI-compatible backend) served through Continuum Router, including the two features Claude Code ships that assume Anthropic's native backend: WebFetch and WebSearch.
What Claude Code expects¶
Claude Code is built for Anthropic's hosted API. Out of the box it hard-codes a handful of behaviors that need router-side translation when the upstream is a self-hosted model:
- It talks to /v1/messages (Anthropic Messages format), not /v1/chat/completions.
- It issues intermediate calls to a small haiku-class model (claude-haiku-4-5-20251001 at the time of writing) for WebFetch's summarizer and WebSearch's query extractor.
- Its WebSearch feature sends Anthropic's native server tool (web_search_20250305) and expects the response to include server_tool_use + web_search_tool_result content blocks.
Continuum Router handles all three transparently when its
web_search and model_aliases sections are configured; see
Configure the router below.
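To make the first point concrete, the following is a minimal sketch of the kind of request translation involved when an Anthropic Messages call is forwarded to an OpenAI-compatible backend. It is not the router's actual implementation: the function name anthropic_to_openai is hypothetical, and the sketch assumes text-only content and a plain-string system prompt, ignoring tools and streaming.

```python
from typing import Any


def anthropic_to_openai(request: dict[str, Any]) -> dict[str, Any]:
    """Convert a minimal Anthropic Messages request into an OpenAI chat request.

    Illustrative only: handles text content and a string system prompt.
    """
    messages: list[dict[str, str]] = []

    # Anthropic carries the system prompt as a top-level field;
    # OpenAI-style chat APIs expect it as the first message.
    system = request.get("system")
    if isinstance(system, str) and system:
        messages.append({"role": "system", "content": system})

    for message in request["messages"]:
        content = message["content"]
        if isinstance(content, list):
            # Content may be a list of typed blocks; keep only the text ones.
            content = "".join(
                block.get("text", "")
                for block in content
                if block.get("type") == "text"
            )
        messages.append({"role": message["role"], "content": content})

    return {
        "model": request["model"],
        "max_tokens": request["max_tokens"],
        "messages": messages,
    }
```

The main structural difference the sketch highlights is where the system prompt lives; everything else in a text-only request maps across almost one to one.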
Prerequisites¶
Before starting:
- Continuum Router installed (see the Installation page).
- At least one self-hosted, OpenAI-compatible LLM backend reachable by the router.
- An API key for one search provider if you want WebSearch to work: Serper, Brave Search, or Exa.
- Claude Code installed on your workstation.
Configure the router¶
Create a config.yaml that (a) defines your self-hosted backend, (b)
enables web_search with a provider of your choice, and (c) declares
model_aliases so Claude Code's hard-coded model names route to your
backend.
server:
  bind_address: 0.0.0.0:8000

backends:
  - name: my-self-hosted
    type: vllm  # or ollama | llamacpp | lmstudio | mlxcel | generic
    url: http://localhost:11434
    api_key: "${BACKEND_API_KEY}"
    models:
      - my-model

web_search:
  enabled: true
  provider: brave  # serper | brave | exa
  api_key: "${BRAVE_API_KEY}"

# Rewrite Claude Code's internal haiku / sonnet / opus calls so they
# reach your self-hosted model. Without this, WebFetch and WebSearch
# will fail with ModelNotFound.
model_aliases:
  haiku: my-model
  sonnet: my-model
  opus: my-model
  default: my-model
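As a rough mental model of what the alias block does, the sketch below resolves a requested model name against an alias table. The matching rule shown (a case-insensitive substring match on the family name, then a fallback to default) is an assumption made for illustration; the router's real matching semantics may differ. The function name resolve_model and the my-small-model target are hypothetical.

```python
def resolve_model(requested: str, aliases: dict[str, str], served: set[str]) -> str:
    """Map a requested model name to a model the backend actually serves.

    ASSUMPTION: aliases match by family-name substring; "default" is a catch-all.
    """
    # Names the backend already serves pass through untouched.
    if requested in served:
        return requested

    # Try family aliases such as "haiku", "sonnet", "opus".
    lowered = requested.lower()
    for family, target in aliases.items():
        if family != "default" and family in lowered:
            return target

    # Fall back to the catch-all entry when one is configured.
    if "default" in aliases:
        return aliases["default"]

    raise LookupError(f"Model {requested} not found")


aliases = {"haiku": "my-small-model", "default": "my-model"}
served = {"my-model", "my-small-model"}
print(resolve_model("claude-haiku-4-5-20251001", aliases, served))
```

Under these assumptions the haiku-class name that Claude Code sends for WebFetch and WebSearch lands on whichever model the haiku entry names, and any other unknown name falls through to default.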
Start the router:
Verify it is reachable:
Point Claude Code at the router¶
Claude Code reads two environment variables to choose an upstream Anthropic-compatible API:
- ANTHROPIC_BASE_URL: the base URL the Anthropic SDK will hit. Point it at the router's Anthropic endpoint: http://localhost:8000/anthropic. (The /v1/messages suffix is appended by the SDK itself.)
- ANTHROPIC_API_KEY: forwarded to the router as the x-api-key header. If your router's api_keys section is in permissive mode, any non-empty value works; otherwise use a key your router's api_keys store accepts.
Launch Claude Code with the variables set:
export ANTHROPIC_BASE_URL="http://localhost:8000/anthropic"
export ANTHROPIC_API_KEY="dev-key-anything-works-if-permissive"
claude
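To see what these two variables translate to on the wire, here is a minimal standard-library sketch that builds (but does not send) the kind of request an Anthropic client would issue. The helper name build_messages_request is hypothetical, and the anthropic-version value is the one Anthropic's hosted API expects; whether the router requires that header is an assumption.

```python
import json
import os
import urllib.request


def build_messages_request(prompt: str, model: str = "my-model") -> urllib.request.Request:
    """Build a POST to <ANTHROPIC_BASE_URL>/v1/messages without sending it."""
    base_url = os.environ.get("ANTHROPIC_BASE_URL", "http://localhost:8000/anthropic")
    api_key = os.environ.get("ANTHROPIC_API_KEY", "dev-key-anything-works-if-permissive")

    body = {
        "model": model,
        "max_tokens": 64,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        # The client appends /v1/messages to the base URL.
        url=base_url.rstrip("/") + "/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "content-type": "application/json",
            # The API key travels in the x-api-key header.
            "x-api-key": api_key,
            # ASSUMPTION: value used by Anthropic's hosted API.
            "anthropic-version": "2023-06-01",
        },
        method="POST",
    )
```

With the router running, passing the returned object to urllib.request.urlopen sends the request; inspecting its full_url first is a quick way to confirm the base URL is composed as intended.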
To pin the default model Claude Code uses for the main conversation
turn (otherwise it defaults to a Claude sonnet model name which
model_aliases will rewrite anyway), set one of:
Verify¶
Inside the Claude Code session, three quick checks cover every code path the router adds:
Main conversation turn¶
The router's log should show one
POST /anthropic/v1/messages forwarded to your backend, and the
response should come from your self-hosted model.
WebFetch¶
Claude Code fires off an intermediate call with
model: claude-haiku-4-5-20251001. The router logs
"Rewrote request.model via model_aliases" and forwards the call to
your backend, which summarizes the fetched page.
WebSearch¶
Claude Code sends an intermediate request with an Anthropic server
tool (web_search_20250305) forcing tool_choice. The router:
- issues one backend turn to extract the query,
- calls the configured search provider once,
- returns an Anthropic-shaped response containing server_tool_use and web_search_tool_result content blocks.
The log shows the emulation path kicking in:
Claude Code's WebSearch panel shows the ranked results.
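The three router steps described in this section can be sketched as follows. This is an illustration under stated assumptions, not the router's code: emulate_web_search, the two callables, and the srvtoolu_example id are hypothetical, and the block fields follow Anthropic's published web search tool format as far as this sketch needs (real results carry additional fields).

```python
from typing import Any, Callable


def emulate_web_search(
    request: dict[str, Any],
    extract_query: Callable[[dict[str, Any]], str],
    search: Callable[[str], list[dict[str, str]]],
) -> list[dict[str, Any]]:
    """Return Anthropic-style content blocks for a web_search server-tool request."""
    tools = request.get("tools", [])
    if not any(tool.get("type") == "web_search_20250305" for tool in tools):
        raise ValueError("request does not carry the web_search_20250305 server tool")

    query = extract_query(request)  # step 1: one backend turn extracts the query
    results = search(query)         # step 2: one call to the search provider

    tool_use_id = "srvtoolu_example"  # hypothetical id; a real one would be unique
    # Step 3: pair a server_tool_use block with its web_search_tool_result.
    return [
        {
            "type": "server_tool_use",
            "id": tool_use_id,
            "name": "web_search",
            "input": {"query": query},
        },
        {
            "type": "web_search_tool_result",
            "tool_use_id": tool_use_id,
            "content": [
                {"type": "web_search_result", "title": r["title"], "url": r["url"]}
                for r in results
            ],
        },
    ]
```

Passing the backend call and the provider call in as functions keeps the sketch independent of any particular model server or search API.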
Troubleshooting¶
HTTP 422: missing field input_schema¶
You are running a router version that predates web_search_20250305
support. Update the router; the fix
relaxes AnthropicTool to accept both custom and server tool shapes.
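The difference between the two tool shapes can be sketched like this. A custom tool declares an input_schema, while the web search server tool is identified by its versioned type and carries no schema, which is why a strict parser reports the missing field. The classify_tool helper is hypothetical and only illustrates the distinction; it is not the router's AnthropicTool type.

```python
from typing import Any


def classify_tool(tool: dict[str, Any]) -> str:
    """Classify an Anthropic tool definition as "custom" or "server"."""
    # Custom (client-executed) tools describe their arguments with a JSON schema.
    if "input_schema" in tool:
        return "custom"
    # The web search server tool has a versioned type and no input_schema.
    if tool.get("type") == "web_search_20250305":
        return "server"
    # Anything else resembles the error a strict parser would raise.
    raise ValueError("missing field input_schema")


print(classify_tool({"type": "web_search_20250305", "name": "web_search"}))
```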
Model claude-haiku-4-5-20251001 not found¶
model_aliases is not configured — Claude Code's intermediate haiku
calls have nowhere to land. Add at least a default: entry mapping
to your backend's model.
WebSearch returns zero results¶
Check the router log for "web_search loop round complete":
- If the loop is running but Brave / Serper / Exa returns HTTP 429, you are hitting the provider's rate limit. The provider's free tier may be capped at a low QPS; upgrade the plan or switch providers.
- If you see "Rewrote request.model" followed by the emulation log line above but no results reach Claude Code, verify the provider's API key is valid.
Claude Code fails to connect¶
Check:
- ANTHROPIC_BASE_URL ends in /anthropic (no trailing /v1/messages).
- The router is reachable:
  curl $ANTHROPIC_BASE_URL/v1/messages -H 'content-type: application/json' -d '{"model":"my-model", "max_tokens":10,"messages":[{"role":"user","content":"hi"}]}'
- If your router enforces API-key auth, ANTHROPIC_API_KEY is a key from its api_keys store.
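The first check is easy to automate. The sketch below validates a base URL against the rule stated above; check_base_url is a hypothetical helper, and its tolerance of a trailing slash is an assumption rather than documented client behavior.

```python
from urllib.parse import urlparse


def check_base_url(base_url: str) -> list[str]:
    """Return a list of problems found in an ANTHROPIC_BASE_URL value."""
    problems: list[str] = []
    parsed = urlparse(base_url)

    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("not an absolute http(s) URL")

    # ASSUMPTION: a single trailing slash is harmless, so ignore it.
    path = parsed.path.rstrip("/")
    if path.endswith("/v1/messages"):
        problems.append("remove the trailing /v1/messages; the SDK appends it")
    elif not path.endswith("/anthropic"):
        problems.append("path should end in /anthropic")

    return problems


print(check_base_url("http://localhost:8000/anthropic/v1/messages"))
```

An empty list means the value passes this check; it says nothing about whether the router is actually listening, which the curl command covers.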
What still does not work¶
- Artifacts / attachments referencing Anthropic-hosted files by file_id: Claude Code's file-upload flow round-trips through the Anthropic Files API, which the router does not proxy. Paste content inline or upload via the router's own Files API (/v1/files).
- Anthropic-side billing / rate-limit headers: Claude Code occasionally surfaces Anthropic-specific quota UI based on response headers the router does not emit.
See also¶
- Web Search: full reference for the web_search section, providers, injection policies, and the Anthropic server-tool emulation path.
- Configuration Guide: every config knob.
- API Reference: the router's endpoints.