Integrating with Claude Code

Claude Code is Anthropic's official CLI. It speaks the Anthropic Messages API, so it can be pointed at any Anthropic-compatible endpoint, including Continuum Router, via the ANTHROPIC_BASE_URL environment variable.

This guide walks through making Claude Code fully functional against a self-hosted LLM (vLLM, Ollama, llama.cpp, LM Studio, MLxcel, or any other OpenAI-compatible backend) served through Continuum Router, including the two features Claude Code ships that assume Anthropic's native backend: WebFetch and WebSearch.

What Claude Code expects

Claude Code is built for Anthropic's hosted API. Out of the box it hard-codes a handful of behaviors that need router-side translation when the upstream is a self-hosted model:

  • It talks to /v1/messages (Anthropic Messages format), not /v1/chat/completions.
  • It issues intermediate calls to a small haiku-class model (claude-haiku-4-5-20251001 at the time of writing) for WebFetch's summarizer and WebSearch's query extractor.
  • Its WebSearch feature sends Anthropic's native server tool (web_search_20250305) and expects the response to include server_tool_use + web_search_tool_result content blocks.

Continuum Router handles all three transparently when its web_search and model_aliases sections are configured; see Configure the router below.
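
To make the first bullet concrete, here is a minimal Python sketch of the kind of translation involved when an Anthropic Messages request is forwarded to an OpenAI-compatible backend. It is an illustration under simplifying assumptions (text-only content, no tools, no streaming), not Continuum Router's actual implementation; the function names are hypothetical.

```python
from typing import Any


def _flatten(content: Any) -> str:
    """Collapse Anthropic content (a string or a list of blocks) into plain text."""
    if isinstance(content, str):
        return content
    # Keep only text blocks; images and tool blocks are out of scope for this sketch.
    return "".join(b.get("text", "") for b in content if b.get("type") == "text")


def messages_to_chat_completions(req: dict[str, Any]) -> dict[str, Any]:
    """Map an Anthropic Messages request body onto a chat-completions body."""
    out_messages = []
    # Anthropic carries the system prompt as a top-level field;
    # chat completions expects it as the first message.
    if req.get("system"):
        out_messages.append({"role": "system", "content": _flatten(req["system"])})
    for message in req["messages"]:
        out_messages.append(
            {"role": message["role"], "content": _flatten(message["content"])}
        )
    out: dict[str, Any] = {"model": req["model"], "messages": out_messages}
    if "max_tokens" in req:
        out["max_tokens"] = req["max_tokens"]
    return out
```

The response travels the opposite way: the backend's chat-completions reply has to be reshaped into Anthropic content blocks before Claude Code sees it.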

Prerequisites

Before starting:

  • Continuum Router installed (see the Installation page).
  • At least one self-hosted, OpenAI-compatible LLM backend reachable by the router.
  • An API key for one search provider if you want WebSearch to work: Serper, Brave Search, or Exa.
  • Claude Code installed on your workstation.

Configure the router

Create a config.yaml that (a) defines your self-hosted backend, (b) enables web_search with a provider of your choice, and (c) declares model_aliases so Claude Code's hard-coded model names route to your backend.

server:
  bind_address: 0.0.0.0:8000

backends:
  - name: my-self-hosted
    type: vllm                  # or ollama | llamacpp | lmstudio | mlxcel | generic
    url: http://localhost:11434 # adjust to your backend's address and port
    api_key: "${BACKEND_API_KEY}"
    models:
      - my-model

web_search:
  enabled: true
  provider: brave               # serper | brave | exa
  api_key: "${BRAVE_API_KEY}"

# Rewrite Claude Code's internal haiku / sonnet / opus calls so they
# reach your self-hosted model. Without this, WebFetch and WebSearch
# will fail with ModelNotFound.
model_aliases:
  haiku: my-model
  sonnet: my-model
  opus: my-model
  default: my-model
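
As a mental model for what model_aliases does, the Python sketch below resolves a requested model name against an alias table. The matching rule (a case-insensitive substring match on the family name, then a fall back to default) is an assumption made for illustration; the router's real matching rules are described in the Configuration Guide. The function name resolve_model is hypothetical.

```python
from typing import Optional


def resolve_model(
    requested: str, aliases: dict[str, str], served: set[str]
) -> Optional[str]:
    """Return the backend model a request should be routed to, or None."""
    # A name the backend already serves needs no rewriting.
    if requested in served:
        return requested
    # Assumed rule: an alias key such as "haiku" matches any requested
    # name that contains it, e.g. "claude-haiku-4-5-20251001".
    lowered = requested.lower()
    for family, target in aliases.items():
        if family != "default" and family in lowered:
            return target
    # Nothing matched: use the catch-all entry if one is configured.
    return aliases.get("default")
```

With an empty alias table the function returns None for Claude model names, which corresponds to the ModelNotFound failure mentioned in the config comment.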

Start the router:

continuum-router --config config.yaml

Verify it is reachable:

curl http://localhost:8000/health
# {"status":"ok"}
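
If you start the router from a script, the same check can be done programmatically. The sketch below uses only the Python standard library and assumes the /health endpoint and the {"status":"ok"} body shown above; the function names and the timeout value are illustrative.

```python
import json
import urllib.request


def is_healthy(body: bytes) -> bool:
    """Return True when a /health response body reports status "ok"."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Not JSON (or not valid UTF-8): treat as unhealthy.
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def check_router(base: str = "http://localhost:8000", timeout: float = 3.0) -> bool:
    """Fetch <base>/health and report whether the router answered "ok"."""
    try:
        with urllib.request.urlopen(f"{base}/health", timeout=timeout) as resp:
            return resp.status == 200 and is_healthy(resp.read())
    except OSError:
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False
```

Calling check_router() in a short retry loop is one way to wait for the router to come up before launching Claude Code.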

Point Claude Code at the router

Claude Code reads two environment variables to choose an upstream Anthropic-compatible API:

  • ANTHROPIC_BASE_URL — the base URL the Anthropic SDK will hit. Point it at the router's Anthropic endpoint: http://localhost:8000/anthropic. (The /v1/messages suffix is appended by the SDK itself.)
  • ANTHROPIC_API_KEY — forwarded to the router as the x-api-key header. If your router's api_keys section is in permissive mode, any non-empty value works; otherwise use a key your router's api_keys store accepts.

Launch Claude Code with the variables set:

export ANTHROPIC_BASE_URL="http://localhost:8000/anthropic"
export ANTHROPIC_API_KEY="dev-key-anything-works-if-permissive"
claude

To pin the model Claude Code uses for the main conversation turn, set ANTHROPIC_MODEL or pass --model at launch. If you set neither, Claude Code requests a Claude Sonnet model name, which model_aliases rewrites to your backend model anyway:

export ANTHROPIC_MODEL="my-model"
# or start the session explicitly: `claude --model my-model`
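
To see how these variables turn into an HTTP request, the Python sketch below builds (without sending) the kind of call a client makes against the router. It is a simplified stand-in for what the Anthropic SDK does, not its actual code. The anthropic-version header is the one Anthropic's hosted API uses; whether the router requires it is an assumption, as is the fallback to "my-model" when ANTHROPIC_MODEL is unset.

```python
import json
import urllib.request
from typing import Mapping


def build_messages_request(
    env: Mapping[str, str], prompt: str
) -> urllib.request.Request:
    """Build (but do not send) a POST to <ANTHROPIC_BASE_URL>/v1/messages."""
    base = env["ANTHROPIC_BASE_URL"].rstrip("/")
    body = {
        # Sketch-only fallback: "my-model" is the placeholder from config.yaml.
        "model": env.get("ANTHROPIC_MODEL", "my-model"),
        "max_tokens": 64,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base}/v1/messages",  # the suffix a client appends to the base URL
        data=json.dumps(body).encode("utf-8"),
        headers={
            "content-type": "application/json",
            "x-api-key": env["ANTHROPIC_API_KEY"],
            # Assumed to be accepted (or ignored) by the router.
            "anthropic-version": "2023-06-01",
        },
        method="POST",
    )
```

Pass os.environ as env to use the exported values, and hand the result to urllib.request.urlopen to actually send it.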

Verify

Inside the Claude Code session, three quick checks cover every code path the router adds:

Main conversation turn

> Hi, what backend am I talking to?

The router's log should show one POST /anthropic/v1/messages forwarded to your backend, and the response should come from your self-hosted model.

WebFetch

> Summarize https://www.rust-lang.org/ in one sentence.

Claude Code fires off an intermediate call with model: claude-haiku-4-5-20251001. The router logs Rewrote request.model via model_aliases and forwards the call to your backend, which summarizes the fetched page.

WebSearch

> Use web search to find the latest Rust release.

Claude Code sends an intermediate request that declares Anthropic's server tool (web_search_20250305) and sets tool_choice to force its use. The router:

  1. issues one backend turn to extract the query,
  2. calls the configured search provider once,
  3. returns an Anthropic-shaped response containing server_tool_use and web_search_tool_result content blocks.

The log shows the emulation path kicking in:

INFO  Anthropic web_search server-tool emulation (streaming): extracting query

Claude Code's WebSearch panel shows the ranked results.
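
The third step of the flow above can be pictured with the Python sketch below, which assembles the two content blocks from a query and a list of search hits. The field layout follows Anthropic's published web search tool format as far as this sketch goes; real results carry additional fields that are omitted here, and the exact blocks the router emits may differ. The function name and the example tool-use id are hypothetical.

```python
from typing import Any


def build_search_blocks(
    query: str,
    results: list[dict[str, str]],
    tool_use_id: str = "srvtoolu_example",
) -> list[dict[str, Any]]:
    """Return server_tool_use + web_search_tool_result content blocks."""
    return [
        {
            # Records the search the "server" ran on the model's behalf.
            "type": "server_tool_use",
            "id": tool_use_id,
            "name": "web_search",
            "input": {"query": query},
        },
        {
            # Carries the provider's hits, linked back by tool_use_id.
            "type": "web_search_tool_result",
            "tool_use_id": tool_use_id,
            "content": [
                {"type": "web_search_result", "title": r["title"], "url": r["url"]}
                for r in results
            ],
        },
    ]
```

The two blocks share an id so the client can pair each result set with the search that produced it.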

Troubleshooting

HTTP 422: missing field input_schema

You are running a router version older than the web_search_20250305 support. Update the router; the fix relaxes AnthropicTool to accept both custom and server tool shapes.
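
The 422 comes from the difference between the two tool shapes: a custom tool declares an input_schema, while a server tool such as web_search_20250305 is identified by its type and has none. The Python sketch below shows a validator that accepts both; it is an illustration of the idea, not the router's AnthropicTool definition, and the function names are hypothetical.

```python
def is_server_tool(tool: dict) -> bool:
    """Server tools carry a versioned `type` and no `input_schema`."""
    return "input_schema" not in tool and tool.get("type", "custom") != "custom"


def tool_errors(tool: dict) -> list[str]:
    """Return validation errors for one entry of a request's `tools` array."""
    # Both shapes need a name; only custom tools need an input_schema.
    required = ("name",) if is_server_tool(tool) else ("name", "input_schema")
    return [f"missing field {key}" for key in required if key not in tool]
```

A validator that demands input_schema on every tool rejects the server-tool entry with exactly the error in this heading.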

Model claude-haiku-4-5-20251001 not found

model_aliases is not configured — Claude Code's intermediate haiku calls have nowhere to land. Add at least a default: entry mapping to your backend's model.

WebSearch returns zero results

Check the router log for web_search loop round complete:

  • If the loop is running but Brave / Serper / Exa returns HTTP 429, you are hitting the provider's rate limit. The provider's free tier may be capped at a low QPS; upgrade the plan or switch providers.
  • If you see Rewrote request.model followed by the emulation log line above but no results reach Claude Code, verify the provider's API key is valid.

Claude Code fails to connect

Check:

  • ANTHROPIC_BASE_URL ends in /anthropic (no trailing /v1/messages).
  • The router is reachable: curl $ANTHROPIC_BASE_URL/v1/messages -H 'content-type: application/json' -d '{"model":"my-model", "max_tokens":10,"messages":[{"role":"user","content":"hi"}]}'
  • If your router enforces API-key auth, ANTHROPIC_API_KEY is a key from its api_keys store.
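
The first check in the list above is easy to automate. The Python sketch below inspects an ANTHROPIC_BASE_URL value and reports the mistakes described there; the function name and the wording of the messages are illustrative, and the /anthropic path is the one used throughout this guide.

```python
from urllib.parse import urlsplit


def base_url_problems(base_url: str) -> list[str]:
    """Return human-readable problems with an ANTHROPIC_BASE_URL value."""
    problems = []
    parts = urlsplit(base_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        problems.append("not an absolute http(s) URL")
    path = parts.path.rstrip("/")
    if path.endswith("/v1/messages"):
        # The client appends this suffix itself, so it would be doubled.
        problems.append("remove the trailing /v1/messages")
    elif not path.endswith("/anthropic"):
        problems.append("path should end in /anthropic")
    return problems
```

An empty list means the value has the expected shape; it does not prove the router is reachable, which the curl command above still has to confirm.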

What still does not work

  • Artifacts / attachments referencing Anthropic-hosted files by file_id — Claude Code's file-upload flow round-trips through the Anthropic Files API, which the router does not proxy. Paste content inline or upload via the router's own Files API (/v1/files).
  • Anthropic-side billing / rate-limit headers — Claude Code occasionally surfaces Anthropic-specific quota UI based on response headers the router does not emit.

See also

  • Web Search — full reference for the web_search section, providers, injection policies, and the Anthropic server-tool emulation path.
  • Configuration Guide — every config knob.
  • API Reference — the router's endpoints.