Security & Admin
API Keys Section
API keys control client access to the router's endpoints. Keys can be configured through multiple sources.
Authentication Mode
The mode setting controls whether API authentication is required for API endpoints:
| Mode | Behavior |
|---|---|
| permissive (default) | Allow requests without an API key. Requests with valid API keys are authenticated. |
| blocking | Only process requests that pass API key authentication. Unauthenticated requests receive 401. |
Target Endpoints (when mode is blocking):
- /v1/chat/completions
- /v1/completions
- /v1/responses
- /v1/images/generations
- /v1/images/edits
- /v1/images/variations
- /v1/models
Note: Admin, Files, and Metrics endpoints have separate authentication mechanisms and are not affected by this setting.
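As a rough illustration, the decision for a request to one of the endpoints listed above can be sketched in Python. The function name `auth_decision` and its return labels are hypothetical and are not router identifiers. Treating an invalid key in permissive mode as anonymous follows the optional-auth behavior described under Per-Key Backend Access Control.

```python
from typing import Optional


def auth_decision(mode: str, key_is_valid: Optional[bool]) -> str:
    """Classify a request to one of the target endpoints.

    key_is_valid is None when no key was presented, False when the
    presented key failed validation, and True when it passed.
    """
    if key_is_valid:
        # A valid key is authenticated in both modes.
        return "authenticated"
    if mode == "blocking":
        # Missing or invalid key: the request is refused with 401.
        return "reject_401"
    # permissive: the request is processed without an identity.
    return "anonymous"
```

Under these assumptions, `auth_decision("blocking", None)` yields `"reject_401"`, while the same request in permissive mode is processed anonymously.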
Section Configuration Properties:
| Property | Type | Required | Default | Description |
|---|---|---|---|---|
| mode | string | No | permissive | Authentication mode: permissive or blocking |
| api_keys | array | No | [] | Inline API key definitions |
| api_keys_file | string | No | - | Path to external API keys file |
api_keys:
  # Authentication mode: "permissive" (default) or "blocking"
  mode: permissive
  # Inline API key definitions
  api_keys:
    - key: "${API_KEY_1}"                  # Environment variable substitution
      id: "key-production-1"               # Unique identifier
      user_id: "user-admin"                # Associated user
      organization_id: "org-main"          # Associated organization
      name: "Production Admin Key"         # Human-readable name
      scopes:                              # Permissions
        - read
        - write
        - files
        - admin
      rate_limit: 1000                     # Requests per minute (optional)
      enabled: true                        # Active status
      expires_at: "2025-12-31T23:59:59Z"   # Optional expiration (ISO 8601)
    - key: "${API_KEY_2}"
      id: "key-service-1"
      user_id: "service-bot"
      organization_id: "org-main"
      name: "Service Account"
      scopes: [read, write, files]
      rate_limit: 500
      enabled: true
  # External key file for better security
  api_keys_file: "/etc/continuum-router/api-keys.yaml"
Key Properties:
| Property | Type | Required | Description |
|---|---|---|---|
| key | string | Yes | The API key value (supports ${ENV_VAR} substitution) |
| id | string | Yes | Unique identifier for admin operations |
| user_id | string | Yes | User associated with this key |
| organization_id | string | Yes | Organization the user belongs to |
| name | string | No | Human-readable name |
| description | string | No | Notes about the key |
| scopes | array | Yes | Permissions: read, write, files, admin |
| rate_limit | integer | No | Maximum requests per minute |
| enabled | boolean | No | Active status (default: true) |
| expires_at | string | No | ISO 8601 expiration timestamp |
| allowed_backends | array | No | Per-key backend allow-list. Empty/absent ⇒ unrestricted. See below. |
External Key File Format:
# /etc/continuum-router/api-keys.yaml
keys:
  - key: "sk-prod-xxxxxxxxxxxxxxxxxxxxx"
    id: "key-external-1"
    user_id: "external-user"
    organization_id: "external-org"
    scopes: [read, write, files]
    enabled: true
Security Features:
- Key Masking: Full keys are never logged (displayed as sk-***last4)
- Expiration Enforcement: Expired keys are automatically rejected
- Hot Reload: Update keys without server restart
- Audit Logging: All key management operations are logged
- Constant-Time Validation: Prevents timing attacks
- Max Key Limit: 10,000 keys maximum to prevent DoS
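Two of these features are easy to illustrate. The sketch below is not the router's implementation: `mask_key` and `key_matches` are hypothetical names, and the masked form assumes that sk-***last4 means a fixed sk-*** prefix followed by the key's last four characters.

```python
import hmac


def mask_key(key: str) -> str:
    """Render a key for logs as 'sk-***' plus its last four characters.

    Assumption: keys are much longer than four characters, so the
    visible suffix reveals little about the full value.
    """
    return "sk-***" + key[-4:]


def key_matches(presented: str, stored: str) -> bool:
    """Compare two keys in constant time.

    hmac.compare_digest does not stop at the first differing byte, so the
    comparison time does not reveal how much of the key was correct.
    """
    return hmac.compare_digest(presented.encode("utf-8"), stored.encode("utf-8"))
```

A plain `==` comparison can return early on the first mismatch, which is the timing signal that constant-time validation is meant to remove.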
Admin API Endpoints (require admin authentication):
| Endpoint | Method | Description |
|---|---|---|
| /admin/api-keys | GET | List all keys (masked) |
| /admin/api-keys/:id | GET | Get key details |
| /admin/api-keys | POST | Create new key |
| /admin/api-keys/:id | PUT | Update key properties |
| /admin/api-keys/:id | DELETE | Delete key |
| /admin/api-keys/:id/rotate | POST | Generate new key value |
| /admin/api-keys/:id/enable | POST | Enable key |
| /admin/api-keys/:id/disable | POST | Disable key |
Per-Key Backend Access Control
allowed_backends scopes a client key to a subset of configured backends. This is client-key → backend access control, separate from backends[].api_key (the upstream provider credential the router uses to call OpenAI, Anthropic, and the rest).
When the list is non-empty, requests authenticated with that key may only route to the named backends. A request for a model that only a disallowed backend can serve is rejected with 403 Forbidden and a permission_error body. An empty or absent list leaves the key unrestricted, which is the default and matches the behavior of keys created before this field existed.
Semantics:
- Empty / absent ⇒ unrestricted (the key can route to any backend that serves the requested model).
- Non-empty ⇒ allow-list of backends[].name values. Matching is exact and case-sensitive, the same as backend-name resolution elsewhere.
- Unservable request ⇒ 403 Forbidden ({"error": {"type": "permission_error", ...}}). The 403 names the model so operators can tell it apart from a 404 ("model not found").
- Cross-provider fallback is filtered by the same list, so a restricted key cannot escape its scope through a fallback hop.
api_keys:
  mode: blocking
  api_keys:
    - key: "${PARTNER_KEY}"
      id: "key-partner-1"
      user_id: "partner-acme"
      organization_id: "org-acme"
      scopes: [read, write]
      allowed_backends: [openai, anthropic]   # may only route to these backends
    - key: "${INTERNAL_KEY}"
      id: "key-internal-1"
      user_id: "team-ml"
      organization_id: "org-internal"
      scopes: [read, write]
      allowed_backends: [vllm-local]          # limited to the self-hosted backend
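The semantics above can be sketched as a small routing check. This is an illustrative model only: the function `route`, the `backends_for_model` mapping, and the model names in the usage note are hypothetical, and the real router's candidate selection (including fallback ordering) is not described here.

```python
from typing import Dict, List, Optional, Tuple


def route(
    model: str,
    backends_for_model: Dict[str, List[str]],
    allowed_backends: Optional[List[str]],
) -> Tuple[int, List[str]]:
    """Return (HTTP status, permitted candidate backends) for a request.

    backends_for_model maps a model name to every backend that can serve
    it, fallback candidates included.
    """
    candidates = backends_for_model.get(model, [])
    if not candidates:
        return 404, []  # no backend serves this model at all
    if not allowed_backends:
        return 200, candidates  # empty or absent list: unrestricted
    allowed = set(allowed_backends)
    # Exact, case-sensitive comparison against backend names. Fallback
    # candidates pass through the same filter.
    permitted = [name for name in candidates if name in allowed]
    if not permitted:
        return 403, []  # model exists, but only on disallowed backends
    return 200, permitted
```

With `{"gpt-4o": ["openai"], "llama-3": ["vllm-local"]}` as the mapping, a key restricted to `["openai", "anthropic"]` gets 403 for `llama-3` and 404 for a model that no backend serves.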
Mode interaction: the restriction applies in both blocking and permissive mode, but in permissive mode it only takes effect for callers that present a valid key. In permissive mode an authenticated key still attaches its policy (via a best-effort optional-auth step that never rejects), while requests with no key or an invalid key pass through unrestricted, preserving permissive mode's "anonymous welcome" behavior.
Models listing: /v1/models, /v1/models/extended, and /anthropic/v1/models are filtered when a restricted key is authenticated, so a key only advertises models served by at least one allowed backend. GET /v1/models/{model} returns 404 Not Found for a model the key cannot reach. Unauthenticated callers and unrestricted keys see the full list.
Config validation: at startup and on hot-reload, a name in allowed_backends that does not match any backends[].name produces a warning. It is not a hard error, so renaming a backend does not break the router before the operator updates the affected keys.
Admin API: allowed_backends round-trips through the create, update, get, and list endpoints. On create, an absent field defaults to unrestricted. On update, null (absent) leaves the list unchanged, an empty array clears all restrictions, and a non-empty array replaces the list. Runtime keys persist the field through the persistence file.
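The three update cases can be modeled in a few lines. The helper below is a hypothetical sketch of the documented semantics, not the Admin API handler itself; `patch` stands for the parsed JSON body of an update request.

```python
from typing import List


def apply_allowed_backends_update(current: List[str], patch: dict) -> List[str]:
    """Return the key's allowed_backends after an update request."""
    new_value = patch.get("allowed_backends")  # None for both null and absent
    if new_value is None:
        return current  # leave the existing list unchanged
    # An empty array clears all restrictions; a non-empty array replaces
    # the list wholesale (it is not merged with the old one).
    return list(new_value)
```

Because an empty array and an omitted field mean different things, a client that wants to lift restrictions has to send `[]` explicitly.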
Anthropic API headers: /anthropic/v1/messages, /anthropic/v1/messages/count_tokens, and /anthropic/v1/models enforce the same per-key backend policy for callers authenticated with Authorization: Bearer <key>. If no Bearer AuthContext is present, a valid native Anthropic x-api-key that matches a router client key also supplies its allowed_backends policy. Invalid or absent keys in permissive mode continue to pass through without a per-key policy, matching the optional-auth behavior above.
Guardrails: PII Detection and Redaction
PII detection is one of several guardrail providers. For the full guardrails system (concepts, all five providers, streaming gating, admin controls, metrics, and the threshold-tuning workflow), see the Guardrails guide.
The pii guardrail provider detects personally identifiable information and high-value secrets in request prompts and model responses, then either redacts them in place or blocks the request. It addresses OWASP LLM02 (Sensitive Information Disclosure). Unlike the classify-only providers, its primary action is to transform content: matched spans are replaced with placeholders and the sanitized text flows on.
Built-in scanners run locally with no external dependency and cover emails, US Social Security numbers, credit-card / PAN numbers (validated with the Luhn checksum to suppress false positives), phone numbers, AWS access-key IDs, PEM private-key blocks, and bearer / sk- style API keys. An optional Microsoft Presidio-compatible analyzer can be configured for richer NER-based PII; its spans are merged with the built-in findings, and when it is unavailable the provider degrades according to on_error.
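For readers unfamiliar with the Luhn checksum mentioned above, here is a minimal Python version. It shows the standard algorithm only; how the built-in scanner extracts candidate digit runs before validating them is not covered, and the function name `luhn_valid` is illustrative.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum.

    Spaces and dashes are ignored so "4111 1111 1111 1111" can be checked
    as written.
    """
    digits = [int(ch) for ch in number if "0" <= ch <= "9"]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit and
    # subtract 9 when the doubled value exceeds 9.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0
```

A random 16-digit number passes this check only about one time in ten, which is why it suppresses many false positives on order numbers and similar identifiers.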
Each detected entity type maps to an action:
- mask (the default): replace the matched span with a placeholder, e.g. <REDACTED:EMAIL>, and continue with the sanitized text (a Transform verdict).
- block: block the request/response when this entity type is present.
- allow: ignore this entity type (neither mask nor block).
Raw detected values are never written to logs; only entity types and counts are recorded for audit.
guardrails:
  enabled: true
  mode: enforce
  providers:
    - name: pii-redaction
      type: pii
      enabled: true
      # Run at both stages (the default): sanitize the prompt before it reaches
      # the backend and the response before it reaches the client.
      stages: [input, output]
      options:
        # Action for entity types not listed in `actions`. Default: mask.
        default_action: mask
        # Per-entity-type action overrides.
        actions:
          email: mask
          phone: mask
          ssn: block
          credit_card: block
          aws_access_key: block
          private_key: block
          api_key: block
        # Placeholder template for masked spans. `{TYPE}` is replaced with the
        # upper-cased entity type. Default: "<REDACTED:{TYPE}>".
        placeholder_format: "<REDACTED:{TYPE}>"
        # Optional external Presidio-compatible analyzer (built-in scanners
        # always run; external spans are merged in).
        # external:
        #   endpoint: "http://presidio-analyzer:3000/analyze"
        #   language: en
        #   entities: ["PERSON", "LOCATION", "IBAN_CODE"]
      on_error: fail_open
Options:
| Property | Type | Default | Description |
|---|---|---|---|
| default_action | string | mask | Action for entity types not in actions: mask, block, or allow. |
| actions | map | {} | Per-entity-type action overrides, keyed by entity type (email, phone, ssn, credit_card, aws_access_key, private_key, api_key). |
| placeholder_format | string | <REDACTED:{TYPE}> | Template for masked spans; {TYPE} is replaced with the upper-cased entity type. |
| external.endpoint | string | - | Presidio-compatible analyzer URL. Falls back to the provider's top-level endpoint. |
| external.entities | array | [] | Restrict the external analyzer to these entity types (empty = analyzer default). |
| external.language | string | en | Language hint passed to the analyzer. |
The provider participates in both stages by default. When several entity types are present, a blocking entity wins over masking (most-severe-wins): the verdict is a block; otherwise, if anything was masked, a Transform carries the redacted text; otherwise the content is allowed.
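The most-severe-wins rule and placeholder substitution can be sketched as follows. This is a simplified model under stated assumptions: the two regular expressions are deliberately naive stand-ins for the built-in scanners, and `evaluate` and its return labels are hypothetical names.

```python
import re
from typing import Dict, Optional, Tuple

# Naive illustrative patterns; the router's real scanners are not shown
# in this document and are certainly more careful than these.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def evaluate(
    text: str,
    actions: Dict[str, str],
    default_action: str = "mask",
    placeholder_format: str = "<REDACTED:{TYPE}>",
) -> Tuple[str, Optional[str]]:
    """Return (verdict, text) where verdict is block, transform, or allow."""
    blocked = False
    masked = False
    for entity, pattern in PATTERNS.items():
        action = actions.get(entity, default_action)
        if action == "allow" or not pattern.search(text):
            continue
        if action == "block":
            blocked = True
            continue
        placeholder = placeholder_format.replace("{TYPE}", entity.upper())
        # A callable replacement avoids regex escape handling in the template.
        text = pattern.sub(lambda _match: placeholder, text)
        masked = True
    if blocked:
        return "block", None  # a blocking entity outranks any masking
    if masked:
        return "transform", text
    return "allow", text
```

For example, with `{"ssn": "block"}` a message containing both an email address and an SSN is blocked outright, even though the email alone would only have been masked.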
WebUI Section
The optional webui section controls the embedded browser-based administration interface. The WebUI is compiled into the binary and served as static assets protected by admin authentication.
webui:
  enabled: true         # Enable or disable the WebUI (default: true)
  path_prefix: /webui   # URL path prefix (default: /webui)
Configuration Properties:
| Property | Type | Default | Description |
|---|---|---|---|
| enabled | boolean | true | Enable or disable the WebUI |
| path_prefix | string | /webui | URL path prefix. Must start with / and must not contain "..". |
When webui is omitted, the defaults apply: WebUI is enabled at /webui. To disable the WebUI, set enabled to false.
See Embedded WebUI for a full guide to using the browser interface.
Admin Section
The admin section configures the Admin REST API, including authentication and statistics collection.
Authentication
admin:
  auth:
    method: bearer_token      # Auth method: none, bearer_token, basic, api_key
    token: "${ADMIN_TOKEN}"   # Token for bearer_token method
See Admin REST API Reference for all authentication options.
Statistics Collection
The admin.stats subsection controls request metrics collection and persistence. Stats collection is enabled by default.
admin:
  stats:
    enabled: true              # Enable/disable collection (default: true)
    retention_window: 24h      # Ring-buffer retention for windowed queries (default: 24h)
    token_tracking: true       # Parse response bodies for token usage (default: true)
    persistence:
      enabled: true            # Enable stats persistence across restarts (default: true)
      path: ./data/stats.json  # File path for the snapshot (default: ./data/stats.json)
      snapshot_interval: 5m    # How often to write periodic snapshots (default: 5m)
      max_age: 7d              # Discard snapshots older than this on startup (default: 7d)
Configuration Properties:
| Property | Type | Default | Description |
|---|---|---|---|
| enabled | boolean | true | Enable or disable statistics collection |
| retention_window | string | 24h | Ring-buffer retention window for windowed queries |
| token_tracking | boolean | true | Parse response bodies to extract token usage |
Persistence Properties:
| Property | Type | Default | Description |
|---|---|---|---|
| persistence.enabled | boolean | true | Enable stats persistence across restarts |
| persistence.path | string | ./data/stats.json | File path for the persistence snapshot |
| persistence.snapshot_interval | string | 5m | Interval between periodic snapshots |
| persistence.max_age | string | 7d | Maximum age for restoring snapshots on startup |
When persistence is enabled:
- On startup, the router restores counters and ring-buffer records from the snapshot file. Uptime always resets to zero.
- A background task writes snapshots atomically (temp file + rename) at the configured interval.
- On graceful shutdown (SIGTERM/SIGINT), a final snapshot is saved.
- Missing, corrupted, or stale snapshots are handled gracefully: the router starts with fresh counters and logs a warning.
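The "temp file + rename" pattern mentioned above is a common way to avoid leaving a half-written snapshot behind after a crash. A minimal Python sketch of the general technique follows; the function name and the JSON payload are illustrative assumptions, and the router's own snapshot code is not shown in this document.

```python
import json
import os
import tempfile


def write_snapshot_atomically(path: str, snapshot: dict) -> None:
    """Write `snapshot` as JSON so readers see the old or the new file, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    # The temp file lives in the target directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".stats-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(snapshot, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # atomically swap the new file into place
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

If the process dies before `os.replace`, the previous snapshot is still intact, which fits the recovery behavior described in the list above.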
Hot Reload: retention_window and token_tracking support immediate hot-reload. Persistence settings (path, snapshot_interval, max_age) require a restart.
Supported duration formats for retention_window, snapshot_interval, and max_age:
| Format | Example | Meaning |
|---|---|---|
| Xs | 30s | 30 seconds |
| Xm | 5m | 5 minutes |
| Xh | 1h | 1 hour |
| Xd | 7d | 7 days |
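A parser for the four formats in the table could look like the sketch below. It is an assumption that only a single integer followed by one unit letter is accepted; whether the router also accepts compound values such as 1h30m is not stated here. The name `parse_duration` is hypothetical.

```python
import re

# Seconds per unit for the suffixes listed in the table above.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}
_DURATION = re.compile(r"(\d+)([smhd])")


def parse_duration(value: str) -> int:
    """Convert a string such as '30s', '5m', '1h', or '7d' to seconds."""
    match = _DURATION.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]
```

Under this reading, the default retention_window of 24h is 86400 seconds and the default max_age of 7d is 604800 seconds.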
See Admin REST API Reference — Statistics APIs for the full endpoint documentation.
ACP (Agent Communication Protocol) Section
The acp section configures the Agent Communication Protocol subsystem. ACP enables IDE and tool integrations to communicate with the router via JSON-RPC 2.0 over stdio. ACP is disabled by default for backward compatibility.
To use ACP, run the router with --mode stdio.
acp:
  enabled: true
  transport:
    stdio:
      enabled: true
  agent:
    name: "Continuum Router"
    version: "1.0.0"
    description: "Local LLM inference agent"
  capabilities:
    load_session: true
    image: false
    audio: false
    embedded_context: false
    mcp: true
  default_model: "gpt-4o"
  system_prompt: "You are a helpful coding assistant."
  coding_agent_mode: true
  permissions:
    default_policy: ask_always
    auto_allow:
      - read
      - search
      - think
    always_ask:
      - edit
      - delete
      - execute
  sessions:
    max_concurrent: 10
    idle_timeout: "1h"
    storage: "memory"
  mcp:
    max_connections_per_session: 5
    allowed_servers: []
    server_spawn_timeout: "10s"
Top-Level Options
| Property | Type | Default | Description |
|---|---|---|---|
| enabled | bool | false | Enable/disable the ACP subsystem |
| default_model | string | none | Override model selection for ACP sessions |
| system_prompt | string | none | Inject a system prompt into all ACP requests |
| coding_agent_mode | bool | false | Enable coding agent system prompt |
Transport Options
| Property | Type | Default | Description |
|---|---|---|---|
| transport.stdio.enabled | bool | true | Enable stdio transport |
Permission Options
| Property | Type | Default | Description |
|---|---|---|---|
| permissions.default_policy | enum | ask_always | Default policy: ask_always, allow_read, allow_all |
| permissions.auto_allow | list | [read, search, think] | Tool kinds auto-allowed without asking |
| permissions.always_ask | list | [edit, delete, execute] | Tool kinds that always require permission |
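One plausible reading of how these three settings combine is sketched below. The precedence order (always_ask first, then auto_allow, then default_policy) and the interpretation of allow_read as "only the read kind proceeds without a prompt" are assumptions made for illustration; the table does not specify either. The function `needs_prompt` is hypothetical.

```python
from typing import Sequence


def needs_prompt(
    tool_kind: str,
    default_policy: str = "ask_always",
    auto_allow: Sequence[str] = ("read", "search", "think"),
    always_ask: Sequence[str] = ("edit", "delete", "execute"),
) -> bool:
    """Return True if the user should be asked before the tool runs."""
    if tool_kind in always_ask:
        return True  # assumed to take precedence over everything else
    if tool_kind in auto_allow:
        return False
    # Kinds named in neither list fall back to the default policy.
    if default_policy == "allow_all":
        return False
    if default_policy == "allow_read":
        return tool_kind != "read"  # assumed meaning of allow_read
    return True  # ask_always
```

With the defaults shown in the table, a read proceeds silently, an edit always prompts, and a kind listed in neither list prompts because default_policy is ask_always.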
Session Options
| Property | Type | Default | Description |
|---|---|---|---|
| sessions.max_concurrent | int | 10 | Maximum concurrent sessions |
| sessions.idle_timeout | string | "1h" | Idle timeout before session cleanup |
| sessions.storage | string | "memory" | Storage backend: memory or file |
| sessions.storage_path | string | none | Path for file-based storage |
MCP Bridge Options
| Property | Type | Default | Description |
|---|---|---|---|
| mcp.max_connections_per_session | int | 5 | Max MCP connections per session |
| mcp.allowed_servers | list | [] | Allowed server IDs (empty = all) |
| mcp.server_spawn_timeout | string | "10s" | Timeout for spawning MCP server processes |
See ACP Architecture for protocol details and ACP Usage Guide for practical examples.