Security & Admin

API Keys Section

API keys control client access to the router's endpoints. Keys can be defined inline in the configuration, loaded from an external key file, or managed at runtime through the Admin API.

Authentication Mode

The mode setting controls whether API key authentication is required on the target endpoints listed below:

Mode	Behavior
`permissive` (default)	Allows requests without an API key. Requests that present a valid API key are authenticated.
`blocking`	Only process requests that pass API key authentication. Unauthenticated requests receive 401.

Target Endpoints (when mode is blocking):

- /v1/chat/completions
- /v1/completions
- /v1/responses
- /v1/images/generations
- /v1/images/edits
- /v1/images/variations
- /v1/models

Note: Admin, Files, and Metrics endpoints have separate authentication mechanisms and are not affected by this setting.
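
The two modes can be summarized as a small decision function. This is a minimal sketch of the behavior in the table above, not the router's implementation; `auth_outcome` and its parameters are hypothetical names, and the plain set lookup stands in for the router's real key validation.

```python
from typing import Optional, Set, Tuple


def auth_outcome(mode: str, presented_key: Optional[str],
                 valid_keys: Set[str]) -> Tuple[bool, bool]:
    """Return (proceed, authenticated) for a request to a target endpoint.

    proceed=False corresponds to a 401 response in blocking mode.
    """
    if mode not in ("permissive", "blocking"):
        raise ValueError(f"unknown mode: {mode!r}")
    # Illustrative lookup only; the real check is described as constant-time.
    authenticated = presented_key is not None and presented_key in valid_keys
    if mode == "blocking" and not authenticated:
        return False, False
    # Permissive mode lets anonymous and invalid-key requests through,
    # but only valid keys count as authenticated.
    return True, authenticated
```

For example, `auth_outcome("permissive", None, {"k1"})` lets the request proceed unauthenticated, while the same call with `"blocking"` does not.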

Section Configuration Properties:

Property	Type	Required	Default	Description
`mode`	string	No	`permissive`	Authentication mode: `permissive` or `blocking`
`api_keys`	array	No	`[]`	Inline API key definitions
`api_keys_file`	string	No	-	Path to external API keys file

api_keys:
  # Authentication mode: "permissive" (default) or "blocking"
  mode: permissive

  # Inline API key definitions
  api_keys:
    - key: "${API_KEY_1}"              # Environment variable substitution
      id: "key-production-1"           # Unique identifier
      user_id: "user-admin"            # Associated user
      organization_id: "org-main"      # Associated organization
      name: "Production Admin Key"     # Human-readable name
      scopes:                          # Permissions
        - read
        - write
        - files
        - admin
      rate_limit: 1000                 # Requests per minute (optional)
      enabled: true                    # Active status
      expires_at: "2025-12-31T23:59:59Z"  # Optional expiration (ISO 8601)

    - key: "${API_KEY_2}"
      id: "key-service-1"
      user_id: "service-bot"
      organization_id: "org-main"
      name: "Service Account"
      scopes: [read, write, files]
      rate_limit: 500
      enabled: true

  # External key file for better security
  api_keys_file: "/etc/continuum-router/api-keys.yaml"

Key Properties:

Property	Type	Required	Description
`key`	string	Yes	The API key value (supports `${ENV_VAR}` substitution)
`id`	string	Yes	Unique identifier for admin operations
`user_id`	string	Yes	User associated with this key
`organization_id`	string	Yes	Organization the user belongs to
`name`	string	No	Human-readable name
`description`	string	No	Notes about the key
`scopes`	array	Yes	Permissions: `read`, `write`, `files`, `admin`
`rate_limit`	integer	No	Maximum requests per minute
`enabled`	boolean	No	Active status (default: true)
`expires_at`	string	No	ISO 8601 expiration timestamp
`allowed_backends`	array	No	Per-key backend allow-list. Empty/absent ⇒ unrestricted. See below.

External Key File Format:

# /etc/continuum-router/api-keys.yaml
keys:
  - key: "sk-prod-xxxxxxxxxxxxxxxxxxxxx"
    id: "key-external-1"
    user_id: "external-user"
    organization_id: "external-org"
    scopes: [read, write, files]
    enabled: true

Security Features:

Key Masking: Full keys are never logged (displayed as sk-***last4)
Expiration Enforcement: Expired keys are automatically rejected
Hot Reload: Update keys without server restart
Audit Logging: All key management operations are logged
Constant-Time Validation: Prevents timing attacks
Max Key Limit: 10,000 keys maximum to prevent DoS
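
Two of these features, key masking and constant-time validation, are easy to illustrate. The sketch below uses Python's standard `hmac.compare_digest`. The function names `mask_key` and `keys_match` are hypothetical, and how the router handles very short keys is an assumption.

```python
import hmac


def mask_key(key: str) -> str:
    """Render a key for logs in the sk-***last4 form described above."""
    # Assumption: keys of four characters or fewer are fully hidden so the
    # "last four" never reveals the whole value.
    suffix = key[-4:] if len(key) > 4 else ""
    return "sk-***" + suffix


def keys_match(presented: str, stored: str) -> bool:
    """Compare two keys without leaking the mismatch position via timing."""
    return hmac.compare_digest(presented.encode("utf-8"),
                               stored.encode("utf-8"))
```

A plain `==` comparison can return as soon as the first differing byte is found, which is the timing signal a constant-time comparison avoids.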

Admin API Endpoints (require admin authentication):

Endpoint	Method	Description
`/admin/api-keys`	GET	List all keys (masked)
`/admin/api-keys/:id`	GET	Get key details
`/admin/api-keys`	POST	Create new key
`/admin/api-keys/:id`	PUT	Update key properties
`/admin/api-keys/:id`	DELETE	Delete key
`/admin/api-keys/:id/rotate`	POST	Generate new key value
`/admin/api-keys/:id/enable`	POST	Enable key
`/admin/api-keys/:id/disable`	POST	Disable key
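
As a rough illustration, the body-less POST endpoints (rotate, enable, disable) could be called as shown below. The base URL, the `Authorization: Bearer` header format, and the helper name `admin_key_request` are assumptions for this sketch; check the Admin REST API Reference for the authoritative request format.

```python
import urllib.parse
import urllib.request

_ACTIONS = {"rotate", "enable", "disable"}


def admin_key_request(base_url: str, token: str, key_id: str,
                      action: str) -> urllib.request.Request:
    """Build (but do not send) a POST to /admin/api-keys/:id/<action>."""
    if action not in _ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    # Percent-encode the id so it is always a single path segment.
    safe_id = urllib.parse.quote(key_id, safe="")
    url = f"{base_url.rstrip('/')}/admin/api-keys/{safe_id}/{action}"
    return urllib.request.Request(
        url,
        method="POST",
        # Assumed header format for the bearer_token admin auth method.
        headers={"Authorization": f"Bearer {token}"},
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request; the sketch stops short of that so it can be inspected without a running router.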

Per-Key Backend Access Control

allowed_backends scopes a client key to a subset of configured backends. This is client-key → backend access control, separate from backends[].api_key (the upstream provider credential the router uses to call OpenAI, Anthropic, and the rest).

When the list is non-empty, requests authenticated with that key may only route to the named backends. A request for a model that only a disallowed backend can serve is rejected with 403 Forbidden and a permission_error body. An empty or absent list leaves the key unrestricted, which is the default and matches the behavior of keys created before this field existed.

Semantics:

Empty / absent ⇒ unrestricted (the key can route to any backend that serves the requested model).
Non-empty ⇒ allow-list of backends[].name values. Matching is exact and case-sensitive, the same as backend-name resolution elsewhere.
Unservable request ⇒ 403 Forbidden ({"error": {"type": "permission_error", ...}}). The 403 names the model so operators can tell it apart from a 404 ("model not found").
Cross-provider fallback is filtered by the same list, so a restricted key cannot escape its scope through a fallback hop.

api_keys:
  mode: blocking
  api_keys:
    - key: "${PARTNER_KEY}"
      id: "key-partner-1"
      user_id: "partner-acme"
      organization_id: "org-acme"
      scopes: [read, write]
      allowed_backends: [openai, anthropic]   # may only route to these backends

    - key: "${INTERNAL_KEY}"
      id: "key-internal-1"
      user_id: "team-ml"
      organization_id: "org-internal"
      scopes: [read, write]
      allowed_backends: [vllm-local]          # limited to the self-hosted backend
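
The semantics listed above can be sketched as a filter over the backends that serve a requested model. This is an illustration under stated assumptions, not the router's code: `reachable_backends` and `route_status` are hypothetical helpers, and the statuses are reduced to bare integers.

```python
from typing import List, Optional


def reachable_backends(allowed_backends: Optional[List[str]],
                       serving_backends: List[str]) -> List[str]:
    """Backends this key may route to for the requested model."""
    if not allowed_backends:            # empty or absent => unrestricted
        return list(serving_backends)
    allowed = set(allowed_backends)     # exact, case-sensitive matching
    return [name for name in serving_backends if name in allowed]


def route_status(allowed_backends: Optional[List[str]],
                 serving_backends: List[str]) -> int:
    """404 if no backend serves the model, 403 if only disallowed ones do."""
    if not serving_backends:
        return 404
    if not reachable_backends(allowed_backends, serving_backends):
        return 403
    return 200
```

With the configuration above, the `key-internal-1` key requesting a model served only by `openai` would yield 403, while the same request with an unrestricted key would proceed.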

Mode interaction: the restriction applies in both blocking and permissive mode, but in permissive mode it only takes effect for callers that present a valid key. In permissive mode an authenticated key still attaches its policy (via a best-effort optional-auth step that never rejects), while requests with no key or an invalid key pass through unrestricted, preserving permissive mode's "anonymous welcome" behavior.

Models listing: /v1/models, /v1/models/extended, and /anthropic/v1/models are filtered when a restricted key is authenticated, so a key only advertises models served by at least one allowed backend. GET /v1/models/{model} returns 404 Not Found for a model the key cannot reach. Unauthenticated callers and unrestricted keys see the full list.

Config validation: at startup and on hot-reload, a name in allowed_backends that does not match any backends[].name produces a warning. It is not a hard error, so renaming a backend does not break the router before the operator updates the affected keys.

Admin API: allowed_backends round-trips through the create, update, get, and list endpoints. On create, an absent field defaults to unrestricted. On update, null (absent) leaves the list unchanged, an empty array clears all restrictions, and a non-empty array replaces the list. Runtime keys persist the field through the persistence file.
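
The three update cases can be sketched as follows, assuming the update body has been parsed into a Python dict so that JSON `null` and a missing field both surface as `None`. The helper name `apply_allowed_backends_update` is hypothetical.

```python
from typing import Any, Dict, List


def apply_allowed_backends_update(current: List[str],
                                  update: Dict[str, Any]) -> List[str]:
    """Return the allowed_backends list after applying an update body."""
    new_value = update.get("allowed_backends")   # None if null or absent
    if new_value is None:
        return list(current)      # leave the list unchanged
    # An empty array clears restrictions; a non-empty array replaces the list.
    return list(new_value)
```

Note the asymmetry this creates: omitting the field is a no-op, so a client that wants to lift restrictions has to send an explicit empty array.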

Anthropic API headers: /anthropic/v1/messages, /anthropic/v1/messages/count_tokens, and /anthropic/v1/models enforce the same per-key backend policy for callers authenticated with Authorization: Bearer <key>. If no Bearer AuthContext is present, a valid native Anthropic x-api-key that matches a router client key also supplies its allowed_backends policy. Invalid or absent keys in permissive mode continue to pass through without a per-key policy, matching the optional-auth behavior above.

Guardrails: PII Detection and Redaction

PII detection is one of several guardrail providers. For the full guardrails system (concepts, all five providers, streaming gating, admin controls, metrics, and the threshold-tuning workflow), see the Guardrails guide.

The pii guardrail provider detects personally identifiable information and high-value secrets in request prompts and model responses, then either redacts them in place or blocks the request. It addresses OWASP LLM02 (Sensitive Information Disclosure). Unlike the classify-only providers, its primary action is to transform content: matched spans are replaced with placeholders and the sanitized text flows on.

Built-in scanners run locally with no external dependency and cover emails, US Social Security numbers, credit-card / PAN numbers (validated with the Luhn checksum to suppress false positives), phone numbers, AWS access-key IDs, PEM private-key blocks, and bearer / sk- style API keys. An optional Microsoft Presidio-compatible analyzer can be configured for richer NER-based PII; its spans are merged with the built-in findings, and when it is unavailable the provider degrades according to on_error.
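
For readers unfamiliar with the Luhn checksum mentioned above, here is a minimal standalone version. It shows the standard algorithm only; how the built-in scanner locates candidate digit runs before validating them is not covered, and `luhn_valid` is a hypothetical name.

```python
def luhn_valid(candidate: str) -> bool:
    """Return True if the digits in `candidate` pass the Luhn checksum."""
    # Keep ASCII digits only, so spaces and dashes in card numbers are ignored.
    digits = [int(ch) for ch in candidate if ch in "0123456789"]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0
```

A random 16-digit number passes this check only about one time in ten, which is why it is useful for suppressing false positives.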

Each detected entity type maps to an action:

mask (the default): replace the matched span with a placeholder, e.g. <REDACTED:EMAIL>, and continue with the sanitized text (a Transform verdict).
block: block the request/response when this entity type is present.
allow: ignore this entity type (neither mask nor block).

Raw detected values are never written to logs; only entity types and counts are recorded for audit.
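
The mask action can be illustrated with a single entity type. The sketch below assumes a deliberately simplified email pattern (the built-in scanner's actual pattern is not documented here) and a hypothetical helper `mask_emails`; it returns only the sanitized text and a count, mirroring the rule that raw values are not recorded.

```python
import re
from typing import Tuple

# Simplified, illustrative pattern; not the router's real email scanner.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_emails(text: str,
                placeholder_format: str = "<REDACTED:{TYPE}>") -> Tuple[str, int]:
    """Replace each email span with the placeholder; return (text, count)."""
    placeholder = placeholder_format.replace("{TYPE}", "EMAIL")
    # A function replacement avoids backslash processing in the placeholder.
    return EMAIL_RE.subn(lambda _match: placeholder, text)
```

For instance, masking `"Contact alice@example.com now"` gives `"Contact <REDACTED:EMAIL> now"` with a count of 1.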

guardrails:
  enabled: true
  mode: enforce
  providers:
    - name: pii-redaction
      type: pii
      enabled: true
      # Run at both stages (the default): sanitize the prompt before it reaches
      # the backend and the response before it reaches the client.
      stages: [input, output]
      options:
        # Action for entity types not listed in `actions`. Default: mask.
        default_action: mask
        # Per-entity-type action overrides.
        actions:
          email: mask
          phone: mask
          ssn: block
          credit_card: block
          aws_access_key: block
          private_key: block
          api_key: block
        # Placeholder template for masked spans. `{TYPE}` is replaced with the
        # upper-cased entity type. Default: "<REDACTED:{TYPE}>".
        placeholder_format: "<REDACTED:{TYPE}>"
        # Optional external Presidio-compatible analyzer (built-in scanners
        # always run; external spans are merged in).
        # external:
        #   endpoint: "http://presidio-analyzer:3000/analyze"
        #   language: en
        #   entities: ["PERSON", "LOCATION", "IBAN_CODE"]
      on_error: fail_open

Options:

Property	Type	Default	Description
`default_action`	string	`mask`	Action for entity types not in `actions`: `mask`, `block`, or `allow`.
`actions`	map	`{}`	Per-entity-type action overrides, keyed by entity type (`email`, `phone`, `ssn`, `credit_card`, `aws_access_key`, `private_key`, `api_key`).
`placeholder_format`	string	`<REDACTED:{TYPE}>`	Template for masked spans; `{TYPE}` is replaced with the upper-cased entity type.
`external.endpoint`	string	-	Presidio-compatible analyzer URL. Falls back to the provider's top-level `endpoint`.
`external.entities`	array	`[]`	Restrict the external analyzer to these entity types (empty = analyzer default).
`external.language`	string	`en`	Language hint passed to the analyzer.

The provider participates in both stages by default. When several entity types are present, the most severe action wins: if any detected type maps to block, the verdict is a block; otherwise, if anything was masked, a Transform verdict carries the redacted text; otherwise the content is allowed.
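
The verdict rule can be expressed in a few lines. This is a sketch of the precedence described above, with hypothetical names (`resolve_verdict`, lowercase verdict strings); the provider's internal types are not shown in this document.

```python
from typing import Dict, Iterable


def resolve_verdict(detected_types: Iterable[str],
                    actions: Dict[str, str],
                    default_action: str = "mask") -> str:
    """Return "block", "transform", or "allow" for the detected entity types."""
    # Types not listed in `actions` fall back to default_action.
    resolved = {actions.get(entity, default_action) for entity in detected_types}
    if "block" in resolved:
        return "block"        # any blocking entity wins
    if "mask" in resolved:
        return "transform"    # redacted text flows on
    return "allow"            # nothing found, or only allowed types
```

With the example configuration, a prompt containing both an email and an SSN resolves to a block, because `ssn: block` outranks `email: mask`.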

WebUI Section

The optional webui section controls the embedded browser-based administration interface. The WebUI is compiled into the binary and served as static assets protected by admin authentication.

webui:
  enabled: true        # Enable or disable the WebUI (default: true)
  path_prefix: /webui  # URL path prefix (default: /webui)

Configuration Properties:

Property	Type	Default	Description
`enabled`	boolean	`true`	Enable or disable the WebUI
`path_prefix`	string	`/webui`	URL path prefix. Must start with `/` and must not contain `..`.

When webui is omitted, the defaults apply: WebUI is enabled at /webui. To disable the WebUI:

webui:
  enabled: false

See Embedded WebUI for a full guide to using the browser interface.

Admin Section

The admin section configures the Admin REST API, including authentication and statistics collection.

Authentication

admin:
  auth:
    method: bearer_token       # Auth method: none, bearer_token, basic, api_key
    token: "${ADMIN_TOKEN}"    # Token for bearer_token method

See Admin REST API Reference for all authentication options.

Statistics Collection

The admin.stats subsection controls request metrics collection and persistence. Stats collection is enabled by default.

admin:
  stats:
    enabled: true                # Enable/disable collection (default: true)
    retention_window: 24h        # Ring-buffer retention for windowed queries (default: 24h)
    token_tracking: true         # Parse response bodies for token usage (default: true)
    persistence:
      enabled: true              # Enable stats persistence across restarts (default: true)
      path: ./data/stats.json    # File path for the snapshot (default: ./data/stats.json)
      snapshot_interval: 5m      # How often to write periodic snapshots (default: 5m)
      max_age: 7d                # Discard snapshots older than this on startup (default: 7d)

Configuration Properties:

Property	Type	Default	Description
`enabled`	boolean	`true`	Enable or disable statistics collection
`retention_window`	string	`24h`	Ring-buffer retention window for windowed queries
`token_tracking`	boolean	`true`	Parse response bodies to extract token usage

Persistence Properties:

Property	Type	Default	Description
`persistence.enabled`	boolean	`true`	Enable stats persistence across restarts
`persistence.path`	string	`./data/stats.json`	File path for the persistence snapshot
`persistence.snapshot_interval`	string	`5m`	Interval between periodic snapshots
`persistence.max_age`	string	`7d`	Maximum age for restoring snapshots on startup

When persistence is enabled:

On startup, the router restores counters and ring-buffer records from the snapshot file. Uptime always resets to zero.
A background task writes snapshots atomically (temp file + rename) at the configured interval.
On graceful shutdown (SIGTERM/SIGINT), a final snapshot is saved.
Missing, corrupted, or stale snapshots are handled gracefully: the router starts with fresh counters and logs a warning.
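
The "temp file + rename" pattern mentioned above looks roughly like this in Python. It is a generic sketch of the technique, not the router's snapshot code; the function name and the JSON payload shape are assumptions.

```python
import json
import os
import tempfile
from typing import Any, Dict


def write_snapshot_atomically(path: str, snapshot: Dict[str, Any]) -> None:
    """Write `snapshot` to `path` so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    # Create the temp file in the target directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".stats-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(snapshot, handle)
            handle.flush()
            os.fsync(handle.fileno())   # push the data to disk before renaming
        os.replace(tmp_path, path)      # atomically swap in the new snapshot
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Because the rename either happens fully or not at all, a crash mid-write leaves the previous snapshot intact rather than a truncated file.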

Hot Reload: retention_window and token_tracking support immediate hot-reload. Persistence settings (path, snapshot_interval, max_age) require a restart.

Supported duration formats for retention_window, snapshot_interval, and max_age:

Format	Example	Meaning
`Xs`	`30s`	30 seconds
`Xm`	`5m`	5 minutes
`Xh`	`1h`	1 hour
`Xd`	`7d`	7 days
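
A parser for the four formats in the table could look like the sketch below. It accepts only a whole number followed by one unit letter; whether the router also accepts other forms (combined units, fractions) is not stated here, so the sketch rejects them. `parse_duration` is a hypothetical name.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(value: str) -> int:
    """Convert a duration such as "5m" or "7d" to a number of seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", value.strip())
    if match is None:
        raise ValueError(f"unsupported duration format: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]
```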

See Admin REST API Reference — Statistics APIs for the full endpoint documentation.

ACP (Agent Communication Protocol) Section

The acp section configures the Agent Communication Protocol subsystem. ACP enables IDE and tool integrations to communicate with the router via JSON-RPC 2.0 over stdio. ACP is disabled by default for backward compatibility.

To use ACP, run the router with --mode stdio.
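
For orientation, a JSON-RPC 2.0 request is a JSON object with `jsonrpc`, `id`, `method`, and optional `params` members. The sketch below encodes one as a single line of text. The newline-delimited framing and the `initialize` method name used in the usage note are assumptions made for illustration; see ACP Architecture for the actual protocol details.

```python
import json
from typing import Any, Optional


def encode_request(request_id: int, method: str,
                   params: Optional[Any] = None) -> str:
    """Serialize a JSON-RPC 2.0 request as one newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    # Assumed framing: one compact JSON object per line on stdin/stdout.
    return json.dumps(message, separators=(",", ":")) + "\n"
```

A client would write the result of something like `encode_request(1, "initialize")` to the stdin of the router process started with `--mode stdio` and read the response from its stdout.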

acp:
  enabled: true

  transport:
    stdio:
      enabled: true

  agent:
    name: "Continuum Router"
    version: "1.0.0"
    description: "Local LLM inference agent"

  capabilities:
    load_session: true
    image: false
    audio: false
    embedded_context: false
    mcp: true

  default_model: "gpt-4o"
  system_prompt: "You are a helpful coding assistant."
  coding_agent_mode: true

  permissions:
    default_policy: ask_always
    auto_allow:
      - read
      - search
      - think
    always_ask:
      - edit
      - delete
      - execute

  sessions:
    max_concurrent: 10
    idle_timeout: "1h"
    storage: "memory"

  mcp:
    max_connections_per_session: 5
    allowed_servers: []
    server_spawn_timeout: "10s"

Top-Level Options

Property	Type	Default	Description
`enabled`	bool	`false`	Enable/disable the ACP subsystem
`default_model`	string	none	Override model selection for ACP sessions
`system_prompt`	string	none	Inject a system prompt into all ACP requests
`coding_agent_mode`	bool	`false`	Enable coding agent system prompt

Transport Options

Property	Type	Default	Description
`transport.stdio.enabled`	bool	`true`	Enable stdio transport

Permission Options

Property	Type	Default	Description
`permissions.default_policy`	enum	`ask_always`	Default policy: `ask_always`, `allow_read`, `allow_all`
`permissions.auto_allow`	list	`[read, search, think]`	Tool kinds auto-allowed without asking
`permissions.always_ask`	list	`[edit, delete, execute]`	Tool kinds that always require permission

Session Options

Property	Type	Default	Description
`sessions.max_concurrent`	int	`10`	Maximum concurrent sessions
`sessions.idle_timeout`	string	`"1h"`	Idle timeout before session cleanup
`sessions.storage`	string	`"memory"`	Storage backend: `memory` or `file`
`sessions.storage_path`	string	none	Path for file-based storage

MCP Bridge Options

Property	Type	Default	Description
`mcp.max_connections_per_session`	int	`5`	Max MCP connections per session
`mcp.allowed_servers`	list	`[]`	Allowed server IDs (empty = all)
`mcp.server_spawn_timeout`	string	`"10s"`	Timeout for spawning MCP server processes

See ACP Architecture for protocol details and ACP Usage Guide for practical examples.