Security & Admin

API Keys Section

API keys control client access to the router's endpoints. Keys can be defined inline in the configuration, loaded from an external key file, or managed at runtime through the Admin API.

Authentication Mode

The mode setting controls whether API endpoints require a valid API key:

| Mode | Behavior |
| --- | --- |
| permissive (default) | Allow requests without an API key. Requests with valid API keys are authenticated. |
| blocking | Only process requests that pass API key authentication. Unauthenticated requests receive 401. |

Target Endpoints (when mode is blocking):

  • /v1/chat/completions
  • /v1/completions
  • /v1/responses
  • /v1/images/generations
  • /v1/images/edits
  • /v1/images/variations
  • /v1/models

Note: Admin, Files, and Metrics endpoints have separate authentication mechanisms and are not affected by this setting.
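The two modes can be summarized as a small decision function. This is a minimal sketch of the behavior described above, not the router's implementation: the function name auth_outcome, its parameters, and the use of 200 as a stand-in for "the request is processed" are all assumptions made for illustration.

```python
from typing import Optional, Set, Tuple


def auth_outcome(mode: str, presented_key: Optional[str], valid_keys: Set[str]) -> Tuple[int, bool]:
    """Return (http_status, authenticated) for a request to a protected endpoint.

    Hypothetical helper: 200 stands in for "request is processed".
    The plain set lookup is a simplification; real validation is constant-time.
    """
    authenticated = presented_key is not None and presented_key in valid_keys
    if mode == "blocking" and not authenticated:
        # blocking: unauthenticated requests receive 401
        return 401, False
    # permissive: anonymous requests pass, valid keys are still authenticated
    return 200, authenticated
```

In permissive mode a missing or invalid key still yields a processed request, only without an authenticated identity attached.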

Section Configuration Properties:

| Property | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| mode | string | No | permissive | Authentication mode: permissive or blocking |
| api_keys | array | No | [] | Inline API key definitions |
| api_keys_file | string | No | - | Path to external API keys file |

api_keys:
  # Authentication mode: "permissive" (default) or "blocking"
  mode: permissive

  # Inline API key definitions
  api_keys:
    - key: "${API_KEY_1}"              # Environment variable substitution
      id: "key-production-1"           # Unique identifier
      user_id: "user-admin"            # Associated user
      organization_id: "org-main"      # Associated organization
      name: "Production Admin Key"     # Human-readable name
      scopes:                          # Permissions
        - read
        - write
        - files
        - admin
      rate_limit: 1000                 # Requests per minute (optional)
      enabled: true                    # Active status
      expires_at: "2025-12-31T23:59:59Z"  # Optional expiration (ISO 8601)

    - key: "${API_KEY_2}"
      id: "key-service-1"
      user_id: "service-bot"
      organization_id: "org-main"
      name: "Service Account"
      scopes: [read, write, files]
      rate_limit: 500
      enabled: true

  # External key file for better security
  api_keys_file: "/etc/continuum-router/api-keys.yaml"

Key Properties:

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| key | string | Yes | The API key value (supports ${ENV_VAR} substitution) |
| id | string | Yes | Unique identifier for admin operations |
| user_id | string | Yes | User associated with this key |
| organization_id | string | Yes | Organization the user belongs to |
| name | string | No | Human-readable name |
| description | string | No | Notes about the key |
| scopes | array | Yes | Permissions: read, write, files, admin |
| rate_limit | integer | No | Maximum requests per minute |
| enabled | boolean | No | Active status (default: true) |
| expires_at | string | No | ISO 8601 expiration timestamp |
| allowed_backends | array | No | Per-key backend allow-list. Empty/absent ⇒ unrestricted. See Per-Key Backend Access Control below. |

External Key File Format:

# /etc/continuum-router/api-keys.yaml
keys:
    - key: "sk-prod-xxxxxxxxxxxxxxxxxxxxx"
    id: "key-external-1"
    user_id: "external-user"
    organization_id: "external-org"
    scopes: [read, write, files]
    enabled: true

Security Features:

  • Key Masking: Full keys are never logged (displayed as sk-***last4)
  • Expiration Enforcement: Expired keys are automatically rejected
  • Hot Reload: Update keys without server restart
  • Audit Logging: All key management operations are logged
  • Constant-Time Validation: Prevents timing attacks
  • Max Key Limit: 10,000 keys maximum to prevent DoS
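Two of these features, key masking and constant-time validation, are easy to illustrate. The sketch below is an assumption-laden illustration rather than the router's code: mask_key and keys_match are hypothetical names, and the exact masked format is inferred from the "sk-***last4" description above.

```python
import hmac


def mask_key(key: str) -> str:
    """Render a key for logs as sk-*** plus its last four characters (assumed format)."""
    return "sk-***" + key[-4:]


def keys_match(presented: str, stored: str) -> bool:
    """Compare two keys in time independent of where they first differ."""
    # hmac.compare_digest avoids the early exit of ==, which can leak
    # how many leading characters of a guess were correct.
    return hmac.compare_digest(presented.encode("utf-8"), stored.encode("utf-8"))
```

A plain == comparison returns as soon as a mismatching character is found, which is the timing signal that constant-time comparison removes.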

Admin API Endpoints (require admin authentication):

| Endpoint | Method | Description |
| --- | --- | --- |
| /admin/api-keys | GET | List all keys (masked) |
| /admin/api-keys/:id | GET | Get key details |
| /admin/api-keys | POST | Create new key |
| /admin/api-keys/:id | PUT | Update key properties |
| /admin/api-keys/:id | DELETE | Delete key |
| /admin/api-keys/:id/rotate | POST | Generate new key value |
| /admin/api-keys/:id/enable | POST | Enable key |
| /admin/api-keys/:id/disable | POST | Disable key |

Per-Key Backend Access Control

allowed_backends scopes a client key to a subset of configured backends. This is client-key → backend access control, separate from backends[].api_key (the upstream provider credential the router uses to call OpenAI, Anthropic, and the rest).

When the list is non-empty, requests authenticated with that key may only route to the named backends. A request for a model that only a disallowed backend can serve is rejected with 403 Forbidden and a permission_error body. An empty or absent list leaves the key unrestricted, which is the default and matches the behavior of keys created before this field existed.

Semantics:

  • Empty / absent ⇒ unrestricted (the key can route to any backend that serves the requested model).
  • Non-empty ⇒ allow-list of backends[].name values. Matching is exact and case-sensitive, the same as backend-name resolution elsewhere.
  • Unservable request ⇒ 403 Forbidden ({"error": {"type": "permission_error", ...}}). The 403 names the model so operators can tell it apart from a 404 ("model not found").
  • Cross-provider fallback is filtered by the same list, so a restricted key cannot escape its scope through a fallback hop.

api_keys:
  mode: blocking
  api_keys:
    - key: "${PARTNER_KEY}"
      id: "key-partner-1"
      user_id: "partner-acme"
      organization_id: "org-acme"
      scopes: [read, write]
      allowed_backends: [openai, anthropic]   # may only route to these backends

    - key: "${INTERNAL_KEY}"
      id: "key-internal-1"
      user_id: "team-ml"
      organization_id: "org-internal"
      scopes: [read, write]
      allowed_backends: [vllm-local]          # limited to the self-hosted backend
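The allow-list semantics can be sketched as a filter over the backends that serve a requested model. The names reachable_backends and route_status are hypothetical, and returning a bare status code is a simplification of the router's actual response; the sketch only mirrors the rules stated above.

```python
from typing import List, Optional


def reachable_backends(candidates: List[str], allowed_backends: Optional[List[str]]) -> List[str]:
    """Filter the backends serving a model by a key's allow-list."""
    if not allowed_backends:
        # Empty or absent list: the key is unrestricted.
        return list(candidates)
    allowed = set(allowed_backends)
    # Exact, case-sensitive match against backend names.
    return [name for name in candidates if name in allowed]


def route_status(candidates: List[str], allowed_backends: Optional[List[str]]) -> int:
    """Return the status a routing attempt would produce under these rules."""
    if not candidates:
        return 404  # no backend serves the model at all
    if not reachable_backends(candidates, allowed_backends):
        return 403  # the model exists, but only on disallowed backends
    return 200
```

Applying the same filter to fallback candidates is what keeps a restricted key from leaving its scope through a fallback hop.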

Mode interaction: the restriction applies in both blocking and permissive mode, but in permissive mode it only takes effect for callers that present a valid key. In permissive mode an authenticated key still attaches its policy (via a best-effort optional-auth step that never rejects), while requests with no key or an invalid key pass through unrestricted, preserving permissive mode's "anonymous welcome" behavior.

Models listing: /v1/models, /v1/models/extended, and /anthropic/v1/models are filtered when a restricted key is authenticated, so a key only advertises models served by at least one allowed backend. GET /v1/models/{model} returns 404 Not Found for a model the key cannot reach. Unauthenticated callers and unrestricted keys see the full list.

Config validation: at startup and on hot-reload, a name in allowed_backends that does not match any backends[].name produces a warning. It is not a hard error, so renaming a backend does not break the router before the operator updates the affected keys.

Admin API: allowed_backends round-trips through the create, update, get, and list endpoints. On create, an absent field defaults to unrestricted. On update, null (absent) leaves the list unchanged, an empty array clears all restrictions, and a non-empty array replaces the list. Runtime keys persist the field through the persistence file.
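The three update cases are easy to confuse, so here is a minimal sketch of how a parsed update body could be applied. The function name apply_allowed_backends_update and the dict-shaped body are assumptions for illustration, not the Admin API's internals.

```python
from typing import Any, Dict, List


def apply_allowed_backends_update(current: List[str], body: Dict[str, Any]) -> List[str]:
    """Apply the allowed_backends field of a parsed update body to a key."""
    update = body.get("allowed_backends")  # None when the field is absent or JSON null
    if update is None:
        return list(current)  # absent / null: leave the list unchanged
    # [] clears all restrictions; a non-empty array replaces the list.
    return list(update)
```

The practical consequence is that a client must send an explicit empty array to lift a restriction; omitting the field leaves the existing list in place.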

Anthropic API headers: /anthropic/v1/messages, /anthropic/v1/messages/count_tokens, and /anthropic/v1/models enforce the same per-key backend policy for callers authenticated with Authorization: Bearer <key>. If no Bearer AuthContext is present, a valid native Anthropic x-api-key that matches a router client key also supplies its allowed_backends policy. Invalid or absent keys in permissive mode continue to pass through without a per-key policy, matching the optional-auth behavior above.

Guardrails: PII Detection and Redaction

PII detection is one of several guardrail providers. For the full guardrails system (concepts, all five providers, streaming gating, admin controls, metrics, and the threshold-tuning workflow), see the Guardrails guide.

The pii guardrail provider detects personally identifiable information and high-value secrets in request prompts and model responses, then either redacts them in place or blocks the request. It addresses OWASP LLM02 (Sensitive Information Disclosure). Unlike the classify-only providers, its primary action is to transform content: matched spans are replaced with placeholders and the sanitized text flows on.

Built-in scanners run locally with no external dependency and cover emails, US Social Security numbers, credit-card / PAN numbers (validated with the Luhn checksum to suppress false positives), phone numbers, AWS access-key IDs, PEM private-key blocks, and bearer / sk- style API keys. An optional Microsoft Presidio-compatible analyzer can be configured for richer NER-based PII; its spans are merged with the built-in findings, and when it is unavailable the provider degrades according to on_error.
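For reference, the Luhn checksum mentioned above works as follows. This is a generic sketch of the standard algorithm, not the router's scanner; the function name luhn_valid and the handling of spaces and dashes are assumptions.

```python
def luhn_valid(number: str) -> bool:
    """Return True when the digits of `number` pass the Luhn checksum."""
    cleaned = number.replace(" ", "").replace("-", "")
    if len(cleaned) < 2 or not (cleaned.isascii() and cleaned.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, char in enumerate(reversed(cleaned)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0
```

A random 16-digit string passes this check only about one time in ten, which is why the checksum suppresses most false positives from order numbers and similar digit runs.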

Each detected entity type maps to an action:

  • mask (the default): replace the matched span with a placeholder, e.g. <REDACTED:EMAIL>, and continue with the sanitized text (a Transform verdict).
  • block: block the request/response when this entity type is present.
  • allow: ignore this entity type (neither mask nor block).

Raw detected values are never written to logs; only entity types and counts are recorded for audit.

guardrails:
  enabled: true
  mode: enforce
  providers:
    - name: pii-redaction
      type: pii
      enabled: true
      # Run at both stages (the default): sanitize the prompt before it reaches
      # the backend and the response before it reaches the client.
      stages: [input, output]
      options:
        # Action for entity types not listed in `actions`. Default: mask.
        default_action: mask
        # Per-entity-type action overrides.
        actions:
          email: mask
          phone: mask
          ssn: block
          credit_card: block
          aws_access_key: block
          private_key: block
          api_key: block
        # Placeholder template for masked spans. `{TYPE}` is replaced with the
        # upper-cased entity type. Default: "<REDACTED:{TYPE}>".
        placeholder_format: "<REDACTED:{TYPE}>"
        # Optional external Presidio-compatible analyzer (built-in scanners
        # always run; external spans are merged in).
        # external:
        #   endpoint: "http://presidio-analyzer:3000/analyze"
        #   language: en
        #   entities: ["PERSON", "LOCATION", "IBAN_CODE"]
      on_error: fail_open

Options:

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| default_action | string | mask | Action for entity types not in actions: mask, block, or allow. |
| actions | map | {} | Per-entity-type action overrides, keyed by entity type (email, phone, ssn, credit_card, aws_access_key, private_key, api_key). |
| placeholder_format | string | <REDACTED:{TYPE}> | Template for masked spans; {TYPE} is replaced with the upper-cased entity type. |
| external.endpoint | string | - | Presidio-compatible analyzer URL. Falls back to the provider's top-level endpoint. |
| external.entities | array | [] | Restrict the external analyzer to these entity types (empty = analyzer default). |
| external.language | string | en | Language hint passed to the analyzer. |

The provider participates in both stages by default. When several entity types are present, a blocking entity wins over masking (most-severe-wins): the verdict is a block; otherwise, if anything was masked, a Transform carries the redacted text; otherwise the content is allowed.
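The most-severe-wins rule and the placeholder template can be sketched in a few lines. The names resolve_verdict and placeholder, and the lowercase verdict strings, are hypothetical; the sketch only mirrors the precedence described above.

```python
from typing import Dict, Iterable


def resolve_verdict(found_types: Iterable[str], actions: Dict[str, str],
                    default_action: str = "mask") -> str:
    """Combine per-entity actions into one verdict: block > transform > allow."""
    resolved = [actions.get(entity, default_action) for entity in found_types]
    if "block" in resolved:
        return "block"      # any blocking entity wins
    if "mask" in resolved:
        return "transform"  # redacted text flows on
    return "allow"          # nothing found, or everything found is allowed


def placeholder(entity_type: str, template: str = "<REDACTED:{TYPE}>") -> str:
    """Fill the placeholder template with the upper-cased entity type."""
    return template.replace("{TYPE}", entity_type.upper())
```

With the example configuration above, a prompt containing both an email address and an SSN is blocked, while one containing only an email address is transformed.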

WebUI Section

The optional webui section controls the embedded browser-based administration interface. The WebUI is compiled into the binary and served as static assets protected by admin authentication.

webui:
  enabled: true        # Enable or disable the WebUI (default: true)
  path_prefix: /webui  # URL path prefix (default: /webui)

Configuration Properties:

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable or disable the WebUI |
| path_prefix | string | /webui | URL path prefix. Must start with / and must not contain ... |

When webui is omitted, the defaults apply: WebUI is enabled at /webui. To disable the WebUI:

webui:
  enabled: false

See Embedded WebUI for a full guide to using the browser interface.

Admin Section

The admin section configures the Admin REST API, including authentication and statistics collection.

Authentication

admin:
  auth:
    method: bearer_token       # Auth method: none, bearer_token, basic, api_key
    token: "${ADMIN_TOKEN}"    # Token for bearer_token method

See Admin REST API Reference for all authentication options.

Statistics Collection

The admin.stats subsection controls request metrics collection and persistence. Stats collection is enabled by default.

admin:
  stats:
    enabled: true                # Enable/disable collection (default: true)
    retention_window: 24h        # Ring-buffer retention for windowed queries (default: 24h)
    token_tracking: true         # Parse response bodies for token usage (default: true)
    persistence:
      enabled: true              # Enable stats persistence across restarts (default: true)
      path: ./data/stats.json    # File path for the snapshot (default: ./data/stats.json)
      snapshot_interval: 5m      # How often to write periodic snapshots (default: 5m)
      max_age: 7d                # Discard snapshots older than this on startup (default: 7d)

Configuration Properties:

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable or disable statistics collection |
| retention_window | string | 24h | Ring-buffer retention window for windowed queries |
| token_tracking | boolean | true | Parse response bodies to extract token usage |

Persistence Properties:

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| persistence.enabled | boolean | true | Enable stats persistence across restarts |
| persistence.path | string | ./data/stats.json | File path for the persistence snapshot |
| persistence.snapshot_interval | string | 5m | Interval between periodic snapshots |
| persistence.max_age | string | 7d | Maximum age for restoring snapshots on startup |

When persistence is enabled:

  • On startup, the router restores counters and ring-buffer records from the snapshot file. Uptime always resets to zero.
  • A background task writes snapshots atomically (temp file + rename) at the configured interval.
  • On graceful shutdown (SIGTERM/SIGINT), a final snapshot is saved.
  • Missing, corrupted, or stale snapshots are handled gracefully: the router starts with fresh counters and logs a warning.
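The "temp file + rename" pattern mentioned above is a common way to make sure a reader never sees a half-written snapshot. The following is a generic sketch of that pattern, not the router's code; write_snapshot and the JSON payload shape are assumptions.

```python
import json
import os
import tempfile


def write_snapshot(path: str, stats: dict) -> None:
    """Write `stats` to `path` so that the file is either old or new, never partial."""
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    # The temp file lives in the target directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(stats, handle)
            handle.flush()
            os.fsync(handle.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, path)     # atomic swap of the old snapshot
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the process dies mid-write, the previous snapshot is still intact at the original path, which is what makes restore on startup safe.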

Hot Reload: retention_window and token_tracking support immediate hot-reload. Persistence settings (path, snapshot_interval, max_age) require a restart.

Supported duration formats for retention_window, snapshot_interval, and max_age:

| Format | Example | Meaning |
| --- | --- | --- |
| Xs | 30s | 30 seconds |
| Xm | 5m | 5 minutes |
| Xh | 1h | 1 hour |
| Xd | 7d | 7 days |
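A parser for these four formats can be sketched as below. It is an illustration only: parse_duration is a hypothetical name, and rejecting compound values such as 1h30m is an assumption, since the formats listed above each use a single unit.

```python
import re
from datetime import timedelta

# Map each documented unit suffix to the matching timedelta keyword.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text: str) -> timedelta:
    """Parse a duration such as 30s, 5m, 1h, or 7d."""
    match = re.fullmatch(r"(\d+)([smhd])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(value)})
```

For example, parse_duration("5m") yields a five-minute timedelta, and a value such as "5 minutes" raises ValueError.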

See Admin REST API Reference — Statistics APIs for the full endpoint documentation.

ACP (Agent Communication Protocol) Section

The acp section configures the Agent Communication Protocol subsystem. ACP enables IDE and tool integrations to communicate with the router via JSON-RPC 2.0 over stdio. ACP is disabled by default for backward compatibility.

To use ACP, run the router with --mode stdio.

acp:
  enabled: true

  transport:
    stdio:
      enabled: true

  agent:
    name: "Continuum Router"
    version: "1.0.0"
    description: "Local LLM inference agent"

  capabilities:
    load_session: true
    image: false
    audio: false
    embedded_context: false
    mcp: true

  default_model: "gpt-4o"
  system_prompt: "You are a helpful coding assistant."
  coding_agent_mode: true

  permissions:
    default_policy: ask_always
    auto_allow:
      - read
      - search
      - think
    always_ask:
      - edit
      - delete
      - execute

  sessions:
    max_concurrent: 10
    idle_timeout: "1h"
    storage: "memory"

  mcp:
    max_connections_per_session: 5
    allowed_servers: []
    server_spawn_timeout: "10s"

Top-Level Options

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Enable/disable the ACP subsystem |
| default_model | string | none | Override model selection for ACP sessions |
| system_prompt | string | none | Inject a system prompt into all ACP requests |
| coding_agent_mode | bool | false | Enable coding agent system prompt |

Transport Options

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| transport.stdio.enabled | bool | true | Enable stdio transport |

Permission Options

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| permissions.default_policy | enum | ask_always | Default policy: ask_always, allow_read, allow_all |
| permissions.auto_allow | list | [read, search, think] | Tool kinds auto-allowed without asking |
| permissions.always_ask | list | [edit, delete, execute] | Tool kinds that always require permission |

Session Options

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| sessions.max_concurrent | int | 10 | Maximum concurrent sessions |
| sessions.idle_timeout | string | "1h" | Idle timeout before session cleanup |
| sessions.storage | string | "memory" | Storage backend: memory or file |
| sessions.storage_path | string | none | Path for file-based storage |

MCP Bridge Options

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| mcp.max_connections_per_session | int | 5 | Max MCP connections per session |
| mcp.allowed_servers | list | [] | Allowed server IDs (empty = all) |
| mcp.server_spawn_timeout | string | "10s" | Timeout for spawning MCP server processes |

See ACP Architecture for protocol details and ACP Usage Guide for practical examples.