Admin REST API Reference

This document provides a comprehensive guide for developers building configuration control applications using Continuum Router's Admin REST API. The Configuration Management API enables runtime configuration viewing, modification, and management without server restarts.


Overview

The Admin REST API provides programmatic access to Continuum Router's configuration system, enabling:

  • Real-time Configuration Viewing: Retrieve current configuration with automatic sensitive data masking
  • Dynamic Configuration Updates: Modify configuration sections without server restart
  • Configuration Versioning: Track changes with full history and rollback capabilities
  • Backend Management: Add, remove, and modify backends dynamically
  • Export/Import: Save and restore configurations in multiple formats (YAML, JSON, TOML)

Key Features

| Feature | Description |
|---|---|
| Hot Reload | Changes are applied immediately or gradually, depending on the section; some sections require a restart |
| Sensitive Masking | API keys, passwords, and tokens are automatically masked in responses |
| Validation | All changes are validated before they are applied, with dry-run support |
| Audit Logging | All modifications are logged for security and compliance |
| History Tracking | Up to 100 configuration versions are maintained for rollback |

Authentication

All Admin API endpoints require authentication via the Admin Auth system.

Authentication Methods

1. Bearer Token

Authorization: Bearer <admin-token>

curl -H "Authorization: Bearer your-admin-token" \
  http://localhost:8080/admin/config/full

2. Basic Authentication

Authorization: Basic <base64(username:password)>

curl -u admin:password http://localhost:8080/admin/config/full

3. API Key Header

X-API-Key: <admin-api-key>

curl -H "X-API-Key: your-admin-key" http://localhost:8080/admin/config/full
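For clients that do not use curl, the three header formats above can be built programmatically. The following is a minimal sketch; the function name `admin_auth_headers` and its keyword arguments are illustrative and not part of the router.

```python
import base64


def admin_auth_headers(method, *, token=None, username=None, password=None, api_key=None):
    """Return the HTTP headers for one of the three documented auth methods.

    The method names mirror the `admin.auth.method` options; the function
    itself is a hypothetical client-side helper.
    """
    if method == "bearer_token":
        return {"Authorization": f"Bearer {token}"}
    if method == "basic":
        # Basic auth sends base64("username:password").
        raw = f"{username}:{password}".encode("utf-8")
        return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
    if method == "api_key":
        return {"X-API-Key": api_key}
    raise ValueError(f"unsupported auth method: {method!r}")


print(admin_auth_headers("basic", username="admin", password="password"))
```

The returned dictionary can be passed to any HTTP client as request headers.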

Configuration

Configure admin authentication in config.yaml:

admin:
  auth:
    method: bearer_token  # Options: none, bearer_token, basic, api_key
    token: "${ADMIN_TOKEN}"  # Environment variable supported
    # For basic auth:
    # username: admin
    # password: "${ADMIN_PASSWORD}"

  # IP whitelist (optional)
  ip_whitelist:
    - "127.0.0.1"
    - "10.0.0.0/8"

  # Configurable limits
  max_history_entries: 100
  max_backend_name_length: 256
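The whitelist mixes single addresses and CIDR ranges. As a rough illustration of how such entries are typically matched (the router's exact matching rules are not described here, so this is an assumption), a client IP can be checked with Python's standard `ipaddress` module:

```python
import ipaddress


def ip_allowed(client_ip, whitelist):
    """Return True if client_ip matches any whitelist entry.

    Hypothetical helper: entries may be single addresses or CIDR ranges.
    """
    addr = ipaddress.ip_address(client_ip)
    for entry in whitelist:
        # strict=False lets a plain address such as "127.0.0.1" parse as a /32 network.
        if addr in ipaddress.ip_network(entry, strict=False):
            return True
    return False


whitelist = ["127.0.0.1", "10.0.0.0/8"]
print(ip_allowed("10.20.30.40", whitelist))
print(ip_allowed("192.168.1.5", whitelist))
```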

Base URL and Headers

Base URL

http://localhost:8080/admin

Common Request Headers

Content-Type: application/json
Accept: application/json
Authorization: Bearer <token>

Common Response Headers

Content-Type: application/json
X-Request-Id: <unique-request-id>

Configuration Query APIs

Get Full Configuration

Retrieve the complete configuration with sensitive information masked.

GET /admin/config/full

Response

{
  "config": {
    "server": {
      "bind_address": "0.0.0.0:8080",
      "workers": 4
    },
    "backends": [
      {
        "name": "openai",
        "url": "https://api.openai.com",
        "api_key": "sk-***abcd",
        "weight": 1
      }
    ],
    "logging": {
      "level": "info"
    },
    "rate_limiting": {
      "enabled": true,
      "requests_per_minute": 100
    }
  },
  "hot_reload_enabled": true,
  "last_modified": "2025-12-13T10:30:00Z"
}

Example

curl -s http://localhost:8080/admin/config/full \
  -H "Authorization: Bearer $ADMIN_TOKEN" | jq

List Configuration Sections

Get all available configuration sections with their hot reload capabilities.

GET /admin/config/sections

Response

{
  "sections": [
    {
      "name": "server",
      "description": "Server configuration including bind address and workers",
      "hot_reload_capability": "requires_restart"
    },
    {
      "name": "backends",
      "description": "Backend server configurations",
      "hot_reload_capability": "gradual"
    },
    {
      "name": "logging",
      "description": "Logging configuration",
      "hot_reload_capability": "immediate"
    },
    {
      "name": "rate_limiting",
      "description": "Rate limiting configuration",
      "hot_reload_capability": "immediate"
    },
    {
      "name": "circuit_breaker",
      "description": "Circuit breaker configuration",
      "hot_reload_capability": "immediate"
    },
    {
      "name": "retry",
      "description": "Retry policy configuration",
      "hot_reload_capability": "immediate"
    },
    {
      "name": "timeouts",
      "description": "Timeout configuration",
      "hot_reload_capability": "gradual"
    },
    {
      "name": "health_checks",
      "description": "Health check configuration",
      "hot_reload_capability": "gradual"
    },
    {
      "name": "global_prompts",
      "description": "Global prompt injection configuration",
      "hot_reload_capability": "immediate"
    },
    {
      "name": "fallback",
      "description": "Model fallback configuration",
      "hot_reload_capability": "gradual"
    },
    {
      "name": "files",
      "description": "Files API configuration",
      "hot_reload_capability": "gradual"
    },
    {
      "name": "api_keys",
      "description": "API keys configuration",
      "hot_reload_capability": "immediate"
    },
    {
      "name": "metrics",
      "description": "Metrics and monitoring configuration",
      "hot_reload_capability": "gradual"
    },
    {
      "name": "admin",
      "description": "Admin API configuration",
      "hot_reload_capability": "gradual"
    },
    {
      "name": "routing",
      "description": "Request routing configuration",
      "hot_reload_capability": "gradual"
    }
  ]
}

Example

curl -s http://localhost:8080/admin/config/sections \
  -H "Authorization: Bearer $ADMIN_TOKEN" | jq '.sections[].name'
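A control application usually needs to know which sections can be changed live. The sketch below groups a `/admin/config/sections` response by `hot_reload_capability`; the helper name and the shortened sample payload are illustrative.

```python
from collections import defaultdict


def group_by_capability(sections_response):
    """Map each hot_reload_capability value to the section names that use it."""
    groups = defaultdict(list)
    for section in sections_response["sections"]:
        groups[section["hot_reload_capability"]].append(section["name"])
    return dict(groups)


# Shortened sample in the shape of the response shown above.
sample = {
    "sections": [
        {"name": "server", "hot_reload_capability": "requires_restart"},
        {"name": "backends", "hot_reload_capability": "gradual"},
        {"name": "logging", "hot_reload_capability": "immediate"},
        {"name": "rate_limiting", "hot_reload_capability": "immediate"},
    ]
}
print(group_by_capability(sample))
```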

Get Section Configuration

Retrieve configuration for a specific section.

GET /admin/config/{section}

Path Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| section | string | Yes | Section name (see list above) |

Response

{
  "section": "logging",
  "config": {
    "level": "info",
    "format": "json",
    "file": "/var/log/continuum-router.log"
  },
  "hot_reload_capability": "immediate",
  "description": "Logging configuration"
}

Example

# Get logging configuration
curl -s http://localhost:8080/admin/config/logging \
  -H "Authorization: Bearer $ADMIN_TOKEN" | jq

# Get backends configuration
curl -s http://localhost:8080/admin/config/backends \
  -H "Authorization: Bearer $ADMIN_TOKEN" | jq

Get Configuration Schema

Retrieve JSON Schema for configuration validation.

GET /admin/config/schema

Query Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| section | string | No | Get schema for specific section only |

Response

{
  "schema": {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "properties": {
      "server": {
        "type": "object",
        "properties": {
          "bind_address": {
            "type": "string",
            "pattern": "^[^:]+:[0-9]+$",
            "description": "Server bind address in host:port format"
          },
          "workers": {
            "type": "integer",
            "minimum": 1,
            "description": "Number of worker threads"
          }
        }
      },
      "logging": {
        "type": "object",
        "properties": {
          "level": {
            "type": "string",
            "enum": ["trace", "debug", "info", "warn", "error"]
          }
        }
      }
    }
  }
}

Example

# Get full schema
curl -s http://localhost:8080/admin/config/schema \
  -H "Authorization: Bearer $ADMIN_TOKEN" | jq

# Get schema for specific section
curl -s "http://localhost:8080/admin/config/schema?section=logging" \
  -H "Authorization: Bearer $ADMIN_TOKEN" | jq

Configuration Modification APIs

Replace Section Configuration

Replace entire section configuration with new values.

PUT /admin/config/{section}

Request Body

{
  "config": {
    "level": "debug",
    "format": "json"
  }
}

Response

{
  "success": true,
  "message": "Configuration updated successfully",
  "version": 5,
  "hot_reload_capability": "immediate",
  "applied": true,
  "warnings": []
}

Example

# Replace the logging section (PUT overwrites the whole section; use PATCH to change one field)
curl -X PUT http://localhost:8080/admin/config/logging \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "config": {
      "level": "debug"
    }
  }'

Partial Update Section

Apply partial updates using JSON merge patch semantics.

PATCH /admin/config/{section}

Request Body

{
  "config": {
    "level": "warn"
  }
}

Only specified fields are updated; other fields remain unchanged.

Response

{
  "success": true,
  "message": "Configuration partially updated",
  "version": 6,
  "hot_reload_capability": "immediate",
  "applied": true,
  "merged_config": {
    "level": "warn",
    "format": "json",
    "file": "/var/log/continuum-router.log"
  }
}

Example

# Update only rate limit value
curl -X PATCH http://localhost:8080/admin/config/rate_limiting \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "config": {
      "requests_per_minute": 200
    }
  }'
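JSON merge patch is defined in RFC 7396. The sketch below implements the standard algorithm client-side, which can help preview what `merged_config` should look like before sending a PATCH. Whether the router treats `null` values as deletions exactly as the RFC does is an assumption.

```python
def merge_patch(target, patch):
    """Apply an RFC 7396 JSON merge patch to target and return the result."""
    if not isinstance(patch, dict):
        # A non-object patch replaces the target outright.
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            # In RFC 7396, null removes the key.
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


current = {"level": "info", "format": "json", "file": "/var/log/continuum-router.log"}
print(merge_patch(current, {"level": "warn"}))
```

Only `level` changes; `format` and `file` are carried over, matching the `merged_config` in the response above.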

Validate Configuration

Validate configuration changes without applying them.

POST /admin/config/validate

Request Body

{
  "section": "server",
  "config": {
    "bind_address": "0.0.0.0:9090",
    "workers": 8
  },
  "dry_run": true
}

Response (Valid)

{
  "valid": true,
  "errors": [],
  "warnings": [
    {
      "field": "bind_address",
      "message": "Changing bind_address requires server restart"
    }
  ],
  "hot_reload_capability": "requires_restart"
}

Response (Invalid)

{
  "valid": false,
  "errors": [
    {
      "field": "workers",
      "message": "workers must be greater than 0",
      "code": "VALIDATION_ERROR"
    }
  ],
  "warnings": []
}

Example

# Validate before applying
curl -X POST http://localhost:8080/admin/config/validate \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "section": "rate_limiting",
    "config": {
      "enabled": true,
      "requests_per_minute": 500
    }
  }'
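A typical workflow is to validate first and apply only when the result is clean. The helper below is a minimal sketch that classifies a validation response; the outcome labels are illustrative names, not values returned by the API.

```python
def validation_outcome(response):
    """Classify a /admin/config/validate response.

    Returns "rejected", "valid_needs_restart", or "valid" (hypothetical labels).
    """
    if not response.get("valid") or response.get("errors"):
        return "rejected"
    if response.get("hot_reload_capability") == "requires_restart":
        return "valid_needs_restart"
    return "valid"


valid_response = {
    "valid": True,
    "errors": [],
    "warnings": [{"field": "bind_address", "message": "Changing bind_address requires server restart"}],
    "hot_reload_capability": "requires_restart",
}
invalid_response = {
    "valid": False,
    "errors": [{"field": "workers", "message": "workers must be greater than 0", "code": "VALIDATION_ERROR"}],
    "warnings": [],
}
print(validation_outcome(valid_response))
print(validation_outcome(invalid_response))
```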

Apply Configuration

Apply pending configuration changes immediately (trigger hot reload).

POST /admin/config/apply

Request Body

{
  "sections": ["logging", "rate_limiting"],
  "force": false
}

| Field | Type | Required | Description |
|---|---|---|---|
| sections | array | No | Specific sections to apply (default: all pending) |
| force | boolean | No | Force apply even with warnings (default: false) |

Response

{
  "success": true,
  "applied_sections": ["logging", "rate_limiting"],
  "version": 7,
  "results": {
    "logging": {
      "status": "applied",
      "hot_reload_type": "immediate"
    },
    "rate_limiting": {
      "status": "applied",
      "hot_reload_type": "immediate"
    }
  }
}

Example

curl -X POST http://localhost:8080/admin/config/apply \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "sections": ["logging"]
  }'

Configuration Save/Restore APIs

Export Configuration

Export current configuration in specified format.

POST /admin/config/export

Request Body

{
  "format": "yaml",
  "sections": ["server", "backends", "logging"],
  "include_sensitive": false,
  "include_defaults": true
}

| Field | Type | Required | Description |
|---|---|---|---|
| format | string | Yes | Output format: yaml, json, or toml |
| sections | array | No | Sections to export (default: all) |
| include_sensitive | boolean | No | Include unmasked sensitive data (default: false) |
| include_defaults | boolean | No | Include default values (default: true) |

Response

{
  "format": "yaml",
  "content": "server:\n  bind_address: \"0.0.0.0:8080\"\n  workers: 4\n\nbackends:\n  - name: openai\n    url: https://api.openai.com\n    api_key: \"sk-***abcd\"\n",
  "exported_at": "2025-12-13T10:30:00Z",
  "sections_exported": ["server", "backends", "logging"]
}

Example

# Export as YAML
curl -X POST http://localhost:8080/admin/config/export \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"format": "yaml"}' | jq -r '.content' > config-backup.yaml

# Export as JSON
curl -X POST http://localhost:8080/admin/config/export \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"format": "json"}' | jq -r '.content' > config-backup.json

# Export specific sections
curl -X POST http://localhost:8080/admin/config/export \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "format": "yaml",
    "sections": ["backends", "rate_limiting"]
  }'

Import Configuration

Import and apply configuration from content.

POST /admin/config/import

Request Body

{
  "format": "yaml",
  "content": "logging:\n  level: info\n  format: json\n",
  "apply": true,
  "dry_run": false,
  "merge": true
}

| Field | Type | Required | Description |
|---|---|---|---|
| format | string | Yes | Content format: yaml, json, or toml |
| content | string | Yes | Configuration content (max 1MB) |
| apply | boolean | No | Apply after validation (default: true) |
| dry_run | boolean | No | Validate only without applying (default: false) |
| merge | boolean | No | Merge with existing config (default: false) |

Response

{
  "success": true,
  "message": "Configuration imported and applied",
  "version": 8,
  "validation": {
    "valid": true,
    "errors": [],
    "warnings": []
  },
  "sections_imported": ["logging"],
  "applied": true
}

Example

# Import from file (jq -Rs reads the file as one raw string and JSON-escapes it)
curl -X POST http://localhost:8080/admin/config/import \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d "{
    \"format\": \"yaml\",
    \"content\": $(jq -Rs . < config-backup.yaml),
    \"apply\": true
  }"

# Dry run import
curl -X POST http://localhost:8080/admin/config/import \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "format": "yaml",
    "content": "logging:\n  level: debug\n",
    "dry_run": true
  }'
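Escaping file content by hand in shell is error-prone, so a client may prefer to build the import body with a JSON library. The sketch below does that and enforces the documented size limit before sending; treating "1MB" as 1,048,576 bytes is an assumption, and the function name is illustrative.

```python
import json

# Assumption: the documented "max 1MB" is read here as 1 MiB.
MAX_CONTENT_BYTES = 1024 * 1024


def build_import_payload(content, fmt="yaml", *, dry_run=False, merge=False):
    """Return the JSON request body for POST /admin/config/import."""
    if fmt not in ("yaml", "json", "toml"):
        raise ValueError(f"unsupported format: {fmt!r}")
    if len(content.encode("utf-8")) > MAX_CONTENT_BYTES:
        raise ValueError("content exceeds the 1MB import limit")
    return json.dumps(
        {"format": fmt, "content": content, "dry_run": dry_run, "merge": merge}
    )


print(build_import_payload("logging:\n  level: debug\n", dry_run=True))
```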

Get Configuration History

View configuration change history.

GET /admin/config/history

Query Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| limit | integer | No | Number of entries to return (default: 20, max: 100) |
| offset | integer | No | Number of entries to skip (default: 0) |
| section | string | No | Filter by section name |

Response

{
  "history": [
    {
      "version": 8,
      "timestamp": "2025-12-13T10:30:00Z",
      "sections_changed": ["logging"],
      "source": "api",
      "user": "admin",
      "description": "Updated logging level to debug",
      "rollback_available": true
    },
    {
      "version": 7,
      "timestamp": "2025-12-13T10:25:00Z",
      "sections_changed": ["rate_limiting"],
      "source": "api",
      "user": "admin",
      "description": "Increased rate limit to 200 rpm",
      "rollback_available": true
    },
    {
      "version": 6,
      "timestamp": "2025-12-13T09:00:00Z",
      "sections_changed": ["backends"],
      "source": "file_reload",
      "user": "system",
      "description": "Configuration file changed",
      "rollback_available": true
    }
  ],
  "total_entries": 8,
  "current_version": 8
}

Example

# Get recent history
curl -s http://localhost:8080/admin/config/history \
  -H "Authorization: Bearer $ADMIN_TOKEN" | jq

# Get history for specific section
curl -s "http://localhost:8080/admin/config/history?section=backends&limit=10" \
  -H "Authorization: Bearer $ADMIN_TOKEN" | jq
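Before calling the rollback endpoint, a client can pick candidate versions from the history response. This is a minimal sketch that lists versions older than `current_version` with `rollback_available` set, newest first; the helper name is illustrative.

```python
def rollback_candidates(history_response):
    """Return versions that can be rolled back to, newest first."""
    current = history_response["current_version"]
    versions = [
        entry["version"]
        for entry in history_response["history"]
        if entry["version"] < current and entry.get("rollback_available")
    ]
    return sorted(versions, reverse=True)


# Trimmed sample in the shape of the response shown above.
sample_history = {
    "history": [
        {"version": 8, "sections_changed": ["logging"], "rollback_available": True},
        {"version": 7, "sections_changed": ["rate_limiting"], "rollback_available": True},
        {"version": 6, "sections_changed": ["backends"], "rollback_available": True},
    ],
    "total_entries": 8,
    "current_version": 8,
}
print(rollback_candidates(sample_history))
```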

Rollback Configuration

Rollback to a previous configuration version.

POST /admin/config/rollback/{version}

Path Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| version | integer | Yes | Version number to roll back to |

Request Body

{
  "sections": ["logging", "rate_limiting"],
  "dry_run": false
}

| Field | Type | Required | Description |
|---|---|---|---|
| sections | array | No | Specific sections to roll back (default: all changed) |
| dry_run | boolean | No | Preview without applying (default: false) |

Response

{
  "success": true,
  "message": "Rolled back to version 5",
  "previous_version": 8,
  "new_version": 9,
  "sections_rolled_back": ["logging", "rate_limiting"],
  "changes": {
    "logging": {
      "level": {
        "from": "debug",
        "to": "info"
      }
    }
  }
}

Example

# Rollback to version 5
curl -X POST http://localhost:8080/admin/config/rollback/5 \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{}'

# Preview rollback (dry run)
curl -X POST http://localhost:8080/admin/config/rollback/5 \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"dry_run": true}'

Backend Management APIs

Add Backend

Add a new backend dynamically.

POST /admin/backends

Request Body

{
  "name": "new-ollama",
  "url": "http://192.168.1.100:11434",
  "weight": 1,
  "models": ["llama3.2", "mistral"],
  "api_key": "optional-key",
  "enabled": true,
  "health_check": {
    "enabled": true,
    "path": "/v1/models"
  }
}

| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Unique backend name (alphanumeric, -, _) |
| type | string | No | Backend type: openai, azure, vllm, ollama, anthropic, gemini, llamacpp, generic. Default: generic (auto-detect) |
| url | string | Yes | Backend URL (http:// or https://) |
| weight | integer | No | Load balancing weight (default: 1) |
| models | array | No | List of models served by this backend |
| api_key | string | No | API key for backend authentication |
| enabled | boolean | No | Whether backend is enabled (default: true) |
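The field constraints above can be checked client-side before the request is sent. The following is a sketch only: the server remains the authority on validation, and the 256-character limit mirrors the `max_backend_name_length` value shown in the admin configuration earlier.

```python
import re
from urllib.parse import urlparse

NAME_RE = re.compile(r"[A-Za-z0-9_-]+")
MAX_NAME_LENGTH = 256  # mirrors max_backend_name_length from the config example


def backend_payload_errors(payload):
    """Return a list of problems found in an Add Backend request body."""
    errors = []
    name = payload.get("name", "")
    if not NAME_RE.fullmatch(name):
        errors.append("name must be non-empty and contain only letters, digits, '-' or '_'")
    elif len(name) > MAX_NAME_LENGTH:
        errors.append("name is too long")
    parsed = urlparse(payload.get("url", ""))
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        errors.append("url must start with http:// or https://")
    weight = payload.get("weight", 1)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(weight, int) or isinstance(weight, bool):
        errors.append("weight must be an integer")
    return errors


print(backend_payload_errors({"name": "new-ollama", "url": "http://192.168.1.100:11434"}))
print(backend_payload_errors({"name": "bad name!", "url": "ftp://example"}))
```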

Backend Type Auto-Detection

When type is not specified or set to generic, the router automatically probes the backend's /v1/models endpoint to detect the backend type. Currently supports auto-detection of:

  • llama.cpp: Identified by owned_by: "llamacpp" or llama.cpp-specific metadata fields

This allows seamless integration of llama.cpp backends without explicit type configuration:

# llama.cpp backend - type auto-detected
# (this example assumes llama.cpp listens on port 8081, because the router itself uses 8080)
curl -X POST http://localhost:8080/admin/backends \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "local-llama",
    "url": "http://localhost:8081"
  }'
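The detection rule can be pictured with a small sketch. It assumes the backend returns an OpenAI-style `/v1/models` body with a `data` list, and it checks only the `owned_by` marker; the llama.cpp-specific metadata fields mentioned above are not covered, and the router's real implementation may differ.

```python
def detect_backend_type(models_response):
    """Guess the backend type from a /v1/models response body.

    Hypothetical helper: only the owned_by == "llamacpp" rule is modelled.
    """
    for model in models_response.get("data", []):
        if model.get("owned_by") == "llamacpp":
            return "llamacpp"
    return "generic"


llama_models = {"object": "list", "data": [{"id": "local-model", "owned_by": "llamacpp"}]}
other_models = {"object": "list", "data": [{"id": "gpt-4", "owned_by": "openai"}]}
print(detect_backend_type(llama_models))
print(detect_backend_type(other_models))
```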

Response

{
  "success": true,
  "message": "Backend 'new-ollama' added successfully",
  "backend": {
    "name": "new-ollama",
    "url": "http://192.168.1.100:11434",
    "weight": 1,
    "models": ["llama3.2", "mistral"],
    "enabled": true,
    "health_status": "unknown"
  }
}

Example

curl -X POST http://localhost:8080/admin/backends \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "new-backend",
    "url": "http://192.168.1.100:11434",
    "weight": 2,
    "models": ["llama3.2"]
  }'

Get Backend

Get configuration for a specific backend.

GET /admin/backends/{name}

Response

{
  "name": "openai",
  "url": "https://api.openai.com",
  "api_key": "sk-***abcd",
  "weight": 1,
  "models": ["gpt-4", "gpt-3.5-turbo"],
  "enabled": true,
  "health_status": "healthy",
  "stats": {
    "total_requests": 1250,
    "failed_requests": 12,
    "average_latency_ms": 150,
    "last_used": "2025-12-13T10:29:55Z"
  }
}

Example

curl -s http://localhost:8080/admin/backends/openai \
  -H "Authorization: Bearer $ADMIN_TOKEN" | jq

Update Backend

Update backend configuration.

PUT /admin/backends/{name}

Request Body

{
  "url": "https://api.openai.com",
  "weight": 2,
  "models": ["gpt-4", "gpt-4-turbo", "gpt-3.5-turbo"],
  "enabled": true
}

Response

{
  "success": true,
  "message": "Backend 'openai' updated successfully",
  "backend": {
    "name": "openai",
    "url": "https://api.openai.com",
    "weight": 2,
    "models": ["gpt-4", "gpt-4-turbo", "gpt-3.5-turbo"],
    "enabled": true
  }
}

Example

curl -X PUT http://localhost:8080/admin/backends/openai \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "weight": 3,
    "models": ["gpt-4", "gpt-4-turbo"]
  }'

Delete Backend

Remove a backend from the router.

DELETE /admin/backends/{name}

Query Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| force | boolean | No | Force delete even if backend has active connections |

Response

{
  "success": true,
  "message": "Backend 'old-backend' removed successfully",
  "removed_backend": "old-backend"
}

Notes

  • Deleting the last backend is allowed: The router can operate with zero backends configured. When the last backend is deleted:
    • /v1/models returns an empty list
    • Routing requests return 503 "No backends available"
    • New backends can be added via POST /admin/backends

Example

curl -X DELETE http://localhost:8080/admin/backends/old-backend \
  -H "Authorization: Bearer $ADMIN_TOKEN"

# Force delete
curl -X DELETE "http://localhost:8080/admin/backends/old-backend?force=true" \
  -H "Authorization: Bearer $ADMIN_TOKEN"

Update Backend Weight

Update only the backend weight for load balancing.

PUT /admin/backends/{name}/weight

Request Body

{
  "weight": 5
}

Response

{
  "success": true,
  "message": "Backend 'openai' weight updated to 5",
  "previous_weight": 2,
  "new_weight": 5
}

Example

curl -X PUT http://localhost:8080/admin/backends/openai/weight \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"weight": 5}'

Update Backend Models

Update the model list for a backend.

PUT /admin/backends/{name}/models

Request Body

{
  "models": ["gpt-4", "gpt-4-turbo", "gpt-4o", "gpt-3.5-turbo"],
  "append": false
}

| Field | Type | Required | Description |
|---|---|---|---|
| models | array | Yes | List of model names |
| append | boolean | No | Append to existing list (default: false, replaces) |

Response

{
  "success": true,
  "message": "Backend 'openai' models updated",
  "models": ["gpt-4", "gpt-4-turbo", "gpt-4o", "gpt-3.5-turbo"]
}

Example

# Replace models
curl -X PUT http://localhost:8080/admin/backends/openai/models \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"models": ["gpt-4", "gpt-4o"]}'

# Append models
curl -X PUT http://localhost:8080/admin/backends/openai/models \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"models": ["gpt-4.5-turbo"], "append": true}'
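The difference between replacing and appending can be sketched as follows. Skipping models that are already present when appending is an assumption; the document does not say how duplicates are handled.

```python
def updated_models(existing, models, append=False):
    """Return the model list after a PUT /admin/backends/{name}/models call."""
    if not append:
        # Default behaviour: the new list replaces the old one.
        return list(models)
    merged = list(existing)
    for model in models:
        if model not in merged:  # assumption: duplicates are not added twice
            merged.append(model)
    return merged


existing = ["gpt-4", "gpt-4o"]
print(updated_models(existing, ["gpt-4.5-turbo"], append=True))
print(updated_models(existing, ["gpt-4"]))
```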

Statistics APIs

The Statistics APIs expose aggregated request metrics collected by the StatsCollector. All four endpoints are mounted under /admin/stats and share the same authentication as the rest of the Admin API.

Stats collection is enabled by default. It can be configured or disabled via the admin.stats section in your YAML config:

admin:
  stats:
    enabled: true                # Enable/disable collection (default: true)
    retention_window: 24h        # Ring-buffer retention for windowed queries (default: 24h)
    token_tracking: true         # Parse response bodies for token usage (default: true)
    persistence:
      enabled: true              # Enable stats persistence across restarts (default: true)
      path: ./data/stats.json    # File path for the snapshot (default: ./data/stats.json)
      snapshot_interval: 5m      # How often to write periodic snapshots (default: 5m)
      max_age: 7d                # Discard snapshots older than this on startup (default: 7d)

The retention_window and token_tracking settings support hot-reload: changes are applied immediately without a restart.

Stats Persistence

When the persistence subsection is present and enabled is true, the router saves a statistics snapshot to disk periodically and restores it on startup. This ensures that request counters, per-model breakdowns, and the latency ring buffer survive restarts.

How it works:

  • On startup, the router reads the snapshot file and restores all counters and ring-buffer records. Uptime resets to zero on each restart.
  • A background task writes a new snapshot every snapshot_interval. Writes are atomic (temp file + rename) to prevent corruption.
  • On graceful shutdown (SIGTERM/SIGINT), a final snapshot is saved before the process exits.
  • If the snapshot file is missing, corrupted, or older than max_age, the router starts with fresh counters and logs a warning or info message.
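The "temp file + rename" pattern mentioned above can be illustrated in a few lines. This is a generic sketch of the technique, not the router's own code: the snapshot is written to a temporary file in the same directory and then moved into place, so readers never see a half-written file.

```python
import json
import os
import tempfile


def write_snapshot_atomically(path, snapshot):
    """Write snapshot as JSON to path using a temp file followed by a rename."""
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(snapshot, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace overwrites the destination in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "stats.json")
write_snapshot_atomically(demo_path, {"total_requests": 1500})
```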

Supported duration formats for snapshot_interval and max_age:

| Format | Example | Meaning |
|---|---|---|
| Xs | 30s | 30 seconds |
| Xm | 5m | 5 minutes |
| Xh | 1h | 1 hour |
| Xd | 7d | 7 days |

Set max_age to "0" or "" to disable staleness checks (always restore regardless of age).
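Tools that generate or lint this configuration may want to interpret these duration strings. The parser below is a minimal sketch of the formats in the table; returning `None` for "0" or "" reflects the `max_age` rule above, and the function name is illustrative.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text):
    """Convert strings such as "30s", "5m", "1h", "7d" to seconds.

    Returns None for "0" or "", which max_age uses to disable staleness checks.
    """
    if text in ("", "0"):
        return None
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


for value in ("30s", "5m", "1h", "7d"):
    print(value, parse_duration(value))
```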

Get Full Statistics

GET /admin/stats

Returns overall, per-model, and per-backend statistics.

Query Parameters

| Parameter | Type | Description |
|---|---|---|
| window | string | Optional time window filter. Accepted formats: 30m, 1h, 24h, 7d. Omit for all-time totals. |

Response

{
  "uptime_seconds": 3600,
  "window": "all",
  "overall": {
    "total_requests": 1500,
    "successful_requests": 1480,
    "failed_requests": 20,
    "avg_latency_ms": 145.3,
    "p50_latency_ms": 120.0,
    "p95_latency_ms": 380.0,
    "p99_latency_ms": 750.0,
    "total_prompt_tokens": 450000,
    "total_completion_tokens": 180000,
    "total_tokens": 630000,
    "tokens_per_sec_avg": 87.4
  },
  "models": [
    {
      "model_id": "gpt-4",
      "total_requests": 900,
      "successful_requests": 895,
      "failed_requests": 5,
      "total_prompt_tokens": 270000,
      "total_completion_tokens": 108000,
      "total_tokens": 378000,
      "avg_latency_ms": 160.2,
      "avg_tokens_per_sec": 92.1,
      "last_used": "2026-03-05T10:30:00Z"
    }
  ],
  "backends": [
    {
      "backend_name": "openai",
      "total_requests": 900,
      "successful_requests": 895,
      "failed_requests": 5,
      "avg_latency_ms": 160.2,
      "health_status": "healthy"
    }
  ]
}

Example

# All-time statistics
curl -s http://localhost:8080/admin/stats \
  -H "Authorization: Bearer $ADMIN_TOKEN" | jq

# Last hour only
curl -s "http://localhost:8080/admin/stats?window=1h" \
  -H "Authorization: Bearer $ADMIN_TOKEN" | jq

Get Per-Model Statistics

GET /admin/stats/models

Returns only the per-model breakdown (subset of the full stats response).

Response

{
  "models": [
    {
      "model_id": "gpt-4",
      "total_requests": 900,
      "successful_requests": 895,
      "failed_requests": 5,
      "total_prompt_tokens": 270000,
      "total_completion_tokens": 108000,
      "total_tokens": 378000,
      "avg_latency_ms": 160.2,
      "avg_tokens_per_sec": 92.1,
      "last_used": "2026-03-05T10:30:00Z"
    }
  ]
}

Models are sorted by total_requests in descending order.

Example

curl -s http://localhost:8080/admin/stats/models \
  -H "Authorization: Bearer $ADMIN_TOKEN" | jq '.models[].model_id'

Get Per-Backend Statistics

GET /admin/stats/backends

Returns only the per-backend breakdown. The health_status field is populated from the health checker ("healthy", "unhealthy", or "unknown" when health checks are disabled).

Response

{
  "backends": [
    {
      "backend_name": "openai",
      "total_requests": 900,
      "successful_requests": 895,
      "failed_requests": 5,
      "avg_latency_ms": 160.2,
      "health_status": "healthy"
    }
  ]
}

Backends are sorted by total_requests in descending order.

Example

curl -s http://localhost:8080/admin/stats/backends \
  -H "Authorization: Bearer $ADMIN_TOKEN" | jq

Reset Statistics

POST /admin/stats/reset

Resets all counters, per-model records, per-backend records, and the latency ring buffer. This action is irreversible.

Response

{
  "success": true,
  "action": "reset",
  "message": "Statistics counters have been reset"
}

Example

curl -X POST http://localhost:8080/admin/stats/reset \
  -H "Authorization: Bearer $ADMIN_TOKEN"

Response Cache Admin APIs

The Response Cache Admin APIs expose statistics and invalidation operations for the response cache. All endpoints are mounted under /admin/response-cache and require the same authentication as the rest of the Admin API.

Response caching is configured in the response_cache section of your YAML config. See the Response Cache Configuration guide for full configuration details.

Get Response Cache Statistics

GET /admin/response-cache/stats

Returns current response cache statistics including hit/miss counts, memory usage, and configuration summary.

Response

{
  "enabled": true,
  "backend_type": "memory",
  "entries": 42,
  "capacity": 1000,
  "requests": {
    "hit": 120,
    "miss": 80,
    "skip": 15,
    "total": 215
  },
  "hit_rate": "0.6000",
  "evictions": 3,
  "size_bytes": 1048576,
  "config": {
    "backend": "memory",
    "ttl": "5m",
    "capacity": 1000,
    "max_response_size": 1048576,
    "max_stream_buffer_size": 10485760
  }
}

When using the Redis backend (backend: redis), the response includes an additional redis object:

{
  "enabled": true,
  "backend_type": "redis",
  "entries": 42,
  "capacity": 1000,
  "requests": { "hit": 120, "miss": 80, "skip": 15, "total": 215 },
  "hit_rate": "0.6000",
  "evictions": 3,
  "size_bytes": 1048576,
  "config": { "backend": "redis", "ttl": "5m", "capacity": 1000, "max_response_size": 1048576, "max_stream_buffer_size": 10485760 },
  "redis": {
    "connections": { "active": 3, "idle": 5 },
    "errors": { "connection": 0, "timeout": 0, "other": 0, "total": 0 },
    "fallback_active": false
  }
}

When response caching is disabled (response_cache.enabled: false or the section is absent), enabled is false, entries and capacity are 0, and config is null.

Response Fields

| Field | Type | Description |
|---|---|---|
| enabled | boolean | Whether response caching is active |
| backend_type | string | Active cache backend: "memory" or "redis" |
| entries | integer | Current number of cached entries |
| capacity | integer | Maximum cache capacity (LRU limit) |
| requests.hit | integer | Requests served from cache |
| requests.miss | integer | Cache misses (backend was called, entry stored) |
| requests.skip | integer | Non-cacheable requests (e.g., temperature > 0) |
| requests.total | integer | Total requests seen by the cache (hit + miss + skip) |
| hit_rate | string | Rolling cache hit rate as a decimal string (e.g., "0.6000") |
| evictions | integer | Total LRU evictions since startup |
| size_bytes | integer | Approximate memory usage of cached entries in bytes |
| config | object or null | Active configuration summary; null when disabled |
| redis | object or absent | Redis-specific stats (only present when backend_type is "redis") |
| redis.connections.active | integer | Active connections in the Redis pool |
| redis.connections.idle | integer | Idle connections in the Redis pool |
| redis.errors.connection | integer | Redis connection errors since startup |
| redis.errors.timeout | integer | Redis command timeout errors since startup |
| redis.errors.other | integer | Other Redis errors since startup |
| redis.errors.total | integer | Total Redis errors since startup |
| redis.fallback_active | boolean | Whether the in-memory fallback is currently active |

Example

curl -s http://localhost:8080/admin/response-cache/stats \
  -H "Authorization: Bearer $ADMIN_TOKEN" | jq
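The sample numbers above are consistent with a hit rate computed as hit / (hit + miss), with skipped requests excluded: 120 / 200 gives "0.6000", whereas 120 / 215 would give roughly 0.5581. The sketch below recomputes the value on that assumption; the router's exact formula, including any rolling window, is not specified here.

```python
def hit_rate(requests):
    """Format hit / (hit + miss) as a four-decimal string.

    Assumption: skipped (non-cacheable) requests do not count as lookups.
    """
    lookups = requests["hit"] + requests["miss"]
    if lookups == 0:
        return "0.0000"
    return f"{requests['hit'] / lookups:.4f}"


sample_requests = {"hit": 120, "miss": 80, "skip": 15, "total": 215}
print(hit_rate(sample_requests))
```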

Invalidate Response Cache

POST /admin/response-cache/invalidate

Clears cache entries. Currently supports full cache invalidation via clear_all: true. Targeted invalidation by model or tenant is reserved for a future release.

Request Body

{
  "clear_all": true,
  "model": "gpt-4",
  "tenant_id": "tenant-abc"
}

| Field | Type | Required | Description |
|---|---|---|---|
| clear_all | boolean | No | When true, clears the entire cache. Defaults to false. |
| model | string | No | Reserved for future targeted invalidation. Must not exceed 256 characters. |
| tenant_id | string | No | Reserved for future targeted invalidation. Must not exceed 256 characters. |

Response (clear_all: true)

{
  "success": true,
  "action": "clear_all",
  "cleared_entries": 42
}

Response (clear_all: false or omitted)

{
  "success": true,
  "action": "noop",
  "message": "Targeted invalidation by model/tenant_id is not yet supported. Use clear_all: true to clear the entire cache."
}

Response (cache disabled)

{
  "success": false,
  "error": "Response cache is not enabled"
}

Example

# Clear entire cache
curl -X POST http://localhost:8080/admin/response-cache/invalidate \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"clear_all": true}'

KV Cache Index Admin APIs

The KV Cache Index Admin APIs expose statistics, per-backend state, and a clear operation for the KV cache index subsystem. All endpoints are mounted under /admin/kv-index and require the same authentication as the rest of the Admin API.

The KV cache index tracks which backends hold cached KV data for specific token prefixes, enabling KV-aware routing. It is configured in the kv_cache_index section of your YAML config.

Get KV Cache Index Statistics

GET /admin/kv-index/stats

Returns overall KV cache index statistics, including index size, event source connection status, and routing decision counts.

Response

{
  "enabled": true,
  "config": {
    "backend": "memory",
    "max_entries": 100000,
    "entry_ttl_seconds": 600,
    "event_sources_count": 2,
    "scoring": {
      "overlap_weight": 0.6,
      "load_weight": 0.3,
      "health_weight": 0.1,
      "min_overlap_threshold": 0.3
    }
  },
  "index": {
    "prefix_count": 45,
    "entry_count": 120,
    "total_hits": 3842,
    "total_evictions": 12
  },
  "event_sources": [
    {
      "backend_name": "vllm-1",
      "connected": true,
      "events_received": 2100,
      "events_dropped": 0,
      "last_event_at": "2025-03-12T10:45:00Z",
      "reconnect_count": 0
    }
  ],
  "routing_decisions": {
    "kv_aware": 980,
    "fallback": 120,
    "total": 1100
  },
  "query_latency_count": 1100,
  "overlap_score_count": 980
}

When the KV cache index is disabled (kv_cache_index.enabled: false or the section is absent), enabled is false, config is null, and all counters are 0.

Response Fields

| Field | Type | Description |
|-------|------|-------------|
| enabled | boolean | Whether the KV cache index is active |
| config | object or null | Active configuration summary; null when disabled |
| config.backend | string | Index backend: "memory" or "redis" |
| config.max_entries | integer | Maximum tracked prefix hash entries |
| config.entry_ttl_seconds | integer | TTL for index entries in seconds |
| config.event_sources_count | integer | Number of configured event sources |
| config.scoring | object | Scoring weight configuration |
| index.prefix_count | integer | Number of distinct prefix hashes tracked |
| index.entry_count | integer | Total (prefix, backend) pairs tracked |
| index.total_hits | integer | Total cache hit recordings since startup |
| index.total_evictions | integer | Total cache eviction recordings since startup |
| event_sources | array | Status of each event source consumer |
| event_sources[].connected | boolean | Whether the consumer is currently connected |
| event_sources[].events_received | integer | Total events received from this source |
| event_sources[].events_dropped | integer | Events dropped due to backpressure |
| event_sources[].reconnect_count | integer | Number of reconnect attempts since startup |
| routing_decisions.kv_aware | integer | Requests routed using KV-aware selection |
| routing_decisions.fallback | integer | Requests that fell back to the default strategy |
| routing_decisions.total | integer | Total routing decisions made |

Example

curl -s http://localhost:8080/admin/kv-index/stats \
  -H "Authorization: Bearer $ADMIN_TOKEN" | jq
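
For monitoring, the counters in this response are often reduced to a few derived values. The sketch below is one way to do that; `summarize_kv_stats` is a hypothetical helper, and the choice of a KV-aware ratio plus a list of disconnected sources is an illustration rather than something the router computes itself.

```python
from typing import Any, Dict, List


def summarize_kv_stats(stats: Dict[str, Any]) -> Dict[str, Any]:
    """Summarize a GET /admin/kv-index/stats payload."""
    if not stats.get("enabled"):
        # When disabled, config is null and all counters are 0.
        return {"enabled": False, "kv_aware_ratio": 0.0, "disconnected": []}

    decisions = stats.get("routing_decisions", {})
    total = decisions.get("total", 0)
    kv_aware = decisions.get("kv_aware", 0)
    # Guard against division by zero right after startup.
    ratio = kv_aware / total if total else 0.0

    disconnected: List[str] = [
        src.get("backend_name", "<unknown>")
        for src in stats.get("event_sources", [])
        if not src.get("connected")
    ]
    return {"enabled": True, "kv_aware_ratio": ratio, "disconnected": disconnected}
```

A falling ratio together with a non-empty `disconnected` list would suggest that routing is falling back because KV events are not arriving.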

Get Per-Backend KV Cache State

GET /admin/kv-index/backends

Returns per-backend KV cache event statistics, including events received, processed, dropped, connection status, and index event counts.

Response (enabled)

{
  "enabled": true,
  "backends": [
    {
      "backend_name": "vllm-1",
      "connection": {
        "connected": true,
        "reconnect_count": 0,
        "last_event_at": "2025-03-12T10:45:00Z"
      },
      "events": {
        "received": 2100,
        "dropped": 0,
        "index_created": 1950,
        "index_evicted": 150
      }
    },
    {
      "backend_name": "vllm-2",
      "connection": {
        "connected": false,
        "reconnect_count": 3,
        "last_event_at": null
      },
      "events": {
        "received": 0,
        "dropped": 0,
        "index_created": 0,
        "index_evicted": 0
      },
      "configured_endpoint": "ws://vllm-2:8000/v1/kv_events"
    }
  ]
}

Backends that appear in kv_cache_index.event_sources but have no active consumer yet are included with connected: false and a configured_endpoint field.

Response (disabled)

{
  "enabled": false,
  "backends": []
}

Response Fields

| Field | Type | Description |
|-------|------|-------------|
| enabled | boolean | Whether the KV cache index is active |
| backends[].backend_name | string | Backend identifier |
| backends[].connection.connected | boolean | Whether the event stream consumer is connected |
| backends[].connection.reconnect_count | integer | Reconnect attempts since startup |
| backends[].connection.last_event_at | string or null | ISO 8601 timestamp of the most recent event |
| backends[].events.received | integer | Total events received from this backend |
| backends[].events.dropped | integer | Events dropped due to backpressure |
| backends[].events.index_created | integer | Index entries created from events |
| backends[].events.index_evicted | integer | Index entries evicted from events |
| backends[].configured_endpoint | string | Configured endpoint URL (only present for inactive sources) |

Example

curl -s http://localhost:8080/admin/kv-index/backends \
  -H "Authorization: Bearer $ADMIN_TOKEN" | jq
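
The per-backend payload lends itself to a simple health check. The following sketch flags backends that are disconnected or have dropped events; `backends_needing_attention` is a hypothetical helper name, and the two conditions it checks are an assumption about what an operator would want to see, not a rule defined by the router.

```python
from typing import Any, Dict, List


def backends_needing_attention(payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    """List backends that are disconnected or have dropped events."""
    flagged: List[Dict[str, Any]] = []
    for backend in payload.get("backends", []):
        connection = backend.get("connection", {})
        events = backend.get("events", {})
        reasons: List[str] = []
        if not connection.get("connected"):
            reasons.append("disconnected")
        if events.get("dropped", 0) > 0:
            reasons.append("dropped events")
        if reasons:
            flagged.append({
                "backend_name": backend.get("backend_name"),
                "reasons": reasons,
                # Only present for sources without an active consumer.
                "configured_endpoint": backend.get("configured_endpoint"),
            })
    return flagged
```

For the example response above, only `vllm-2` would be reported, together with its `configured_endpoint`.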

Clear KV Cache Index

POST /admin/kv-index/clear

Clears all entries from the KV cache index. Intended for debugging and testing. In production the index rebuilds automatically from incoming KV events.

Response (success)

{
  "success": true,
  "entries_before_clear": 120,
  "cleared_entries": 45
}

entries_before_clear is the total (prefix, backend) pair count before clearing. cleared_entries is the number of prefix hash buckets removed. For the Redis backend, cleared_entries counts the number of Redis keys deleted; because each key has a TTL, any remaining keys expire automatically.

Response (disabled)

{
  "success": false,
  "error": "KV cache index is not enabled"
}

Example

curl -X POST http://localhost:8080/admin/kv-index/clear \
  -H "Authorization: Bearer $ADMIN_TOKEN"

Data Models

Configuration Sections

| Section | Description | Hot Reload |
|---------|-------------|------------|
| server | Bind address, workers, connection pool | Requires restart |
| backends | Backend URLs, weights, models | Gradual |
| health_checks | Intervals, thresholds | Gradual |
| logging | Log level, format, output | Immediate |
| retry | Max attempts, delays, backoff | Immediate |
| timeouts | Connect, request, idle timeouts | Gradual |
| rate_limiting | Limits, storage, whitelist | Immediate |
| circuit_breaker | Thresholds, recovery time | Immediate |
| global_prompts | System prompt injection | Immediate |
| fallback | Fallback chains, policies | Gradual |
| files | Files API settings | Gradual |
| api_keys | API key configuration | Immediate |
| metrics | Prometheus, labels | Gradual |
| admin | Admin API settings | Gradual |
| admin.stats | Stats collection settings | Immediate |
| routing | Model routing rules | Gradual |

Backend Object

{
  "name": "string",
  "url": "string (http:// or https://)",
  "api_key": "string (optional, masked in responses)",
  "weight": "integer (1-100)",
  "models": ["string"],
  "enabled": "boolean",
  "health_check": {
    "enabled": "boolean",
    "path": "string",
    "interval": "string (duration)"
  }
}
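
The constraints in this object (URL scheme, weight range, list of model names) can be checked locally before a backend is submitted. This is a minimal sketch, assuming that `name` and `url` are required and that `weight` and `models` are optional; `check_backend` is a hypothetical helper, and the server remains the authority on what is valid.

```python
from typing import Any, Dict, List
from urllib.parse import urlparse


def check_backend(backend: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a backend object (empty if none)."""
    problems: List[str] = []

    name = backend.get("name")
    if not isinstance(name, str) or not name:
        problems.append("name must be a non-empty string")

    url = backend.get("url")
    parsed = urlparse(url) if isinstance(url, str) else None
    if parsed is None or parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("url must start with http:// or https://")

    weight = backend.get("weight")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if weight is not None and (
        isinstance(weight, bool)
        or not isinstance(weight, int)
        or not 1 <= weight <= 100
    ):
        problems.append("weight must be an integer between 1 and 100")

    models = backend.get("models", [])
    if not isinstance(models, list) or not all(isinstance(m, str) for m in models):
        problems.append("models must be a list of strings")

    return problems
```

Returning a list of messages instead of raising on the first problem lets a caller report every issue at once.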

History Entry Object

{
  "version": "integer",
  "timestamp": "string (ISO 8601)",
  "sections_changed": ["string"],
  "source": "string (api|file_reload|initial|rollback)",
  "user": "string",
  "description": "string (optional)",
  "rollback_available": "boolean"
}

Validation Result Object

{
  "valid": "boolean",
  "errors": [
    {
      "field": "string",
      "message": "string",
      "code": "string"
    }
  ],
  "warnings": [
    {
      "field": "string",
      "message": "string"
    }
  ]
}

Hot Reload Behavior

Update Types

| Type | Behavior | Sections |
|------|----------|----------|
| Immediate | Applied instantly, no disruption | logging, rate_limiting, circuit_breaker, retry, global_prompts, api_keys, admin.stats |
| Gradual | Existing connections maintained, new connections use new config | backends, health_checks, timeouts, fallback, files, metrics, admin, routing |
| Requires Restart | Logged as warning, requires server restart | server.bind_address, server.workers |
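
A client that wants to warn users before a change can look up the update type for a section. The sketch below transcribes the Configuration Sections and Update Types tables from this document into a dictionary. `RELOAD_TYPES` and `reload_type` are hypothetical names, and falling back from a dotted key such as `server.workers` to its parent section is an assumption made for illustration.

```python
from typing import Dict

# Section -> update type, transcribed from the tables in this document.
RELOAD_TYPES: Dict[str, str] = {
    # Immediate: applied instantly, no disruption
    "logging": "immediate",
    "rate_limiting": "immediate",
    "circuit_breaker": "immediate",
    "retry": "immediate",
    "global_prompts": "immediate",
    "api_keys": "immediate",
    "admin.stats": "immediate",
    # Gradual: existing connections keep the old config
    "backends": "gradual",
    "health_checks": "gradual",
    "timeouts": "gradual",
    "fallback": "gradual",
    "files": "gradual",
    "metrics": "gradual",
    "admin": "gradual",
    "routing": "gradual",
    # Requires restart
    "server": "requires_restart",
}


def reload_type(section: str) -> str:
    """Look up the update type, trying the most specific dotted key first."""
    key = section
    while key:
        if key in RELOAD_TYPES:
            return RELOAD_TYPES[key]
        # Drop the last dotted component and try the parent section.
        key = key.rpartition(".")[0]
    raise KeyError(f"unknown configuration section: {section}")
```

With this lookup, `admin.stats` resolves to its own entry while `server.workers` resolves through `server`.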

Example Workflow

# 1. Check current configuration
curl -s http://localhost:8080/admin/config/logging \
  -H "Authorization: Bearer $ADMIN_TOKEN" | jq

# 2. Validate change
curl -X POST http://localhost:8080/admin/config/validate \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"section": "logging", "config": {"level": "debug"}}'

# 3. Apply change (immediate effect)
curl -X PATCH http://localhost:8080/admin/config/logging \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"config": {"level": "debug"}}'

# 4. Verify change
curl -s http://localhost:8080/admin/config/logging \
  -H "Authorization: Bearer $ADMIN_TOKEN" | jq '.config.level'

Error Handling

Error Response Format

{
  "error_code": "string",
  "message": "string",
  "details": {}
}

Error Codes

| Code | HTTP Status | Description |
|------|-------------|-------------|
| VALIDATION_ERROR | 400 | Configuration validation failed |
| INVALID_SECTION | 400 | Unknown configuration section |
| PARSE_ERROR | 400 | Failed to parse configuration content |
| SECTION_NOT_FOUND | 404 | Section not found |
| VERSION_NOT_FOUND | 404 | History version not found |
| BACKEND_NOT_FOUND | 404 | Backend not found |
| BACKEND_EXISTS | 409 | Backend with name already exists |
| CONTENT_TOO_LARGE | 413 | Configuration content exceeds 1MB limit |
| INTERNAL_ERROR | 500 | Internal server error |

Error Examples

// Validation Error
{
  "error_code": "VALIDATION_ERROR",
  "message": "Configuration validation failed",
  "details": {
    "errors": [
      {"field": "workers", "message": "workers must be greater than 0"}
    ]
  }
}

// Section Not Found
{
  "error_code": "SECTION_NOT_FOUND",
  "message": "Configuration section 'invalid' not found",
  "details": {
    "available_sections": ["server", "backends", "logging", "..."]
  }
}

// Backend Exists
{
  "error_code": "BACKEND_EXISTS",
  "message": "Backend 'openai' already exists",
  "details": {
    "existing_backend": "openai"
  }
}
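
Client code can map this error format onto an exception type so that callers branch on `error_code` rather than on message text. The following is a sketch under stated assumptions: `AdminApiError`, `parse_error`, and `validation_messages` are hypothetical names, and the `"UNKNOWN"` fallback code is invented here for bodies that lack `error_code`.

```python
from typing import Any, Dict, List, Optional


class AdminApiError(Exception):
    """Exception carrying the fields of the documented error response."""

    def __init__(
        self,
        status: int,
        error_code: str,
        message: str,
        details: Optional[Dict[str, Any]] = None,
    ) -> None:
        super().__init__(f"{error_code} (HTTP {status}): {message}")
        self.status = status
        self.error_code = error_code
        self.message = message
        self.details = details or {}


def parse_error(status: int, body: Dict[str, Any]) -> AdminApiError:
    """Build an AdminApiError from an HTTP status and a decoded JSON body."""
    return AdminApiError(
        status,
        body.get("error_code", "UNKNOWN"),  # hypothetical fallback code
        body.get("message", ""),
        body.get("details"),
    )


def validation_messages(err: AdminApiError) -> List[str]:
    """Flatten details.errors of a VALIDATION_ERROR into 'field: message' lines."""
    if err.error_code != "VALIDATION_ERROR":
        return []
    return [
        f"{item.get('field')}: {item.get('message')}"
        for item in err.details.get("errors", [])
    ]
```

A caller could then catch `AdminApiError` and, for example, treat `BACKEND_EXISTS` as a signal to update an existing backend instead of failing.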

Client SDK Examples

Python

import requests
from typing import Optional, Dict, Any, List
from dataclasses import dataclass


@dataclass
class ContinuumAdminClient:
    """Continuum Router Admin API Client"""

    base_url: str
    token: str

    def __post_init__(self):
        self.session = requests.Session()
        self.session.headers.update({
            "Authorization": f"Bearer {self.token}",
            "Content-Type": "application/json"
        })

    # Configuration Query APIs

    def get_full_config(self) -> Dict[str, Any]:
        """Get full configuration with masked sensitive data"""
        resp = self.session.get(f"{self.base_url}/admin/config/full")
        resp.raise_for_status()
        return resp.json()

    def get_sections(self) -> List[Dict[str, Any]]:
        """Get all configuration sections"""
        resp = self.session.get(f"{self.base_url}/admin/config/sections")
        resp.raise_for_status()
        return resp.json()["sections"]

    def get_section(self, section: str) -> Dict[str, Any]:
        """Get configuration for a specific section"""
        resp = self.session.get(f"{self.base_url}/admin/config/{section}")
        resp.raise_for_status()
        return resp.json()

    def get_schema(self, section: Optional[str] = None) -> Dict[str, Any]:
        """Get JSON schema for validation"""
        params = {"section": section} if section else {}
        resp = self.session.get(
            f"{self.base_url}/admin/config/schema",
            params=params
        )
        resp.raise_for_status()
        return resp.json()

    # Configuration Modification APIs

    def update_section(self, section: str, config: Dict[str, Any]) -> Dict[str, Any]:
        """Replace section configuration"""
        resp = self.session.put(
            f"{self.base_url}/admin/config/{section}",
            json={"config": config}
        )
        resp.raise_for_status()
        return resp.json()

    def patch_section(self, section: str, config: Dict[str, Any]) -> Dict[str, Any]:
        """Partial update section configuration"""
        resp = self.session.patch(
            f"{self.base_url}/admin/config/{section}",
            json={"config": config}
        )
        resp.raise_for_status()
        return resp.json()

    def validate_config(
        self,
        section: str,
        config: Dict[str, Any],
        dry_run: bool = True
    ) -> Dict[str, Any]:
        """Validate configuration without applying"""
        resp = self.session.post(
            f"{self.base_url}/admin/config/validate",
            json={"section": section, "config": config, "dry_run": dry_run}
        )
        resp.raise_for_status()
        return resp.json()

    def apply_config(
        self,
        sections: Optional[List[str]] = None,
        force: bool = False
    ) -> Dict[str, Any]:
        """Apply pending configuration changes"""
        body = {"force": force}
        if sections:
            body["sections"] = sections
        resp = self.session.post(
            f"{self.base_url}/admin/config/apply",
            json=body
        )
        resp.raise_for_status()
        return resp.json()

    # Configuration Save/Restore APIs

    def export_config(
        self,
        format: str = "yaml",
        sections: Optional[List[str]] = None,
        include_sensitive: bool = False
    ) -> str:
        """Export configuration in specified format"""
        body = {"format": format, "include_sensitive": include_sensitive}
        if sections:
            body["sections"] = sections
        resp = self.session.post(
            f"{self.base_url}/admin/config/export",
            json=body
        )
        resp.raise_for_status()
        return resp.json()["content"]

    def import_config(
        self,
        content: str,
        format: str = "yaml",
        apply: bool = True,
        dry_run: bool = False
    ) -> Dict[str, Any]:
        """Import configuration from content"""
        resp = self.session.post(
            f"{self.base_url}/admin/config/import",
            json={
                "format": format,
                "content": content,
                "apply": apply,
                "dry_run": dry_run
            }
        )
        resp.raise_for_status()
        return resp.json()

    def get_history(
        self,
        limit: int = 20,
        offset: int = 0,
        section: Optional[str] = None
    ) -> Dict[str, Any]:
        """Get configuration change history"""
        params = {"limit": limit, "offset": offset}
        if section:
            params["section"] = section
        resp = self.session.get(
            f"{self.base_url}/admin/config/history",
            params=params
        )
        resp.raise_for_status()
        return resp.json()

    def rollback(
        self,
        version: int,
        sections: Optional[List[str]] = None,
        dry_run: bool = False
    ) -> Dict[str, Any]:
        """Rollback to a previous version"""
        body = {"dry_run": dry_run}
        if sections:
            body["sections"] = sections
        resp = self.session.post(
            f"{self.base_url}/admin/config/rollback/{version}",
            json=body
        )
        resp.raise_for_status()
        return resp.json()

    # Backend Management APIs

    def list_backends(self) -> List[Dict[str, Any]]:
        """List all backends"""
        resp = self.session.get(f"{self.base_url}/admin/backends")
        resp.raise_for_status()
        return resp.json()["backends"]

    def get_backend(self, name: str) -> Dict[str, Any]:
        """Get backend configuration"""
        resp = self.session.get(f"{self.base_url}/admin/backends/{name}")
        resp.raise_for_status()
        return resp.json()

    def add_backend(
        self,
        name: str,
        url: str,
        weight: int = 1,
        models: Optional[List[str]] = None
    ) -> Dict[str, Any]:
        """Add a new backend"""
        body = {"name": name, "url": url, "weight": weight}
        if models:
            body["models"] = models
        resp = self.session.post(
            f"{self.base_url}/admin/backends",
            json=body
        )
        resp.raise_for_status()
        return resp.json()

    def update_backend(self, name: str, **kwargs) -> Dict[str, Any]:
        """Update backend configuration"""
        resp = self.session.put(
            f"{self.base_url}/admin/backends/{name}",
            json=kwargs
        )
        resp.raise_for_status()
        return resp.json()

    def delete_backend(self, name: str, force: bool = False) -> Dict[str, Any]:
        """Delete a backend"""
        params = {"force": str(force).lower()} if force else {}
        resp = self.session.delete(
            f"{self.base_url}/admin/backends/{name}",
            params=params
        )
        resp.raise_for_status()
        return resp.json()

    def update_backend_weight(self, name: str, weight: int) -> Dict[str, Any]:
        """Update backend weight"""
        resp = self.session.put(
            f"{self.base_url}/admin/backends/{name}/weight",
            json={"weight": weight}
        )
        resp.raise_for_status()
        return resp.json()

    def update_backend_models(
        self,
        name: str,
        models: List[str],
        append: bool = False
    ) -> Dict[str, Any]:
        """Update backend models"""
        resp = self.session.put(
            f"{self.base_url}/admin/backends/{name}/models",
            json={"models": models, "append": append}
        )
        resp.raise_for_status()
        return resp.json()


# Usage Example
if __name__ == "__main__":
    client = ContinuumAdminClient(
        base_url="http://localhost:8080",
        token="your-admin-token"
    )

    # Get current logging config
    logging_config = client.get_section("logging")
    print(f"Current log level: {logging_config['config']['level']}")

    # Update logging level
    result = client.patch_section("logging", {"level": "debug"})
    print(f"Updated: {result['success']}")

    # Add a new backend
    client.add_backend(
        name="new-ollama",
        url="http://192.168.1.100:11434",
        weight=2,
        models=["llama3.2", "mistral"]
    )

    # Export configuration backup
    backup = client.export_config(format="yaml")
    with open("config-backup.yaml", "w") as f:
        f.write(backup)

JavaScript/TypeScript

interface ConfigSection {
  name: string;
  config: Record<string, any>;
  hot_reload_capability: 'immediate' | 'gradual' | 'requires_restart';
}

interface HistoryEntry {
  version: number;
  timestamp: string;
  sections_changed: string[];
  source: string;
  user: string;
}

interface Backend {
  name: string;
  url: string;
  weight: number;
  models: string[];
  enabled: boolean;
  health_status: string;
}

class ContinuumAdminClient {
  private baseUrl: string;
  private token: string;

  constructor(baseUrl: string, token: string) {
    this.baseUrl = baseUrl;
    this.token = token;
  }

  private async request<T>(
    method: string,
    path: string,
    body?: any,
    params?: Record<string, string>
  ): Promise<T> {
    const url = new URL(`${this.baseUrl}${path}`);
    if (params) {
      Object.entries(params).forEach(([k, v]) => url.searchParams.set(k, v));
    }

    const response = await fetch(url.toString(), {
      method,
      headers: {
        'Authorization': `Bearer ${this.token}`,
        'Content-Type': 'application/json',
      },
      body: body ? JSON.stringify(body) : undefined,
    });

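    if (!response.ok) {
      // Error bodies are expected to be JSON; fall back to the status code if parsing fails.
      const error = await response.json().catch(() => ({}));
      throw new Error(error.message || `HTTP ${response.status}`);
    }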

    return response.json();
  }

  // Configuration Query APIs

  async getFullConfig(): Promise<any> {
    return this.request('GET', '/admin/config/full');
  }

  async getSections(): Promise<ConfigSection[]> {
    const result = await this.request<{ sections: ConfigSection[] }>(
      'GET', '/admin/config/sections'
    );
    return result.sections;
  }

  async getSection(section: string): Promise<ConfigSection> {
    return this.request('GET', `/admin/config/${section}`);
  }

  async getSchema(section?: string): Promise<any> {
    const params = section ? { section } : undefined;
    return this.request('GET', '/admin/config/schema', undefined, params);
  }

  // Configuration Modification APIs

  async updateSection(section: string, config: Record<string, any>): Promise<any> {
    return this.request('PUT', `/admin/config/${section}`, { config });
  }

  async patchSection(section: string, config: Record<string, any>): Promise<any> {
    return this.request('PATCH', `/admin/config/${section}`, { config });
  }

  async validateConfig(
    section: string,
    config: Record<string, any>,
    dryRun: boolean = true
  ): Promise<any> {
    return this.request('POST', '/admin/config/validate', {
      section,
      config,
      dry_run: dryRun,
    });
  }

  async applyConfig(sections?: string[], force: boolean = false): Promise<any> {
    return this.request('POST', '/admin/config/apply', { sections, force });
  }

  // Configuration Save/Restore APIs

  async exportConfig(
    format: 'yaml' | 'json' | 'toml' = 'yaml',
    sections?: string[],
    includeSensitive: boolean = false
  ): Promise<string> {
    const result = await this.request<{ content: string }>(
      'POST', '/admin/config/export',
      { format, sections, include_sensitive: includeSensitive }
    );
    return result.content;
  }

  async importConfig(
    content: string,
    format: 'yaml' | 'json' | 'toml' = 'yaml',
    apply: boolean = true,
    dryRun: boolean = false
  ): Promise<any> {
    return this.request('POST', '/admin/config/import', {
      format,
      content,
      apply,
      dry_run: dryRun,
    });
  }

  async getHistory(
    limit: number = 20,
    offset: number = 0,
    section?: string
  ): Promise<{ history: HistoryEntry[]; total_entries: number }> {
    const params: Record<string, string> = {
      limit: limit.toString(),
      offset: offset.toString(),
    };
    if (section) params.section = section;
    return this.request('GET', '/admin/config/history', undefined, params);
  }

  async rollback(
    version: number,
    sections?: string[],
    dryRun: boolean = false
  ): Promise<any> {
    return this.request('POST', `/admin/config/rollback/${version}`, {
      sections,
      dry_run: dryRun,
    });
  }

  // Backend Management APIs

  async listBackends(): Promise<Backend[]> {
    const result = await this.request<{ backends: Backend[] }>(
      'GET', '/admin/backends'
    );
    return result.backends;
  }

  async getBackend(name: string): Promise<Backend> {
    return this.request('GET', `/admin/backends/${name}`);
  }

  async addBackend(
    name: string,
    url: string,
    weight: number = 1,
    models?: string[]
  ): Promise<any> {
    return this.request('POST', '/admin/backends', {
      name,
      url,
      weight,
      models,
    });
  }

  async updateBackend(name: string, updates: Partial<Backend>): Promise<any> {
    return this.request('PUT', `/admin/backends/${name}`, updates);
  }

  async deleteBackend(name: string, force: boolean = false): Promise<any> {
    const params = force ? { force: 'true' } : undefined;
    return this.request('DELETE', `/admin/backends/${name}`, undefined, params);
  }

  async updateBackendWeight(name: string, weight: number): Promise<any> {
    return this.request('PUT', `/admin/backends/${name}/weight`, { weight });
  }

  async updateBackendModels(
    name: string,
    models: string[],
    append: boolean = false
  ): Promise<any> {
    return this.request('PUT', `/admin/backends/${name}/models`, {
      models,
      append,
    });
  }
}

// Usage Example
async function main() {
  const client = new ContinuumAdminClient(
    'http://localhost:8080',
    'your-admin-token'
  );

  // Get current logging config
  const loggingConfig = await client.getSection('logging');
  console.log(`Current log level: ${loggingConfig.config.level}`);

  // Update logging level
  const result = await client.patchSection('logging', { level: 'debug' });
  console.log(`Updated: ${result.success}`);

  // Add a new backend
  await client.addBackend('new-ollama', 'http://192.168.1.100:11434', 2, [
    'llama3.2',
    'mistral',
  ]);

  // Export configuration backup
  const backup = await client.exportConfig('yaml');
  console.log('Configuration exported');
}

main().catch(console.error);

Go

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
    "net/url"
)

type ContinuumAdminClient struct {
    BaseURL string
    Token   string
    client  *http.Client
}

func NewClient(baseURL, token string) *ContinuumAdminClient {
    return &ContinuumAdminClient{
        BaseURL: baseURL,
        Token:   token,
        client:  &http.Client{},
    }
}

func (c *ContinuumAdminClient) request(method, path string, body interface{}) (map[string]interface{}, error) {
    var reqBody io.Reader
    if body != nil {
        jsonBody, err := json.Marshal(body)
        if err != nil {
            return nil, err
        }
        reqBody = bytes.NewBuffer(jsonBody)
    }

    req, err := http.NewRequest(method, c.BaseURL+path, reqBody)
    if err != nil {
        return nil, err
    }

    req.Header.Set("Authorization", "Bearer "+c.Token)
    req.Header.Set("Content-Type", "application/json")

    resp, err := c.client.Do(req)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()

    var result map[string]interface{}
    if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
        return nil, err
    }

    if resp.StatusCode >= 400 {
        return nil, fmt.Errorf("HTTP %d: %v", resp.StatusCode, result)
    }

    return result, nil
}

// GetFullConfig retrieves the full configuration
func (c *ContinuumAdminClient) GetFullConfig() (map[string]interface{}, error) {
    return c.request("GET", "/admin/config/full", nil)
}

// GetSection retrieves a specific configuration section
func (c *ContinuumAdminClient) GetSection(section string) (map[string]interface{}, error) {
    return c.request("GET", "/admin/config/"+section, nil)
}

// PatchSection partially updates a configuration section
func (c *ContinuumAdminClient) PatchSection(section string, config map[string]interface{}) (map[string]interface{}, error) {
    return c.request("PATCH", "/admin/config/"+section, map[string]interface{}{
        "config": config,
    })
}

// AddBackend adds a new backend
func (c *ContinuumAdminClient) AddBackend(name, backendURL string, weight int, models []string) (map[string]interface{}, error) {
    return c.request("POST", "/admin/backends", map[string]interface{}{
        "name":   name,
        "url":    backendURL,
        "weight": weight,
        "models": models,
    })
}

// ExportConfig exports configuration in the specified format
func (c *ContinuumAdminClient) ExportConfig(format string) (string, error) {
    result, err := c.request("POST", "/admin/config/export", map[string]interface{}{
        "format": format,
    })
    if err != nil {
        return "", err
    }
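    // Use the comma-ok form so a missing or non-string field does not panic.
    content, ok := result["content"].(string)
    if !ok {
        return "", fmt.Errorf("export response has no string \"content\" field")
    }
    return content, nil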
}

// GetHistory retrieves configuration change history
func (c *ContinuumAdminClient) GetHistory(limit int) (map[string]interface{}, error) {
    // Build only the query string; request() already prepends BaseURL to the path.
    q := url.Values{}
    q.Set("limit", fmt.Sprintf("%d", limit))

    return c.request("GET", "/admin/config/history?"+q.Encode(), nil)
}

func main() {
    client := NewClient("http://localhost:8080", "your-admin-token")

    // Get current logging config
    config, _ := client.GetSection("logging")
    fmt.Printf("Current config: %v\n", config)

    // Update logging level
    result, _ := client.PatchSection("logging", map[string]interface{}{
        "level": "debug",
    })
    fmt.Printf("Update result: %v\n", result)

    // Add a new backend
    client.AddBackend("new-ollama", "http://192.168.1.100:11434", 2, []string{"llama3.2"})

    // Export configuration
    backup, _ := client.ExportConfig("yaml")
    fmt.Println("Configuration exported")
    fmt.Println(backup)
}

Best Practices

1. Always Validate Before Applying

# Step 1: Validate
curl -X POST http://localhost:8080/admin/config/validate \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"section": "logging", "config": {"level": "debug"}}'

# Step 2: Apply only if valid
curl -X PATCH http://localhost:8080/admin/config/logging \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"config": {"level": "debug"}}'
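
The same two steps can be wrapped in one helper so that a change is never applied after a failed validation. This is a minimal sketch, assuming the validate endpoint returns the Validation Result Object described earlier with a top-level `valid` field. `validate_then_patch` is a hypothetical function, and it accepts any client that offers `validate_config` and `patch_section` methods like the Python client in this document.

```python
from typing import Any, Dict, Protocol


class SupportsValidateAndPatch(Protocol):
    """Structural type for the two client methods this helper relies on."""

    def validate_config(self, section: str, config: Dict[str, Any]) -> Dict[str, Any]: ...

    def patch_section(self, section: str, config: Dict[str, Any]) -> Dict[str, Any]: ...


def validate_then_patch(
    client: SupportsValidateAndPatch,
    section: str,
    config: Dict[str, Any],
) -> Dict[str, Any]:
    """Validate a change and apply it only when validation passes."""
    result = client.validate_config(section, config)
    if not result.get("valid", False):
        errors = result.get("errors", [])
        summary = "; ".join(
            f"{e.get('field')}: {e.get('message')}" for e in errors
        )
        raise ValueError(
            f"validation failed for {section}: {summary or 'no details'}"
        )
    # Validation passed, so the PATCH is sent with the same payload.
    return client.patch_section(section, config)
```

Sending the identical payload to both calls avoids the case where the validated and applied configurations drift apart.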

2. Use Dry Run for Imports

# Preview import changes
curl -X POST http://localhost:8080/admin/config/import \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "format": "yaml",
    "content": "...",
    "dry_run": true
  }'

3. Regular Configuration Backups

#!/bin/bash
# Daily backup script
DATE=$(date +%Y%m%d)
curl -s -X POST http://localhost:8080/admin/config/export \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"format": "yaml"}' | jq -r '.content' > "config-backup-$DATE.yaml"

4. Monitor Configuration History

# Check recent changes
curl -s "http://localhost:8080/admin/config/history?limit=5" \
  -H "Authorization: Bearer $TOKEN" | jq '.history[] | {version, timestamp, sections_changed}'
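
Once the history has been fetched, it can be filtered in code, for example to find candidate versions for a rollback of one section. The sketch below assumes each item in the `history` array follows the History Entry Object described earlier; `versions_touching` is a hypothetical helper.

```python
from typing import Any, Dict, List


def versions_touching(
    history: List[Dict[str, Any]],
    section: str,
    rollback_only: bool = False,
) -> List[int]:
    """Return versions that changed `section`, newest first."""
    versions = [
        entry["version"]
        for entry in history
        if section in entry.get("sections_changed", [])
        and (not rollback_only or entry.get("rollback_available", False))
    ]
    return sorted(versions, reverse=True)
```

Passing `rollback_only=True` narrows the result to entries whose `rollback_available` flag is set.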

5. Use Partial Updates (PATCH) for Minimal Changes

# Only update what's needed
curl -X PATCH http://localhost:8080/admin/config/rate_limiting \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"config": {"requests_per_minute": 200}}'

6. Test Configuration Changes in Staging First

# Example: Test configuration in staging before production
staging_client = ContinuumAdminClient("http://staging:8080", staging_token)
production_client = ContinuumAdminClient("http://production:8080", prod_token)

# Apply to staging first
staging_client.patch_section("rate_limiting", {"requests_per_minute": 500})

# Verify in staging
staging_config = staging_client.get_section("rate_limiting")
assert staging_config["config"]["requests_per_minute"] == 500

# Then apply to production
production_client.patch_section("rate_limiting", {"requests_per_minute": 500})

Security Considerations

1. Sensitive Data Handling

  • All API responses automatically mask sensitive fields (API keys, passwords, tokens)
  • Use include_sensitive: true in export only when absolutely necessary
  • Audit logs record when sensitive data is accessed

2. Authentication Best Practices

admin:
  auth:
    method: bearer_token
    token: "${ADMIN_TOKEN}"  # Use environment variables

  # Restrict access by IP
  ip_whitelist:
    - "10.0.0.0/8"      # Internal network only
    - "192.168.1.0/24"  # Office network

3. Audit Logging

All configuration changes are logged with:

  • Timestamp
  • User/source
  • Changed sections
  • Previous and new values (sensitive data masked)

4. Rate Limiting Admin Endpoints

Consider rate limiting admin endpoints to prevent abuse:

admin:
  rate_limit:
    requests_per_minute: 60
    burst: 10

5. Backup Before Major Changes

# Always backup before major changes
backup=$(curl -s -X POST http://localhost:8080/admin/config/export \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"format": "yaml"}' | jq -r '.content')

# Make changes...

# Restore if needed
curl -X POST http://localhost:8080/admin/config/import \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d "{\"format\": \"yaml\", \"content\": $(echo "$backup" | jq -Rs .)}"

Prompt File Management APIs

The Prompt File Management API allows you to manage system prompts stored in external Markdown files. This enables centralized management of system prompts without modifying the main configuration file.

List All Prompts

Get a list of all configured prompts with their sources and content.

GET /admin/config/prompts

Response

{
  "prompts": [
    {
      "id": "default",
      "prompt_type": "default",
      "source": "file",
      "file_path": "prompts/system.md",
      "content": "# System Prompt\n\nYou are a helpful assistant...",
      "loaded": true,
      "size_bytes": 1024
    },
    {
      "id": "anthropic",
      "prompt_type": "backend",
      "source": "file",
      "file_path": "prompts/anthropic.md",
      "content": "# Anthropic-specific prompt...",
      "loaded": true,
      "size_bytes": 512
    },
    {
      "id": "gpt-4",
      "prompt_type": "model",
      "source": "inline",
      "content": "You are GPT-4...",
      "size_bytes": 256
    }
  ],
  "total": 3,
  "prompts_directory": "./prompts"
}

Example

curl -s http://localhost:8080/admin/config/prompts \
  -H "Authorization: Bearer $ADMIN_TOKEN" | jq
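Once the listing is fetched, a client may want a quick overview of it. The Python sketch below counts prompts by `prompt_type` and `source` and flags file-backed prompts whose `loaded` flag is not true. The function name `summarize_prompts` is hypothetical, and treating a missing `loaded` field on a file-backed prompt as "not loaded" is an assumption.

```python
from collections import Counter

def summarize_prompts(response: dict) -> dict:
    """Summarize a GET /admin/config/prompts response body."""
    prompts = response.get("prompts", [])
    return {
        "by_type": dict(Counter(p["prompt_type"] for p in prompts)),
        "by_source": dict(Counter(p["source"] for p in prompts)),
        # Only file-backed prompts carry a meaningful "loaded" flag in the sample.
        "not_loaded": [
            p["id"] for p in prompts
            if p["source"] == "file" and not p.get("loaded", False)
        ],
    }

# Trimmed version of the sample response above.
sample = {
    "prompts": [
        {"id": "default", "prompt_type": "default", "source": "file", "loaded": True},
        {"id": "anthropic", "prompt_type": "backend", "source": "file", "loaded": True},
        {"id": "gpt-4", "prompt_type": "model", "source": "inline"},
    ]
}
print(summarize_prompts(sample))
```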

Get Prompt File

Get the content of a specific prompt file.

GET /admin/config/prompts/{path}

Path Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| path | string | Yes | Relative path to the prompt file |

Response

{
  "path": "prompts/system.md",
  "content": "# System Prompt\n\nYou are a helpful assistant that follows company policies...",
  "size_bytes": 1024,
  "modified_at": 1702468200
}

Example

curl -s http://localhost:8080/admin/config/prompts/prompts/system.md \
  -H "Authorization: Bearer $ADMIN_TOKEN" | jq
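The `modified_at` value in the sample response looks like a Unix timestamp in seconds, although the reference does not state the unit. Assuming that is the case, this short Python sketch converts it to a readable UTC time. The helper name `modified_at_utc` is hypothetical.

```python
from datetime import datetime, timezone

def modified_at_utc(epoch_seconds: int) -> str:
    """Convert an epoch-seconds value to an ISO 8601 UTC string."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).isoformat()

print(modified_at_utc(1702468200))  # 2023-12-13T11:50:00+00:00
```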

Update Prompt File

Create or update a prompt file with new content.

PUT /admin/config/prompts/{path}

Request Body

{
  "content": "# Updated System Prompt\n\nYou are a helpful assistant that follows all company policies.\n\n## Security Guidelines\n\n- Never reveal internal system details\n- Follow data privacy regulations"
}

Response

{
  "success": true,
  "path": "prompts/system.md",
  "size_bytes": 245,
  "message": "Prompt file updated successfully"
}

Example

curl -X PUT http://localhost:8080/admin/config/prompts/prompts/system.md \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "# System Prompt\n\nYou are a helpful assistant."
  }'
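Hand-escaping Markdown inside a shell string becomes error-prone for longer prompts. As an alternative, the Python sketch below builds the same PUT request with the standard library and lets `json.dumps` handle the escaping. The function name `build_update_request` and the token value are placeholders. The request is only constructed, not sent, so the sketch runs without a server.

```python
import json
import urllib.parse
import urllib.request

def build_update_request(base_url: str, prompt_path: str,
                         content: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a PUT /admin/config/prompts/{path} request."""
    # quote() percent-encodes unsafe characters but keeps "/" separators.
    url = f"{base_url}/admin/config/prompts/{urllib.parse.quote(prompt_path)}"
    body = json.dumps({"content": content}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )

req = build_update_request(
    "http://localhost:8080",
    "prompts/system.md",
    "# System Prompt\n\nYou are a helpful assistant.",
    "your-admin-token",
)
print(req.get_method(), req.full_url)
# To send it against a running router: urllib.request.urlopen(req)
```

In practice the `content` argument would typically come from reading a local Markdown file.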

Reload Prompt Files

Reload all prompt files from disk. Useful after manual file edits.

POST /admin/config/prompts/reload

Response

{
  "success": true,
  "reloaded_count": 3,
  "reloaded": [
    "prompts/system.md",
    "prompts/anthropic.md",
    "prompts/gpt4.md"
  ],
  "errors": [],
  "message": "Successfully reloaded 3 prompt file(s)"
}

Example

curl -X POST http://localhost:8080/admin/config/prompts/reload \
  -H "Authorization: Bearer $ADMIN_TOKEN" | jq
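An automation script would usually inspect the reload response rather than just print it. The Python sketch below treats a false `success` flag or a non-empty `errors` list as a failure. That interpretation, the function name `check_reload`, and the shape of the error entry in the usage example are assumptions, since the reference only shows an empty `errors` array.

```python
def check_reload(response: dict) -> list:
    """Return the reloaded paths, or raise if the response reports a problem."""
    errors = response.get("errors", [])
    if not response.get("success", False) or errors:
        detail = errors or response.get("message", "unknown error")
        raise RuntimeError(f"prompt reload failed: {detail}")
    return response.get("reloaded", [])

ok = {
    "success": True,
    "reloaded_count": 3,
    "reloaded": ["prompts/system.md", "prompts/anthropic.md", "prompts/gpt4.md"],
    "errors": [],
}
print(check_reload(ok))

try:
    # Hypothetical failure payload; the real error format is not documented here.
    check_reload({"success": False, "errors": ["prompts/missing.md: not found"]})
except RuntimeError as exc:
    print(exc)
```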

Configuration Example

To use external prompt files, configure global_prompts in your config file:

global_prompts:
  # Directory containing prompt files (relative to config directory)
  prompts_dir: "./prompts"

  # Default prompt from external file
  default_file: "system.md"

  # Or inline prompt (default_file takes precedence if both specified)
  # default: "You are a helpful assistant."

  # Backend-specific prompts
  backends:
    anthropic:
      prompt_file: "anthropic-system.md"
    openai:
      prompt: "OpenAI-specific inline prompt"

  # Model-specific prompts
  models:
    gpt-4:
      prompt_file: "gpt4-system.md"
    claude-3-opus:
      prompt_file: "claude-opus-system.md"

  merge_strategy: prepend

Security Considerations

  • Path Traversal Protection: All paths are validated to prevent directory traversal attacks (e.g., ../../../etc/passwd)
  • File Size Limits: Prompt files are limited to a maximum of 1 MB
  • Relative Paths Only: Prompt files must be within the configured prompts_dir or config directory
  • Authentication Required: All prompt management endpoints require admin authentication
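A client can apply similar checks before calling the update endpoint, so obviously invalid requests never leave the machine. The Python sketch below is a client-side pre-check only. The function name `validate_prompt` is hypothetical, the server's actual validation rules may be stricter, and reading "1MB" as 1,048,576 bytes is an assumption.

```python
from pathlib import PurePosixPath

# Assumes the documented 1 MB limit means 1 MiB; the exact threshold is not stated.
MAX_PROMPT_BYTES = 1024 * 1024

def validate_prompt(path: str, content: str) -> list:
    """Return a list of problems; an empty list means the request looks acceptable."""
    problems = []
    p = PurePosixPath(path)
    if not path or p.is_absolute() or "\\" in path:
        problems.append("path must be a non-empty relative POSIX-style path")
    if ".." in p.parts:
        problems.append("path must not contain '..' segments")
    if len(content.encode("utf-8")) > MAX_PROMPT_BYTES:
        problems.append("content exceeds the 1 MB limit")
    return problems

print(validate_prompt("prompts/system.md", "# System Prompt"))  # []
print(validate_prompt("../../../etc/passwd", "x"))
```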

Appendix: Quick Reference

Configuration Sections

| Section | Hot Reload | Description |
|---------|------------|-------------|
| server | Restart | Bind address, workers |
| backends | Gradual | Backend URLs, weights |
| health_checks | Gradual | Health monitoring |
| logging | Immediate | Log level, format |
| retry | Immediate | Retry policies |
| timeouts | Gradual | Request timeouts |
| rate_limiting | Immediate | Rate limits |
| circuit_breaker | Immediate | Circuit breaker |
| global_prompts | Immediate | System prompts |
| fallback | Gradual | Model fallback |
| files | Gradual | Files API |
| api_keys | Immediate | API keys |
| metrics | Gradual | Prometheus metrics |
| admin | Gradual | Admin settings |
| admin.stats | Immediate | Stats collection settings |
| routing | Gradual | Routing rules |
| prefix_routing | Immediate | Prefix-aware KV cache routing |
| response_cache | Immediate | Response cache settings |
| kv_cache_index | Requires restart | KV cache index backend and event sources |
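Deployment tooling can use the table above to decide whether a planned change takes effect live or needs a restart. The Python sketch below encodes the table as a lookup. The names `RELOAD_MODE` and `sections_needing_restart` are hypothetical, and both "Restart" and "Requires restart" are mapped to a single `restart` value on the assumption that they mean the same thing.

```python
# Section -> reload behaviour, transcribed from the table above.
RELOAD_MODE = {
    "server": "restart", "backends": "gradual", "health_checks": "gradual",
    "logging": "immediate", "retry": "immediate", "timeouts": "gradual",
    "rate_limiting": "immediate", "circuit_breaker": "immediate",
    "global_prompts": "immediate", "fallback": "gradual", "files": "gradual",
    "api_keys": "immediate", "metrics": "gradual", "admin": "gradual",
    "admin.stats": "immediate", "routing": "gradual",
    "prefix_routing": "immediate", "response_cache": "immediate",
    "kv_cache_index": "restart",
}

def sections_needing_restart(changed_sections: list) -> list:
    """Return the changed sections that only apply after a restart.

    Raises KeyError for a section name that is not in the table.
    """
    return [s for s in changed_sections if RELOAD_MODE[s] == "restart"]

print(sections_needing_restart(["logging", "server", "backends"]))  # ['server']
```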

HTTP Status Codes

| Code | Meaning |
|------|---------|
| 200 | Success |
| 400 | Bad Request (validation error) |
| 401 | Unauthorized |
| 403 | Forbidden |
| 404 | Not Found |
| 409 | Conflict |
| 413 | Payload Too Large |
| 500 | Internal Server Error |

Common curl Commands

# Get full config
curl -s http://localhost:8080/admin/config/full -H "Authorization: Bearer $TOKEN"

# Update logging level
curl -X PATCH http://localhost:8080/admin/config/logging \
  -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
  -d '{"config": {"level": "debug"}}'

# Add backend
curl -X POST http://localhost:8080/admin/backends \
  -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
  -d '{"name": "new", "url": "http://host:port", "weight": 1}'

# Export config
curl -X POST http://localhost:8080/admin/config/export \
  -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
  -d '{"format": "yaml"}'

# View history
curl -s http://localhost:8080/admin/config/history -H "Authorization: Bearer $TOKEN"

# Rollback
curl -X POST http://localhost:8080/admin/config/rollback/5 \
  -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" -d '{}'