Examples & Migration¶
Multi-Backend Setup¶
```yaml
# Enterprise multi-backend configuration
server:
  bind_address: "0.0.0.0:8080"
  workers: 8
  connection_pool_size: 400

backends:
  # Primary OpenAI GPT models
  - name: "openai-primary"
    url: "https://api.openai.com"
    weight: 5
    models: ["gpt-4", "gpt-3.5-turbo"]
    retry_override:
      max_attempts: 3
      base_delay: "500ms"

  # Secondary Azure OpenAI
  - name: "azure-openai"
    url: "https://your-resource.openai.azure.com"
    weight: 3
    models: ["gpt-4", "gpt-35-turbo"]

  # Local Ollama for open models
  - name: "local-ollama"
    url: "http://ollama:11434"
    weight: 2
    models: ["llama2", "mistral", "codellama"]

  # vLLM deployment
  - name: "vllm-cluster"
    url: "http://vllm-service:8000"
    weight: 4
    models: ["meta-llama/Llama-2-7b-chat-hf"]

health_checks:
  enabled: true
  interval: "45s"
  timeout: "15s"
  unhealthy_threshold: 3
  healthy_threshold: 2

request:
  timeout: "180s"
  max_retries: 4

cache:
  model_cache_ttl: "600s"    # 10-minute cache
  deduplication_ttl: "120s"  # 2-minute deduplication
  enable_deduplication: true

logging:
  level: "info"
  format: "json"
```
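The `weight` values above bias how traffic is distributed across backends. As a minimal sketch of weight-proportional selection (illustrative of the semantics only, not the router's actual implementation):

```rust
/// Expand backends into a selection cycle by repeating each backend
/// `weight` times, so traffic is distributed in proportion to weight.
fn weighted_cycle(backends: &[(&str, u32)]) -> Vec<String> {
    backends
        .iter()
        .flat_map(|(name, w)| std::iter::repeat(name.to_string()).take(*w as usize))
        .collect()
}

fn main() {
    // Weights from the configuration above: 5 + 3 + 2 + 4 = 14 slots
    let backends = [
        ("openai-primary", 5),
        ("azure-openai", 3),
        ("local-ollama", 2),
        ("vllm-cluster", 4),
    ];
    let cycle = weighted_cycle(&backends);
    assert_eq!(cycle.len(), 14);
    // openai-primary receives 5 of every 14 requests
    assert_eq!(cycle.iter().filter(|n| *n == "openai-primary").count(), 5);
}
```

With these weights, `openai-primary` handles roughly 5/14 of traffic and `local-ollama` 2/14; the real router may use randomized or smoothed weighted selection instead of a literal repeated cycle.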
High-Performance Configuration¶
```yaml
# Optimized for high-throughput scenarios
server:
  bind_address: "0.0.0.0:8080"
  workers: 16                  # High worker count
  connection_pool_size: 1000   # Large connection pool

backends:
  - name: "fast-backend-1"
    url: "http://backend1:8000"
    weight: 1
  - name: "fast-backend-2"
    url: "http://backend2:8000"
    weight: 1
  - name: "fast-backend-3"
    url: "http://backend3:8000"
    weight: 1

health_checks:
  enabled: true
  interval: "30s"
  timeout: "5s"            # Fast timeout
  unhealthy_threshold: 2   # Fail fast
  healthy_threshold: 1     # Recover quickly

request:
  timeout: "60s"    # Shorter timeout for high throughput
  max_retries: 2    # Fewer retries

retry:
  max_attempts: 2
  base_delay: "50ms"   # Fast retries
  max_delay: "5s"
  exponential_backoff: true
  jitter: true

cache:
  model_cache_ttl: "300s"
  deduplication_ttl: "30s"   # Shorter deduplication window
  enable_deduplication: true

logging:
  level: "warn"   # Minimal logging for performance
  format: "json"
```
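The `retry` block above enables exponential backoff with jitter. The resulting delay schedule can be sketched as follows (jitter is omitted for determinism; this illustrates the `base_delay`/`max_delay` semantics, not the router's exact code):

```rust
/// Exponential backoff in milliseconds: base * 2^(attempt - 1), capped at max.
/// With jitter enabled, the router would additionally randomize within [0, delay].
fn backoff_ms(attempt: u32, base_ms: u64, max_ms: u64) -> u64 {
    base_ms.saturating_mul(1u64 << (attempt - 1)).min(max_ms)
}

fn main() {
    // With base_delay: "50ms" and max_delay: "5s" as configured above:
    assert_eq!(backoff_ms(1, 50, 5000), 50);   // first retry waits 50ms
    assert_eq!(backoff_ms(2, 50, 5000), 100);  // second retry waits 100ms
    // The cap applies once 50 * 2^(n-1) would exceed 5000ms
    assert_eq!(backoff_ms(8, 50, 5000), 5000);
}
```

Since `max_attempts` is 2 in this profile, only the 50ms and 100ms delays are ever reached; the cap matters for profiles with more attempts.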
Development Configuration¶
```yaml
# Developer-friendly configuration
server:
  bind_address: "127.0.0.1:8080"   # Localhost only
  workers: 2                       # Fewer workers for development
  connection_pool_size: 20         # Small pool

backends:
  - name: "local-ollama"
    url: "http://localhost:11434"
    weight: 1

health_checks:
  enabled: true
  interval: "10s"   # Frequent checks for quick feedback
  timeout: "3s"
  unhealthy_threshold: 2
  healthy_threshold: 1

request:
  timeout: "300s"   # Long timeout for debugging
  max_retries: 1    # Minimal retries for debugging

logging:
  level: "debug"        # Verbose logging
  format: "pretty"      # Human-readable
  enable_colors: true   # Colored output

cache:
  model_cache_ttl: "60s"        # Short cache for quick testing
  deduplication_ttl: "10s"      # Short deduplication
  enable_deduplication: false   # Disable for testing
```
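The `enable_deduplication`/`deduplication_ttl` pair (disabled here for testing) collapses repeated identical requests that arrive within the TTL window. A minimal sketch of the idea, assuming requests are keyed by body and timestamps are supplied explicitly (the router's actual keying and clock handling may differ):

```rust
use std::collections::HashMap;

/// TTL-based deduplication: a request is a duplicate if an identical
/// body was first seen less than `ttl` seconds ago.
struct Deduplicator {
    ttl: f64,
    seen: HashMap<String, f64>, // request body -> first-seen time (seconds)
}

impl Deduplicator {
    fn new(ttl: f64) -> Self {
        Self { ttl, seen: HashMap::new() }
    }

    fn is_duplicate(&mut self, body: &str, now: f64) -> bool {
        let ttl = self.ttl;
        self.seen.retain(|_, t| now - *t < ttl); // drop expired entries
        if self.seen.contains_key(body) {
            true
        } else {
            self.seen.insert(body.to_string(), now);
            false
        }
    }
}

fn main() {
    let mut d = Deduplicator::new(10.0); // deduplication_ttl: "10s"
    assert!(!d.is_duplicate(r#"{"model":"llama2"}"#, 0.0)); // first sighting
    assert!(d.is_duplicate(r#"{"model":"llama2"}"#, 5.0));  // within the window
    assert!(!d.is_duplicate(r#"{"model":"llama2"}"#, 11.0)); // TTL expired
}
```

Disabling deduplication in development, as above, ensures every test request actually reaches the backend.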
Migration Guide¶
From Command-Line Arguments¶
If you're currently using command-line arguments, migrate to configuration files:
Before: every setting is passed as a command-line argument on each invocation.

After:

1. Generate a configuration file.
2. Edit the configuration to reflect your previous command-line arguments.
3. Start the router with the configuration file: `continuum-router --config config.yaml`
From Environment Variables¶
You can continue using environment variables with configuration files as overrides:
Configuration file (config.yaml):
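For instance, the base settings might come from a file like the following (an illustrative fragment using keys shown earlier in this guide):

```yaml
# config.yaml - base settings; environment variables take precedence as overrides
server:
  bind_address: "127.0.0.1:8080"

backends:
  - name: "local-ollama"
    url: "http://localhost:11434"
    weight: 1
```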
Environment override:
```shell
export CONTINUUM_BIND_ADDRESS="0.0.0.0:9000"
export CONTINUUM_BACKEND_URLS="http://localhost:11434,http://localhost:1234"
continuum-router --config config.yaml
```
Configuration Validation¶
To validate your configuration without starting the server:
```shell
# Test configuration loading
continuum-router --config config.yaml --help

# Check configuration with dry-run (future feature)
continuum-router --config config.yaml --dry-run
```
Rust Builder API¶
When embedding Continuum Router as a Rust library crate, you can construct configuration programmatically using the type-safe builder API. This avoids YAML files entirely and validates all settings when `.build()` is called, before your application starts serving traffic.
BackendConfigBuilder¶
`BackendConfigBuilder` provides provider-specific constructors that set the correct backend type and require only the minimum necessary arguments.
Provider Constructors¶
| Constructor | API Key Required | Description |
|---|---|---|
| `BackendConfigBuilder::openai(url, api_key)` | Yes | OpenAI API |
| `BackendConfigBuilder::anthropic(url, api_key)` | Yes | Anthropic Claude API |
| `BackendConfigBuilder::gemini(url, api_key)` | Yes | Google Gemini API |
| `BackendConfigBuilder::vllm(url, api_key)` | Yes | vLLM inference server |
| `BackendConfigBuilder::ollama(url)` | No | Ollama local server |
| `BackendConfigBuilder::llamacpp(url)` | No | llama.cpp server |
| `BackendConfigBuilder::mlxcel(url)` | No | MLxcel server (macOS, Apple Silicon) |
| `BackendConfigBuilder::lm_studio(url)` | No | LM Studio |
| `BackendConfigBuilder::custom(backend_type, url)` | No | Any backend type |
Optional Builder Methods¶
After calling a provider constructor, chain these methods before calling `.build()`:

- `.name(name)` - Override the backend name (auto-generated as `"{provider}-{host}"` if omitted)
- `.models(vec![...])` - Specify available model IDs on this backend
- `.weight(n)` - Load-balancing weight from 1 to 1000 (default: 1)
- `.timeout(duration)` - Per-backend timeout string, e.g. `"30s"` or `"2m"`
- `.max_retries(n)` - Maximum retry attempts for this backend
- `.api_key(key)` - Set or override the API key (useful with `custom()`)
Example¶
```rust
use continuum_router::config::builder::BackendConfigBuilder;

// OpenAI backend with explicit name and models
let openai = BackendConfigBuilder::openai("https://api.openai.com/v1", "sk-...")
    .name("primary-openai")
    .models(vec!["gpt-4o", "gpt-4o-mini"])
    .weight(3)
    .timeout("60s")
    .build()
    .unwrap();

// Ollama backend with auto-generated name "ollama-localhost"
let ollama = BackendConfigBuilder::ollama("http://localhost:11434")
    .models(vec!["llama3.2", "qwen3"])
    .build()
    .unwrap();
```
ConfigBuilder¶
`ConfigBuilder` constructs the top-level router configuration with sensible defaults.
Builder Methods¶
- `.add_backend(backend)` - Add a `BackendConfig` (at least one required)
- `.selection_strategy(strategy)` - Load-balancing strategy (default: `RoundRobin`)
- `.bind_address(addr)` - TCP address or Unix socket path (default: `"0.0.0.0:8080"`)
- `.enable_health_checks(bool)` - Toggle background health monitoring (default: `true`)
- `.health_check_interval(duration)` - Health check frequency (default: `"30s"`)
- `.enable_rate_limiting(config)` - Attach a `RateLimitConfig`
- `.enable_circuit_breaker(config)` - Attach a `CircuitBreakerConfig`
- `.cors(config)` - Set CORS configuration
- `.api_keys(config)` - Set API key authentication configuration
- `.logging_level(level)` - Logging verbosity: `"error"`, `"warn"`, `"info"`, `"debug"`, `"trace"` (default: `"info"`)
Example¶
```rust
use continuum_router::config::builder::{BackendConfigBuilder, ConfigBuilder};
use continuum_router::core::models::backend::SelectionStrategy;

let backend = BackendConfigBuilder::ollama("http://localhost:11434")
    .name("local")
    .build()
    .unwrap();

let config = ConfigBuilder::new()
    .add_backend(backend)
    .selection_strategy(SelectionStrategy::LeastLatency)
    .bind_address("127.0.0.1:8080")
    .enable_health_checks(true)
    .health_check_interval("60s")
    .logging_level("debug")
    .build()
    .unwrap();
```
ConfigBuilderError¶
Both builders return `Result<_, ConfigBuilderError>` from `.build()`. The error variants are:
| Variant | Cause |
|---|---|
| `InvalidUrl` | The URL cannot be parsed |
| `MissingApiKey` | A required API key is empty or missing |
| `NoBackends` | `ConfigBuilder::build()` called with no backends added |
| `DuplicateBackendName` | Two backends share the same name |
| `InvalidBindAddress` | The bind address is not a valid TCP or Unix socket address |
| `ValidationError` | A field value is out of range (e.g., weight outside `1..=1000`) |
All variants implement `std::error::Error` and `Display` for ergonomic error handling.
Importing Builder Types¶
```rust
// Import builders from the top-level crate re-exports
use continuum_router::{BackendConfigBuilder, ConfigBuilder, ConfigBuilderError};

// Or from the config module
use continuum_router::config::builder::{BackendConfigBuilder, ConfigBuilder, ConfigBuilderError};
```
This configuration guide provides comprehensive coverage of all configuration options available in Continuum Router. The flexible configuration system allows you to adapt the router to any deployment scenario while maintaining clear precedence rules and validation.