# Examples & Migration

## Multi-Backend Setup
```yaml
# Enterprise multi-backend configuration
server:
  bind_address: "0.0.0.0:8080"
  workers: 8
  connection_pool_size: 400

backends:
  # Primary OpenAI GPT models
  - name: "openai-primary"
    url: "https://api.openai.com"
    weight: 5
    models: ["gpt-4", "gpt-3.5-turbo"]
    retry_override:
      max_attempts: 3
      base_delay: "500ms"

  # Secondary Azure OpenAI
  - name: "azure-openai"
    url: "https://your-resource.openai.azure.com"
    weight: 3
    models: ["gpt-4", "gpt-35-turbo"]

  # Local Ollama for open models
  - name: "local-ollama"
    url: "http://ollama:11434"
    weight: 2
    models: ["llama2", "mistral", "codellama"]

  # vLLM deployment
  - name: "vllm-cluster"
    url: "http://vllm-service:8000"
    weight: 4
    models: ["meta-llama/Llama-2-7b-chat-hf"]

health_checks:
  enabled: true
  interval: "45s"
  timeout: "15s"
  unhealthy_threshold: 3
  healthy_threshold: 2

request:
  timeout: "180s"
  max_retries: 4

cache:
  model_cache_ttl: "600s"     # 10-minute cache
  deduplication_ttl: "120s"   # 2-minute deduplication
  enable_deduplication: true

logging:
  level: "info"
  format: "json"
```
## High-Performance Configuration
```yaml
# Optimized for high-throughput scenarios
server:
  bind_address: "0.0.0.0:8080"
  workers: 16                  # High worker count
  connection_pool_size: 1000   # Large connection pool

backends:
  - name: "fast-backend-1"
    url: "http://backend1:8000"
    weight: 1
  - name: "fast-backend-2"
    url: "http://backend2:8000"
    weight: 1
  - name: "fast-backend-3"
    url: "http://backend3:8000"
    weight: 1

health_checks:
  enabled: true
  interval: "30s"
  timeout: "5s"                # Fast timeout
  unhealthy_threshold: 2       # Fail fast
  healthy_threshold: 1         # Recover quickly

request:
  timeout: "60s"               # Shorter timeout for high throughput
  max_retries: 2               # Fewer retries

retry:
  max_attempts: 2
  base_delay: "50ms"           # Fast retries
  max_delay: "5s"
  exponential_backoff: true
  jitter: true

cache:
  model_cache_ttl: "300s"
  deduplication_ttl: "30s"     # Shorter deduplication window
  enable_deduplication: true

logging:
  level: "warn"                # Minimal logging for performance
  format: "json"
```
## Development Configuration
```yaml
# Developer-friendly configuration
server:
  bind_address: "127.0.0.1:8080"   # Localhost only
  workers: 2                       # Fewer workers for development
  connection_pool_size: 20         # Small pool

backends:
  - name: "local-ollama"
    url: "http://localhost:11434"
    weight: 1

health_checks:
  enabled: true
  interval: "10s"                  # Frequent checks for quick feedback
  timeout: "3s"
  unhealthy_threshold: 2
  healthy_threshold: 1

request:
  timeout: "300s"                  # Long timeout for debugging
  max_retries: 1                   # Minimal retries for debugging

logging:
  level: "debug"                   # Verbose logging
  format: "pretty"                 # Human-readable
  enable_colors: true              # Colored output

cache:
  model_cache_ttl: "60s"           # Short cache for quick testing
  deduplication_ttl: "10s"         # Short deduplication
  enable_deduplication: false      # Disable for testing
```
## Migration Guide

### From Command-Line Arguments
If you're currently passing options as command-line arguments, migrate to a configuration file:

**Before:** every option is supplied as a flag on each invocation.

**After:**

1. Generate a configuration file.
2. Edit the configuration to match your backends, timeouts, and logging.
3. Start the router with the configuration file, as shown below.
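For example, once `config.yaml` is in place:

```bash
continuum-router --config config.yaml
```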
### From Environment Variables
You can continue using environment variables alongside configuration files; a variable overrides the corresponding value from the file.

**Configuration file (`config.yaml`):**
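A minimal sketch for illustration, using only keys from the examples above (your real file will differ):

```yaml
# Illustrative config; the CONTINUUM_* variables below override
# the bind address and backend URLs at startup.
server:
  bind_address: "127.0.0.1:8080"

backends:
  - name: "local-ollama"
    url: "http://localhost:11434"
    weight: 1
```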
**Environment override:**

```bash
export CONTINUUM_BIND_ADDRESS="0.0.0.0:9000"
export CONTINUUM_BACKEND_URLS="http://localhost:11434,http://localhost:1234"
continuum-router --config config.yaml
```
## Configuration Validation

To validate your configuration without starting the server:

```bash
# Test configuration loading
continuum-router --config config.yaml --help

# Check configuration with dry-run (future feature)
continuum-router --config config.yaml --dry-run
```
## Rust Builder API

When embedding Continuum Router as a Rust library crate, you can construct configuration programmatically using the type-safe builder API. This avoids YAML files entirely: every setting is validated when `.build()` is called, before your application starts serving traffic.
### BackendConfigBuilder

`BackendConfigBuilder` provides provider-specific constructors that set the correct backend type and require only the minimum necessary arguments.

#### Provider Constructors

| Constructor | API Key Required | Description |
|---|---|---|
| `BackendConfigBuilder::openai(url, api_key)` | Yes | OpenAI API |
| `BackendConfigBuilder::anthropic(url, api_key)` | Yes | Anthropic Claude API |
| `BackendConfigBuilder::gemini(url, api_key)` | Yes | Google Gemini API |
| `BackendConfigBuilder::vllm(url, api_key)` | Yes | vLLM inference server |
| `BackendConfigBuilder::ollama(url)` | No | Ollama local server |
| `BackendConfigBuilder::llamacpp(url)` | No | llama.cpp server |
| `BackendConfigBuilder::mlxcel(url)` | No | MLxcel server (macOS, Apple Silicon) |
| `BackendConfigBuilder::lm_studio(url)` | No | LM Studio |
| `BackendConfigBuilder::custom(backend_type, url)` | No | Any backend type |
#### Optional Builder Methods

After calling a provider constructor, chain these methods before calling `.build()`:

- `.name(name)` - Override the backend name (auto-generated as `"{provider}-{host}"` if omitted)
- `.models(vec![...])` - Specify the model IDs available on this backend
- `.weight(n)` - Load-balancing weight from 1 to 1000 (default: 1)
- `.timeout(duration)` - Per-backend timeout string, e.g. `"30s"` or `"2m"`
- `.max_retries(n)` - Maximum retry attempts for this backend
- `.api_key(key)` - Set or override the API key (useful with `custom()`)
#### Example

```rust
use continuum_router::config::builder::BackendConfigBuilder;

// OpenAI backend with explicit name and models
let openai = BackendConfigBuilder::openai("https://api.openai.com/v1", "sk-...")
    .name("primary-openai")
    .models(vec!["gpt-4o", "gpt-4o-mini"])
    .weight(3)
    .timeout("60s")
    .build()
    .unwrap();

// Ollama backend with auto-generated name "ollama-localhost"
let ollama = BackendConfigBuilder::ollama("http://localhost:11434")
    .models(vec!["llama3.2", "qwen3"])
    .build()
    .unwrap();
```
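For a backend type without a dedicated constructor, `custom()` can be combined with `.api_key()`. A sketch, assuming `backend_type` is accepted as a plain string label (check the crate docs for the actual parameter type); the gateway name, URL, and token are made up:

```rust
use continuum_router::config::builder::BackendConfigBuilder;

// Hypothetical OpenAI-compatible gateway; "openai-compatible" is an
// assumed backend_type value, not one confirmed by the crate.
let gateway = BackendConfigBuilder::custom("openai-compatible", "http://gateway:9000")
    .name("internal-gateway")
    .api_key("internal-token") // set explicitly, since custom() takes no key
    .max_retries(2)
    .build()
    .unwrap();
```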
### ConfigBuilder

`ConfigBuilder` constructs the top-level router configuration with sensible defaults.

#### Builder Methods

- `.add_backend(backend)` - Add a `BackendConfig` (at least one required)
- `.selection_strategy(strategy)` - Load-balancing strategy (default: `RoundRobin`)
- `.bind_address(addr)` - TCP address or Unix socket path (default: `"0.0.0.0:8080"`)
- `.enable_health_checks(bool)` - Toggle background health monitoring (default: `true`)
- `.health_check_interval(duration)` - Health check frequency (default: `"30s"`)
- `.enable_rate_limiting(config)` - Attach a `RateLimitConfig`
- `.enable_circuit_breaker(config)` - Attach a `CircuitBreakerConfig`
- `.cors(config)` - Set CORS configuration
- `.api_keys(config)` - Set API key authentication configuration
- `.logging_level(level)` - Logging verbosity: `"error"`, `"warn"`, `"info"`, `"debug"`, `"trace"` (default: `"info"`)
#### Example

```rust
use continuum_router::config::builder::{BackendConfigBuilder, ConfigBuilder};
use continuum_router::core::models::backend::SelectionStrategy;

let backend = BackendConfigBuilder::ollama("http://localhost:11434")
    .name("local")
    .build()
    .unwrap();

let config = ConfigBuilder::new()
    .add_backend(backend)
    .selection_strategy(SelectionStrategy::LeastLatency)
    .bind_address("127.0.0.1:8080")
    .enable_health_checks(true)
    .health_check_interval("60s")
    .logging_level("debug")
    .build()
    .unwrap();
```
### ConfigBuilderError

Both builders return `Result<_, ConfigBuilderError>` from `.build()`. The error variants are:

| Variant | Cause |
|---|---|
| `InvalidUrl` | The URL cannot be parsed |
| `MissingApiKey` | A required API key is empty or missing |
| `NoBackends` | `ConfigBuilder::build()` called with no backends added |
| `DuplicateBackendName` | Two backends share the same name |
| `InvalidBindAddress` | The bind address is not a valid TCP or Unix socket address |
| `ValidationError` | A field value is out of range (e.g., weight outside `1..=1000`) |

All variants implement `std::error::Error` and `Display` for ergonomic error handling.
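A minimal sketch of handling these errors, relying only on the `Display` implementation noted above (the `build_config` helper and its contents are illustrative):

```rust
use continuum_router::{BackendConfigBuilder, ConfigBuilder, ConfigBuilderError};

// Illustrative helper: both `.build()` calls return ConfigBuilderError,
// so `?` propagates either failure to the caller.
fn build_config() -> Result<(), ConfigBuilderError> {
    let backend = BackendConfigBuilder::ollama("http://localhost:11434")
        .weight(1)
        .build()?;
    let _config = ConfigBuilder::new()
        .add_backend(backend)
        .build()?;
    Ok(())
}

fn main() {
    // ConfigBuilderError implements Display, so it prints a readable message.
    if let Err(err) = build_config() {
        eprintln!("invalid configuration: {err}");
        std::process::exit(1);
    }
}
```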
### Importing Builder Types

```rust
// Import builders from the top-level crate re-exports
use continuum_router::{BackendConfigBuilder, ConfigBuilder, ConfigBuilderError};

// Or from the config module
use continuum_router::config::builder::{BackendConfigBuilder, ConfigBuilder, ConfigBuilderError};
```
This guide covers the configuration options available in Continuum Router. With clear precedence rules and validation, the same configuration system adapts to each of the deployment scenarios shown above, whether expressed as YAML files, environment-variable overrides, or the Rust builder API.