Quick Start¶
This guide will help you get Continuum Router up and running in minutes.
Prerequisites¶
- A running LLM backend (OpenAI API, Ollama, vLLM, LocalAI, LM Studio, etc.)
- Network access to your backend endpoints
Installation¶
Download Binary¶
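If prebuilt binaries are published on the project's GitHub Releases page, you can fetch one directly. The asset name below is an assumption; check the Releases page for the exact filename for your platform:

# Hypothetical asset name; verify on the Releases page first
curl -LO https://github.com/lablup/continuum-router/releases/latest/download/continuum-router
chmod +x continuum-router
sudo mv continuum-router /usr/local/bin/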
Build from Source¶
# Requires a Rust toolchain; install one via https://rustup.rs if needed
# Clone the repository
git clone https://github.com/lablup/continuum-router.git
cd continuum-router
# Build and install
cargo build --release
sudo mv target/release/continuum-router /usr/local/bin/
Configuration¶
Generate Default Configuration¶
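If the binary ships a config-generation command, it can write a starter file for you. The flag below is an assumption; confirm the exact invocation with continuum-router --help:

# Flag name is an assumption; confirm with continuum-router --help
continuum-router --generate-config > config.yaml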
Basic Configuration Example¶
backends:
  - url: http://localhost:11434
    name: ollama
    models: ["llama3.2", "qwen3"]
  - url: http://localhost:1234
    name: lm-studio
    models: ["gpt-4", "claude-3"]

selection_strategy: LeastLatency

health_checks:
  enabled: true
  interval: 30s
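Read naturally, this configuration routes requests for "llama3.2" and "qwen3" to the ollama backend and "gpt-4" / "claude-3" to lm-studio; when several backends serve the same model, LeastLatency picks the one that has been responding fastest, and backends failing the 30-second health checks are skipped. These semantics are inferred from the field names, so see the Configuration Guide for the authoritative behavior.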
Configuration with Rate Limiting¶
backends:
  - url: http://localhost:11434
    name: ollama
    models: ["llama3.2", "qwen3"]

selection_strategy: LeastLatency

health_checks:
  enabled: true
  interval: 30s

rate_limiting:
  enabled: true
  storage: memory
  limits:
    per_client:
      requests_per_second: 10
      burst_capacity: 20
    per_backend:
      requests_per_second: 100
      burst_capacity: 200
    global:
      requests_per_second: 1000
      burst_capacity: 2000
  whitelist:
    - "192.168.1.0/24"
    - "10.0.0.1"
  bypass_keys:
    - "admin-key-123"
Running the Router¶
Start the Server¶
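In the simplest case, point the binary at your configuration file. The flag name below is an assumption; run continuum-router --help for the actual options:

# The --config flag is an assumption; check continuum-router --help
continuum-router --config config.yaml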
Verify It's Running¶
# Check health endpoint
curl http://localhost:8080/health
# List available models
curl http://localhost:8080/v1/models
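If you only want the status code, curl can print it on its own; by convention, a 200 response means the router considers itself healthy:

# Print only the HTTP status code from the health endpoint
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080/health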
Using the API¶
Chat Completion¶
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.2",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
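Because the router exposes OpenAI-compatible endpoints, existing OpenAI SDK clients can generally be reused by overriding the client's base URL to http://localhost:8080/v1; whether an API key is also required depends on your router configuration.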
Streaming Chat Completion¶
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.2",
    "messages": [{"role": "user", "content": "Tell me a story"}],
    "stream": true
  }'
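With "stream": true, OpenAI-compatible servers typically respond with server-sent events: a series of data: {...} chunks followed by a final data: [DONE] line. Verify the exact framing against your router and backend.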
List Models¶
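The models endpoint mirrors the OpenAI listing and is the same call used above to verify the router; it returns the models exposed by your configured backends:

curl http://localhost:8080/v1/models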
Next Steps¶
- Installation Guide - Detailed installation instructions for all platforms
- Configuration Guide - Complete configuration reference
- API Reference - Full API documentation
- Load Balancing - Configure load balancing strategies
- Deployment Guide - Production deployment with Docker, Kubernetes, systemd
Troubleshooting¶
Common Issues¶
Router fails to start¶
Check that:
- The configuration file exists and is valid YAML
- Backend URLs are accessible
- No other service is already listening on port 8080 (see the check below)
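On Linux and macOS you can check the port with standard tooling:

# Show any process already listening on port 8080 (no output means the port is free)
lsof -i :8080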
No backends available¶
Verify that:
- Your backends are running and healthy
- The backend URLs in your configuration are correct
- Health checks can reach your backends (try querying a backend directly, as shown below)
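Taking the router out of the loop helps isolate the fault. The URL below targets the Ollama backend from the example configuration and assumes the backend speaks the OpenAI-compatible API:

# Query the example Ollama backend directly, bypassing the router
curl http://localhost:11434/v1/models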
Connection refused errors¶
Ensure:
- Your firewall allows connections to the backends
- The backends are listening on the configured ports
- Network routes are properly configured (the TCP check below can confirm reachability)
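A raw TCP check separates network problems from application problems:

# Test TCP reachability to the example backend port (Ollama's default)
nc -vz localhost 11434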
For more help, see the Error Handling guide or open an issue on GitHub.