# Quick Start

This guide will help you get Continuum Router up and running in minutes.

## Prerequisites
- A running LLM backend (OpenAI API, Anthropic, Gemini, Ollama, vLLM, LocalAI, LM Studio, llama.cpp, MLxcel, Continuum Router, etc.)
- Network access to your backend endpoints
## Installation

### Download Binary

Prebuilt binaries, when available, can be downloaded from the project's GitHub releases page.

### Build from Source
```bash
# Clone the repository
git clone https://github.com/lablup/continuum-router.git
cd continuum-router

# Build and install
cargo build --release
sudo mv target/release/continuum-router /usr/local/bin/
```
## Configuration

### Generate Default Configuration

If your build provides a config-generation command, use it to produce a default configuration file as a starting point; otherwise, start from the examples below.

### Basic Configuration Example
```yaml
backends:
  - url: http://localhost:11434
    name: ollama
    models: ["llama3.2", "qwen3"]
  - url: http://localhost:1234
    name: lm-studio
    models: ["gpt-4", "claude-3"]

selection_strategy: LeastLatency

health_checks:
  enabled: true
  interval: 30s
```
### Configuration with Rate Limiting
```yaml
backends:
  - url: http://localhost:11434
    name: ollama
    models: ["llama3.2", "qwen3"]

selection_strategy: LeastLatency

health_checks:
  enabled: true
  interval: 30s

rate_limiting:
  enabled: true
  storage: memory
  limits:
    per_client:
      requests_per_second: 10
      burst_capacity: 20
    per_backend:
      requests_per_second: 100
      burst_capacity: 200
    global:
      requests_per_second: 1000
      burst_capacity: 2000
  whitelist:
    - "192.168.1.0/24"
    - "10.0.0.1"
  bypass_keys:
    - "admin-key-123"
```
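The `requests_per_second` / `burst_capacity` pairing above reads naturally as a token bucket: the rate is the steady refill speed and the burst capacity is the bucket size. As an illustration of that semantics only (a common rate-limiter design, not necessarily Continuum Router's actual implementation):

```python
class TokenBucket:
    """Illustrative token bucket: refills at `rate` tokens/sec, up to `capacity`."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # starts full, so an initial burst is allowed
        self.last = now

    def allow(self, now: float) -> bool:
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# per_client settings from the example: requests_per_second=10, burst_capacity=20.
bucket = TokenBucket(rate=10, capacity=20)
allowed_at_t0 = sum(bucket.allow(0.0) for _ in range(25))  # burst of 25 at t=0
allowed_at_t1 = sum(bucket.allow(1.0) for _ in range(15))  # 15 more at t=1s
print(allowed_at_t0, allowed_at_t1)  # 20 10
```

With these numbers, a client can burst 20 requests immediately, then is throttled to roughly 10 requests per second as tokens refill.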
## Running the Router

### Start the Server

Start the router, pointing it at your configuration file. The examples below assume it is listening on port 8080.

### Verify It's Running
```bash
# Check health endpoint
curl http://localhost:8080/health

# List available models
curl http://localhost:8080/v1/models
```
## Using the API

### Chat Completion
```bash
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.2",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
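The endpoint follows the standard OpenAI-style chat-completions shape, so any OpenAI-compatible client can talk to it. A minimal sketch of building the same request in Python with the standard library (the send is left commented out so the snippet runs without a live router):

```python
import json
import urllib.request

def build_chat_request(base_url: str, model: str, content: str) -> urllib.request.Request:
    """Build the same chat-completion request as the curl example above."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": content}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("http://localhost:8080", "llama3.2", "Hello!")
# Sending requires a running router; uncomment to execute:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```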
### Streaming Chat Completion
```bash
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.2",
    "messages": [{"role": "user", "content": "Tell me a story"}],
    "stream": true
  }'
```
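With `"stream": true`, OpenAI-compatible servers emit server-sent events: one `data:` line per chunk, terminated by `data: [DONE]`. A minimal parser sketch, run here against a canned example stream rather than a live response:

```python
import json

def collect_stream(lines):
    """Concatenate content deltas from OpenAI-style SSE `data:` lines."""
    out = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        out.append(delta.get("content", ""))  # first chunk may carry only a role
    return "".join(out)

# Canned chunks in the shape OpenAI-compatible servers emit.
canned = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Once upon"}}]}',
    'data: {"choices": [{"delta": {"content": " a time"}}]}',
    "data: [DONE]",
]
print(collect_stream(canned))  # Once upon a time
```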
### List Models

The `GET /v1/models` endpoint used in the verification step above returns the models advertised by your configured backends.
## ACP Mode (Agent Communication Protocol)
Continuum Router also supports a stdio-based JSON-RPC 2.0 transport for IDE and tool integrations. In this mode, the router communicates via stdin/stdout instead of HTTP.
This is useful for integrating with IDEs, coding agents, and other tools that spawn the router as a subprocess. See the ACP Usage Guide for details.
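In this transport, each message is a JSON-RPC 2.0 object written to the router's stdin. A minimal framing sketch (the newline-delimited framing and the `initialize` method name are assumptions for illustration; see the ACP Usage Guide for the actual protocol surface):

```python
import json

def jsonrpc_request(method: str, params: dict, req_id: int) -> str:
    """Serialize one JSON-RPC 2.0 request as a single line for a stdio transport."""
    msg = {"jsonrpc": "2.0", "id": req_id, "method": method, "params": params}
    return json.dumps(msg) + "\n"

# Hypothetical first message an IDE might write to the router's stdin.
line = jsonrpc_request("initialize", {"clientInfo": {"name": "my-ide"}}, 1)
print(line, end="")
```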
## Next Steps
- Installation Guide - Detailed installation instructions for all platforms
- Configuration Guide - Complete configuration reference
- API Reference - Full API documentation
- ACP Usage Guide - IDE and tool integration via JSON-RPC 2.0 stdio transport
- Library Usage - Embed the router in your Rust application as a library
- Load Balancing - Configure load balancing strategies
- Deployment Guide - Production deployment with Docker, Kubernetes, systemd
## Troubleshooting

### Common Issues

#### Router fails to start
Check that:
- The configuration file exists and is valid YAML
- Backend URLs are accessible
- No other service is using port 8080
#### No backends available
Verify that:
- Your backends are running and healthy
- The backend URLs in your configuration are correct
- Health checks can reach your backends
#### Connection refused errors
Ensure:
- Your firewall allows connections to the backends
- The backends are listening on the configured ports
- Network routes are properly configured
For more help, see the Error Handling guide or open an issue on GitHub.