# Production Deployment Guide
This guide covers various deployment strategies, configurations, and best practices for running Continuum Router in production environments.
## Table of Contents
- Deployment Options
- Docker Deployment
- Kubernetes Deployment
- Systemd Service
- Cloud Deployments
- High Availability Setup
- Performance Tuning
- Security Hardening
- Monitoring and Observability
- Backup and Recovery
- Troubleshooting
## Deployment Options
| Method | Best For | Pros | Cons |
|---|---|---|---|
| Docker | Single instance, development | Easy setup, portable | Single point of failure |
| Kubernetes | Large scale, auto-scaling | HA, auto-scaling, orchestration | Complex setup |
| Systemd | Bare metal, VMs | Direct control, simple | Manual scaling |
| Docker Swarm | Medium scale | Simple orchestration | Limited features |
| Cloud PaaS | Managed deployment | Low maintenance | Vendor lock-in |
## Docker Deployment
Continuum Router provides two Docker image options:
| Image | Base | Size | Use Case |
|---|---|---|---|
| `lablup/continuum-router:VERSION` | Debian Bookworm | ~50MB | General use, better compatibility |
| `lablup/continuum-router:VERSION-alpine` | Alpine 3.20 | ~10MB | Minimal size, Kubernetes |
### Quick Start with Docker Compose
The fastest way to get started is using Docker Compose:
```bash
# Create a configuration file
curl -fsSL https://raw.githubusercontent.com/lablup/continuum-router/main/config.yaml.example > config.yaml

# Edit config.yaml to add your backends and API keys

# Then start the router
docker compose up -d

# View logs
docker compose logs -f continuum-router
```
### Running with Docker
```bash
# Run with default configuration
docker run -d \
  --name continuum-router \
  -p 8080:8080 \
  -v "$(pwd)/config.yaml:/etc/continuum-router/config.yaml:ro" \
  lablup/continuum-router:latest

# Run the Alpine variant for a smaller image
docker run -d \
  --name continuum-router \
  -p 8080:8080 \
  -v "$(pwd)/config.yaml:/etc/continuum-router/config.yaml:ro" \
  lablup/continuum-router:latest-alpine

# Run with a custom log level
docker run -d \
  --name continuum-router \
  -p 8080:8080 \
  -e RUST_LOG=debug \
  -v "$(pwd)/config.yaml:/etc/continuum-router/config.yaml:ro" \
  lablup/continuum-router:latest
```
### Building Custom Images
Two Dockerfiles are provided in the repository:
- `Dockerfile` - Debian-based image using pre-built binaries
- `Dockerfile.alpine` - Alpine-based image using musl binaries
#### Build from Pre-built Binaries (Recommended)
This method downloads pre-built binaries from GitHub Releases:
```bash
# Build the Debian-based image
docker build --build-arg VERSION=0.21.0 -t continuum-router:0.21.0 .

# Build the Alpine-based image
docker build -f Dockerfile.alpine --build-arg VERSION=0.21.0 -t continuum-router:0.21.0-alpine .

# Multi-platform build with buildx
docker buildx build --platform linux/amd64,linux/arm64 \
  --build-arg VERSION=0.21.0 \
  -t lablup/continuum-router:0.21.0 \
  --push .
```
#### Build from Source
For development or customization, use the multi-stage build:
```dockerfile
# Multi-stage build for optimal size
FROM rust:1.75-slim AS builder

# Install build dependencies
RUN apt-get update && apt-get install -y \
    pkg-config \
    libssl-dev \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Cache dependencies
COPY Cargo.toml Cargo.lock ./
RUN mkdir src && echo "fn main() {}" > src/main.rs
RUN cargo build --release && rm -rf src

# Build application
COPY . .
RUN touch src/main.rs && cargo build --release

# Runtime image
FROM debian:bookworm-slim

# Install runtime dependencies
RUN apt-get update && apt-get install -y \
    ca-certificates \
    && rm -rf /var/lib/apt/lists/*

# Create non-root user
RUN useradd -r -s /bin/false continuum

# Copy binary
COPY --from=builder /app/target/release/continuum-router /usr/local/bin/
RUN chmod +x /usr/local/bin/continuum-router

# Set up directories
RUN mkdir -p /etc/continuum-router /var/log/continuum-router
RUN chown -R continuum:continuum /etc/continuum-router /var/log/continuum-router

USER continuum

EXPOSE 8080

HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD ["/usr/local/bin/continuum-router", "--health-check"]

ENTRYPOINT ["continuum-router"]
CMD ["--config", "/etc/continuum-router/config.yaml"]
```
### Health Checks
Continuum Router includes a built-in health check command for container orchestration:
```bash
# Check health from within the container
continuum-router --health-check

# Check health with a custom URL
continuum-router --health-check --health-check-url http://localhost:8080/health
```
The health check:
- Returns exit code 0 if the server is healthy
- Returns exit code 1 if the server is unreachable or unhealthy
- Has a 5-second timeout by default
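These exit codes make it straightforward to gate deployment steps on router health. A minimal sketch, assuming the container name used in the examples above, that blocks until the router reports healthy:

```bash
#!/bin/bash
# Wait until the router answers its health check (or give up after ~60s)
for i in $(seq 1 30); do
  if docker exec continuum-router continuum-router --health-check; then
    echo "router is healthy"
    exit 0
  fi
  sleep 2
done
echo "router failed to become healthy" >&2
exit 1
```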
### Docker Compose Production Setup

```yaml
services:
  continuum-router:
    image: lablup/continuum-router:latest
    container_name: continuum-router
    restart: always
    ports:
      - "8080:8080"
    volumes:
      - ./config.yaml:/etc/continuum-router/config.yaml:ro
      - ./logs:/var/log/continuum-router
    environment:
      - RUST_LOG=info
      - RUST_BACKTRACE=1
    networks:
      - llm-network
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 512M
        reservations:
          cpus: '1'
          memory: 256M
    healthcheck:
      test: ["CMD", "continuum-router", "--health-check"]
      interval: 30s
      timeout: 3s
      retries: 3
      start_period: 5s
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

  # Example backend services
  ollama:
    image: ollama/ollama:latest
    volumes:
      - ollama_data:/root/.ollama
    networks:
      - llm-network
    deploy:
      resources:
        limits:
          cpus: '4'
          memory: 8G

networks:
  llm-network:
    driver: bridge

volumes:
  ollama_data:
```
### Docker Swarm Deployment

```yaml
services:
  continuum-router:
    image: lablup/continuum-router:latest
    deploy:
      replicas: 3
      update_config:
        parallelism: 1
        delay: 10s
        failure_action: rollback
      restart_policy:
        condition: any
        delay: 5s
        max_attempts: 3
      placement:
        constraints:
          - node.role == worker
    ports:
      - "8080:8080"
    configs:
      - source: router_config
        target: /etc/continuum-router/config.yaml
    networks:
      - llm-overlay
    healthcheck:
      test: ["CMD", "continuum-router", "--health-check"]
      interval: 30s
      timeout: 3s
      retries: 3
      start_period: 5s

configs:
  router_config:
    external: true

networks:
  llm-overlay:
    driver: overlay
    attachable: true
```
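The stack references an external config, which must exist before deployment. A sketch of creating it and rolling out the stack (the stack and file names below are illustrative):

```bash
# Create the external config from a local file, then deploy the stack
docker config create router_config ./config.yaml
docker stack deploy -c docker-stack.yaml llm

# Verify replica placement and health
docker service ps llm_continuum-router
```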
## Kubernetes Deployment

### Complete Kubernetes Manifests
```yaml
# namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: continuum-router
---
# configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: continuum-router-config
  namespace: continuum-router
data:
  config.yaml: |
    server:
      bind_address: "0.0.0.0:8080"
      workers: 4
    backends:
      - url: http://ollama-service:11434
        name: ollama
        weight: 1
    health_checks:
      enabled: true
      interval: 30s
      timeout: 10s
---
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: continuum-router
  namespace: continuum-router
  labels:
    app: continuum-router
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: continuum-router
  template:
    metadata:
      labels:
        app: continuum-router
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - continuum-router
                topologyKey: kubernetes.io/hostname
      containers:
        - name: continuum-router
          image: ghcr.io/lablup/continuum-router:latest
          imagePullPolicy: Always
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
          env:
            - name: RUST_LOG
              value: "info"
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          volumeMounts:
            - name: config
              mountPath: /etc/continuum-router
              readOnly: true
          resources:
            requests:
              memory: "256Mi"
              cpu: "200m"
            limits:
              memory: "512Mi"
              cpu: "1000m"
          livenessProbe:
            httpGet:
              path: /health
              port: http
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /health
              port: http
            initialDelaySeconds: 10
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 3
          startupProbe:
            httpGet:
              path: /health
              port: http
            initialDelaySeconds: 0
            periodSeconds: 10
            timeoutSeconds: 3
            failureThreshold: 30
      volumes:
        - name: config
          configMap:
            name: continuum-router-config
---
# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: continuum-router
  namespace: continuum-router
  labels:
    app: continuum-router
spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: http
      protocol: TCP
      name: http
  selector:
    app: continuum-router
---
# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: continuum-router
  namespace: continuum-router
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: continuum-router
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
---
# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: continuum-router
  namespace: continuum-router
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: "100m"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "600"
spec:
  ingressClassName: nginx
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: continuum-router
                port:
                  number: 80
  tls:
    - hosts:
        - api.example.com
      secretName: continuum-router-tls
```
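Applying the manifests and verifying the rollout might look like this (file names match the comments above):

```bash
# Apply resources in dependency order
kubectl apply -f namespace.yaml
kubectl apply -f configmap.yaml
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
kubectl apply -f hpa.yaml
kubectl apply -f ingress.yaml

# Wait for the rollout to complete and check pod placement
kubectl -n continuum-router rollout status deployment/continuum-router
kubectl -n continuum-router get pods -o wide
```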
### Helm Chart
```yaml
# values.yaml
replicaCount: 3

image:
  repository: ghcr.io/lablup/continuum-router
  pullPolicy: IfNotPresent
  tag: "latest"

service:
  type: LoadBalancer
  port: 80
  targetPort: 8080

ingress:
  enabled: true
  className: nginx
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
  hosts:
    - host: api.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: continuum-router-tls
      hosts:
        - api.example.com

resources:
  limits:
    cpu: 1000m
    memory: 512Mi
  requests:
    cpu: 200m
    memory: 256Mi

autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
  targetMemoryUtilizationPercentage: 80

config:
  backends:
    - url: http://ollama:11434
      name: ollama
  health_checks:
    enabled: true
    interval: 30s
```
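Installation then follows the usual Helm pattern; the chart path below is illustrative and assumes a local checkout of the chart:

```bash
# Install or upgrade the release with the values above
helm upgrade --install continuum-router ./charts/continuum-router \
  --namespace continuum-router --create-namespace \
  -f values.yaml

# Inspect the deployed release
helm status continuum-router -n continuum-router
```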
## Systemd Service

### Installation Script
```bash
#!/bin/bash
# install-systemd.sh

# Create user
sudo useradd -r -s /bin/false continuum

# Create directories
sudo mkdir -p /etc/continuum-router
sudo mkdir -p /var/log/continuum-router
sudo mkdir -p /opt/continuum-router

# Copy binary
sudo cp continuum-router /usr/local/bin/
sudo chmod +x /usr/local/bin/continuum-router

# Copy configuration
sudo cp config.yaml /etc/continuum-router/

# Set permissions
sudo chown -R continuum:continuum /etc/continuum-router
sudo chown -R continuum:continuum /var/log/continuum-router

# Install service file
sudo cp continuum-router.service /etc/systemd/system/

# Enable and start service
sudo systemctl daemon-reload
sudo systemctl enable continuum-router
sudo systemctl start continuum-router
```
### Service File
```ini
# /etc/systemd/system/continuum-router.service
[Unit]
Description=Continuum Router - LLM API Router
Documentation=https://github.com/lablup/backend.ai-continuum
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=continuum
Group=continuum
WorkingDirectory=/opt/continuum-router

# Service execution
ExecStart=/usr/local/bin/continuum-router --config /etc/continuum-router/config.yaml
ExecReload=/bin/kill -USR1 $MAINPID

# Restart configuration
Restart=always
RestartSec=10
TimeoutStopSec=30

# Security hardening
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/var/log/continuum-router
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectControlGroups=true
RestrictRealtime=true
RestrictNamespaces=true
RestrictSUIDSGID=true
PrivateDevices=true
SystemCallFilter=@system-service

# Resource limits
LimitNOFILE=65536
LimitNPROC=4096

# Environment
Environment="RUST_LOG=continuum_router=info"
Environment="RUST_BACKTRACE=1"

[Install]
WantedBy=multi-user.target
```
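After installation, verify that the unit runs and that the hardening directives hold:

```bash
# Confirm the service is active and follow its logs
sudo systemctl status continuum-router
sudo journalctl -u continuum-router -f

# systemd can score the unit's sandboxing (lower exposure is better)
systemd-analyze security continuum-router
```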
## Cloud Deployments

### AWS ECS
```json
{
  "family": "continuum-router",
  "taskRoleArn": "arn:aws:iam::ACCOUNT_ID:role/ecsTaskRole",
  "executionRoleArn": "arn:aws:iam::ACCOUNT_ID:role/ecsTaskExecutionRole",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "1024",
  "memory": "2048",
  "containerDefinitions": [
    {
      "name": "continuum-router",
      "image": "ghcr.io/lablup/continuum-router:latest",
      "portMappings": [
        {
          "containerPort": 8080,
          "protocol": "tcp"
        }
      ],
      "essential": true,
      "environment": [
        {
          "name": "RUST_LOG",
          "value": "info"
        }
      ],
      "mountPoints": [
        {
          "sourceVolume": "config",
          "containerPath": "/etc/continuum-router"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/continuum-router",
          "awslogs-region": "us-west-2",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3,
        "startPeriod": 60
      }
    }
  ],
  "volumes": [
    {
      "name": "config",
      "efsVolumeConfiguration": {
        "fileSystemId": "fs-12345678",
        "rootDirectory": "/config"
      }
    }
  ]
}
```
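Registering the task definition and creating a Fargate service might look like the following sketch (the cluster name, subnet, and security group are placeholders for your own VPC setup):

```bash
# Register the task definition from the JSON above
aws ecs register-task-definition --cli-input-json file://task-definition.json

# Create a Fargate service in your VPC
aws ecs create-service \
  --cluster llm-cluster \
  --service-name continuum-router \
  --task-definition continuum-router \
  --desired-count 2 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-abc123],securityGroups=[sg-abc123],assignPublicIp=DISABLED}"
```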
### Google Cloud Run
```yaml
# service.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: continuum-router
  annotations:
    run.googleapis.com/ingress: all
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "1"
        autoscaling.knative.dev/maxScale: "100"
    spec:
      containerConcurrency: 1000
      timeoutSeconds: 300
      containers:
        - image: gcr.io/PROJECT_ID/continuum-router:latest
          ports:
            - containerPort: 8080
          env:
            - name: RUST_LOG
              value: info
          resources:
            limits:
              cpu: "2"
              memory: 2Gi
          livenessProbe:
            httpGet:
              path: /health
            initialDelaySeconds: 10
            periodSeconds: 10
```
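The Knative manifest can be applied as-is with `gcloud run services replace`; the project and region below are placeholders:

```bash
# Apply the service manifest to Cloud Run
gcloud run services replace service.yaml \
  --project PROJECT_ID \
  --region us-central1

# Fetch the service URL once deployed
gcloud run services describe continuum-router \
  --project PROJECT_ID --region us-central1 \
  --format 'value(status.url)'
```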
### Azure Container Instances
```json
{
  "location": "eastus",
  "properties": {
    "containers": [
      {
        "name": "continuum-router",
        "properties": {
          "image": "ghcr.io/lablup/continuum-router:latest",
          "ports": [
            {
              "port": 8080,
              "protocol": "TCP"
            }
          ],
          "resources": {
            "requests": {
              "cpu": 1.0,
              "memoryInGB": 1.5
            }
          },
          "environmentVariables": [
            {
              "name": "RUST_LOG",
              "value": "info"
            }
          ],
          "livenessProbe": {
            "httpGet": {
              "path": "/health",
              "port": 8080
            },
            "initialDelaySeconds": 30,
            "periodSeconds": 10
          }
        }
      }
    ],
    "osType": "Linux",
    "ipAddress": {
      "type": "Public",
      "ports": [
        {
          "port": 8080,
          "protocol": "TCP"
        }
      ]
    }
  }
}
```

Note that ACI does not remap ports, so the public IP must expose the same port the container listens on (8080 here).
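The same container group can be created imperatively with the Azure CLI; a sketch mirroring the JSON above (the resource group name is a placeholder):

```bash
az container create \
  --resource-group my-rg \
  --name continuum-router \
  --image ghcr.io/lablup/continuum-router:latest \
  --os-type Linux \
  --cpu 1 --memory 1.5 \
  --ports 8080 \
  --ip-address Public \
  --environment-variables RUST_LOG=info

# Check state and fetch logs
az container show -g my-rg -n continuum-router --query instanceView.state
az container logs -g my-rg -n continuum-router
```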
## High Availability Setup

### Multi-Region Deployment
```yaml
# Global Load Balancer Configuration
regions:
  - name: us-west
    endpoints:
      - https://us-west-1.api.example.com
      - https://us-west-2.api.example.com
    weight: 33
  - name: eu-central
    endpoints:
      - https://eu-central-1.api.example.com
      - https://eu-central-2.api.example.com
    weight: 33
  - name: asia-pacific
    endpoints:
      - https://ap-southeast-1.api.example.com
      - https://ap-southeast-2.api.example.com
    weight: 34

health_check:
  path: /health
  interval: 10s
  timeout: 5s
  healthy_threshold: 2
  unhealthy_threshold: 3

failover:
  primary: us-west
  secondary: eu-central
  tertiary: asia-pacific
```
### Database Replication
```yaml
# PostgreSQL HA Configuration
postgresql:
  primary:
    host: primary.db.example.com
    port: 5432
  replicas:
    - host: replica1.db.example.com
      port: 5432
    - host: replica2.db.example.com
      port: 5432
  pooling:
    max_connections: 100
    connection_timeout: 10s
  failover:
    automatic: true
    promote_timeout: 30s
```
## Performance Tuning

### High-Load Configuration
```yaml
# config-highload.yaml
server:
  bind_address: "0.0.0.0:8080"
  workers: 16                  # 2x CPU cores
  connection_pool_size: 1000   # Large pool for many backends
  keepalive_timeout: 75s       # Match ALB timeout

request:
  timeout: "30s"               # Lower timeout for responsiveness
  max_retries: 1               # Minimal retries
  buffer_size: 65536           # 64KB buffer

health_checks:
  enabled: true
  interval: "60s"              # Less frequent under load
  timeout: "10s"
  parallel: true               # Parallel health checks

cache:
  model_cache_ttl: "900s"      # 15 min cache
  enable_deduplication: true
  max_entries: 10000

rate_limiting:
  enabled: true
  requests_per_minute: 1000
  burst_size: 100

logging:
  level: "warn"
  format: "json"
  buffer_size: 8192
```
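Since the worker count should track the host's core count, it can be derived at deploy time rather than hard-coded. A small sketch, assuming the `workers:` key sits in config.yaml as above:

```bash
# Set workers to 2x the CPU count before starting the router
WORKERS=$(( $(nproc) * 2 ))
sed -i -E "s/^([[:space:]]*workers:).*/\1 ${WORKERS}/" config.yaml
```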
### Memory-Optimized Configuration
```yaml
# config-memory.yaml
server:
  connection_pool_size: 25   # Minimal connections

cache:
  model_cache_ttl: "60s"     # Short TTL
  max_entries: 100           # Limited cache size

request:
  buffer_size: 8192          # 8KB buffer

logging:
  level: "error"
  buffer_size: 1024
```
### CPU-Optimized Configuration
```yaml
# config-cpu.yaml
server:
  workers: 32                # Maximize parallelism

threading:
  tokio_worker_threads: 16
  blocking_threads: 8

request:
  parallel_backend_queries: true
  selection_strategy: LeastLatency   # CPU-efficient routing
```
## Security Hardening

### Network Security
```yaml
# firewall-rules.yaml
ingress:
  - protocol: tcp
    port: 8080
    source: 10.0.0.0/8        # Internal only
  - protocol: tcp
    port: 443
    source: 0.0.0.0/0         # HTTPS from anywhere

egress:
  - protocol: tcp
    port: 443
    destination: 0.0.0.0/0    # HTTPS to anywhere
  - protocol: tcp
    port: 11434
    destination: 10.0.0.0/8   # Backend communication
```
### TLS Configuration
```yaml
# tls-config.yaml
tls:
  enabled: true
  cert_file: /etc/ssl/certs/server.crt
  key_file: /etc/ssl/private/server.key

  # TLS 1.2+ only
  min_version: "1.2"

  # Strong ciphers only
  ciphers:
    - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
    - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256

  client_auth:
    enabled: false
    ca_file: /etc/ssl/certs/ca.crt
```
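The handshake can be verified from a client with openssl; the host and port below are illustrative and assume TLS is terminated on the router's bind address:

```bash
# Confirm TLS 1.2 negotiates and inspect the presented certificate
openssl s_client -connect api.example.com:8443 -tls1_2 -brief </dev/null

# Print certificate validity dates
echo | openssl s_client -connect api.example.com:8443 2>/dev/null \
  | openssl x509 -noout -dates
```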
### Authentication
Continuum Router supports API key authentication with configurable enforcement modes.
#### Authentication Modes

| Mode | Description |
|---|---|
| `permissive` (default) | Requests without an API key are allowed. Backward compatible. |
| `blocking` | Only authenticated requests are processed. Recommended for production. |
#### Production Configuration
```yaml
# config.yaml - Production authentication setup
api_keys:
  # Enable blocking mode for mandatory authentication
  mode: blocking

  # Define API keys
  api_keys:
    - key: "${PROD_API_KEY}"   # Use environment variable
      id: "key-production-1"
      user_id: "prod-user"
      organization_id: "prod-org"
      scopes: [read, write, files]
      rate_limit: 1000
      enabled: true

  # Or load from external file for better security
  api_keys_file: "/etc/continuum-router/api-keys.yaml"
```
#### External Key File Format
```yaml
# /etc/continuum-router/api-keys.yaml
keys:
  - key: "sk-prod-xxxxxxxxxxxxx"
    id: "key-external-1"
    user_id: "service-account"
    organization_id: "production"
    scopes: [read, write, files]
    enabled: true
```
#### Making Authenticated Requests
```bash
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello"}]}'
```
#### Protected Endpoints (blocking mode)

- `/v1/chat/completions`
- `/v1/completions`
- `/v1/responses`
- `/v1/images/generations`
- `/v1/images/edits`
- `/v1/images/variations`
- `/v1/models`

Note: Health endpoints (`/health`, `/healthz`) are always accessible. Admin, Files, and Metrics endpoints have separate authentication.
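In blocking mode, a quick way to confirm enforcement is to hit a protected endpoint with and without credentials; the unauthenticated request should be rejected (commonly with a 401, though the exact status code depends on the router's error mapping):

```bash
# Should be rejected -- no Authorization header
curl -i http://localhost:8080/v1/models

# Should succeed with a valid key
curl -i -H "Authorization: Bearer sk-your-api-key" http://localhost:8080/v1/models
```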
#### Per-API-Key Rate Limiting
Each API key can have individual rate limits:
```yaml
api_keys:
  mode: blocking
  api_keys:
    - key: "${PREMIUM_KEY}"
      id: "premium-user"
      rate_limit: 5000   # 5000 requests per minute
      scopes: [read, write, files, admin]
    - key: "${STANDARD_KEY}"
      id: "standard-user"
      rate_limit: 100    # 100 requests per minute
      scopes: [read, write]
```
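A per-key limit can be sanity-checked by exceeding it and tallying response codes (rate-limited requests are typically answered with HTTP 429; confirm against the router's actual behavior):

```bash
# Fire 110 requests against a key limited to 100/minute and count the codes
for i in $(seq 1 110); do
  curl -s -o /dev/null -w "%{http_code}\n" \
    -H "Authorization: Bearer ${STANDARD_KEY}" \
    http://localhost:8080/v1/models
done | sort | uniq -c
```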
## Monitoring and Observability

### Prometheus Integration
```yaml
# prometheus-config.yaml
metrics:
  enabled: true
  endpoint: /metrics

  # Cardinality limits
  max_labels_per_metric: 10
  max_unique_label_values: 100

  # Custom metrics
  custom:
    - name: llm_request_duration
      type: histogram
      buckets: [0.1, 0.5, 1, 2, 5, 10, 30, 60]
    - name: backend_errors
      type: counter
      labels: [backend, error_type]
```
### Logging Configuration
```yaml
# logging-config.yaml
logging:
  level: info
  format: json
  outputs:
    - type: stdout
      level: info
    - type: file
      path: /var/log/continuum-router/app.log
      rotation:
        size: 100MB
        count: 10
        compress: true
    - type: syslog
      address: syslog.example.com:514
      facility: local0
  structured_fields:
    service: continuum-router
    environment: production
    version: ${VERSION}
```
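JSON-formatted logs pair well with jq for ad-hoc inspection. A sketch, assuming field names like `level`, `timestamp`, and `message` in the router's JSON output (adjust to the actual schema):

```bash
# Follow warnings and errors from the JSON log in a readable one-line format
tail -f /var/log/continuum-router/app.log \
  | jq -r 'select(.level == "ERROR" or .level == "WARN")
           | "\(.timestamp) \(.level) \(.message)"'
```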
### Tracing
```yaml
# tracing-config.yaml
tracing:
  enabled: true
  exporter:
    type: otlp
    endpoint: http://jaeger:4317
  sampling:
    rate: 0.1   # Sample 10% of requests
  propagation:
    - tracecontext
    - baggage
```
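For local verification of the OTLP exporter, an all-in-one Jaeger works well (recent `jaegertracing/all-in-one` images enable the OTLP receiver by default; older ones need `COLLECTOR_OTLP_ENABLED=true`):

```bash
# Run Jaeger with an OTLP gRPC receiver on 4317 and the UI on 16686
docker run -d --name jaeger \
  -e COLLECTOR_OTLP_ENABLED=true \
  -p 4317:4317 \
  -p 16686:16686 \
  jaegertracing/all-in-one:latest

# Traces are then browsable at http://localhost:16686
```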
## Backup and Recovery

### Configuration Backup
```bash
#!/bin/bash
# backup-config.sh

BACKUP_DIR="/backup/continuum-router"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)

# Create backup directory
mkdir -p "$BACKUP_DIR"

# Backup configuration
tar -czf "$BACKUP_DIR/config_$TIMESTAMP.tar.gz" \
  /etc/continuum-router/

# Backup logs
tar -czf "$BACKUP_DIR/logs_$TIMESTAMP.tar.gz" \
  /var/log/continuum-router/

# Keep only last 30 days
find "$BACKUP_DIR" -name "*.tar.gz" -mtime +30 -delete
```
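A matching restore sketch, assuming the systemd deployment described above: unpack the most recent configuration archive back into place, then restart the service.

```bash
# Restore the newest configuration backup and restart the router
LATEST=$(ls -t /backup/continuum-router/config_*.tar.gz | head -1)
sudo tar -xzf "$LATEST" -C /
sudo systemctl restart continuum-router
```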
### Disaster Recovery Plan

1. Regular Backups
   - Configuration: Daily
   - Logs: Weekly
   - Metrics: Monthly
2. Recovery Objectives
   - RTO: < 1 hour
   - RPO: < 24 hours
3. Recovery Procedures
## Troubleshooting

### Common Issues

#### High Memory Usage
```bash
# Check memory usage
ps aux | grep continuum-router

# Capture a core dump of the running process for offline analysis
gdb -p $(pidof continuum-router)
(gdb) gcore memory.dump
```
#### Connection Issues
```bash
# Check open connections
netstat -an | grep 8080

# Test backend connectivity
curl -I http://backend:11434/v1/models
```
#### Performance Degradation
```bash
# Enable debug logging for the systemd service. The unit does not inherit
# shell environment variables, so set it via a drop-in override:
sudo systemctl edit continuum-router
# Add under [Service]:
#   Environment="RUST_LOG=debug"
sudo systemctl restart continuum-router

# Monitor metrics
curl http://localhost:8080/metrics | grep latency
```