Web Clients

LA Router can run on a cloud server to provide AI routing for web applications. In this deployment, LA Router acts as a server-side proxy — web clients send requests to LA Router's API, and it routes them to the appropriate cloud or privately hosted models.

Architecture

In a web deployment, LA Router runs inside a Docker container on a cloud VPS (e.g., Google Cloud, DigitalOcean, AWS EC2, Hetzner). The web client reaches LA Router through an HTTPS reverse proxy, and LA Router forwards each request to the selected cloud or privately hosted model.

Docker Deployment

Docker Build Required

LA Router's default development setup uses bun run dev for local use. For server/Docker deployment, you need to build a production bundle:

# Run from the repository root; the subshells keep the working directory unchanged

# Build the backend
(cd backend && bun run build)

# Build the frontend dashboard
(cd frontend && bun run build)

A Dockerfile is provided for containerized deployment (see below).

Dockerfile

FROM oven/bun:1.1 AS builder

WORKDIR /app

# Install dependencies
COPY backend/package.json backend/bun.lock ./backend/
RUN cd backend && bun install --frozen-lockfile

COPY frontend/package.json frontend/bun.lock ./frontend/
RUN cd frontend && bun install --frozen-lockfile

# Build
COPY backend/ ./backend/
COPY frontend/ ./frontend/
RUN cd backend && bun run build
RUN cd frontend && bun run build

# Production image
FROM oven/bun:1.1-slim

WORKDIR /app
COPY --from=builder /app/backend/dist ./backend/dist
COPY --from=builder /app/backend/package.json ./backend/
COPY --from=builder /app/backend/node_modules ./backend/node_modules
COPY --from=builder /app/frontend/dist ./frontend/dist

# LA Router serves the dashboard from frontend/dist
ENV NODE_ENV=production
ENV PORT=18790
EXPOSE 18790

CMD ["bun", "run", "backend/dist/server.js"]

Docker Compose

version: "3.8"

services:
  larouter:
    build: .
    ports:
      - "18790:18790"
    environment:
      - GOOGLE_API_KEY=${GOOGLE_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - NODE_ENV=production
    volumes:
      - larouter-data:/app/data
    restart: unless-stopped

  # Optional: reverse proxy with automatic TLS
  caddy:
    image: caddy:2
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile
      - caddy-data:/data
    depends_on:
      - larouter

volumes:
  larouter-data:
  caddy-data:

Caddyfile

api.yourapp.com {
    reverse_proxy larouter:18790
}

Web Client Integration

The web client calls LA Router exactly like any OpenAI-compatible API:

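// lib/ai.ts: web client AI helper
const AI_BASE_URL = process.env.NEXT_PUBLIC_AI_URL || "https://api.yourapp.com";

// Chat message in the OpenAI-compatible format
export interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

export async function chat(messages: Message[], projectToken: string) {
  const response = await fetch(`${AI_BASE_URL}/v1/chat/completions`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // projectToken is the full project token, including its lr_ prefix
      Authorization: `Bearer ${projectToken}`,
    },
    body: JSON.stringify({
      model: "auto",
      messages,
      stream: true,
    }),
  });

  // Surface HTTP errors instead of handing back an error body as a stream
  if (!response.ok) {
    throw new Error(`LA Router request failed: HTTP ${response.status}`);
  }

  return response.body; // ReadableStream of server-sent events (SSE)
}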
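
The helper above returns a raw byte stream, so the caller still has to decode the server-sent events. The following is a minimal sketch of one way to do that, assuming LA Router emits the usual OpenAI-compatible stream format (`data: {json}` lines terminated by `data: [DONE]`). The names `parseSseBuffer` and `readChatStream` are hypothetical and not part of LA Router.

```typescript
// Only the fields this sketch reads from an OpenAI-compatible stream chunk.
interface ChatChunk {
  choices: { delta: { content?: string } }[];
}

interface SseParseResult {
  text: string;  // generated text found in the complete lines
  rest: string;  // trailing partial line to keep for the next read
  done: boolean; // true once the [DONE] sentinel was seen
}

// Hypothetical helper: parse every complete line in the buffer.
function parseSseBuffer(buffer: string): SseParseResult {
  const lines = buffer.split("\n");
  const rest = lines.pop() ?? ""; // last element is an incomplete line (or "")
  let text = "";
  let done = false;
  for (const rawLine of lines) {
    const line = rawLine.trim();
    if (!line.startsWith("data:")) continue; // skip blank lines and other SSE fields
    const payload = line.slice("data:".length).trim();
    if (payload === "[DONE]") {
      done = true;
      break;
    }
    const chunk = JSON.parse(payload) as ChatChunk;
    text += chunk.choices[0]?.delta?.content ?? "";
  }
  return { text, rest, done };
}

// Hypothetical helper: read the stream returned by chat() and report text as it arrives.
async function readChatStream(
  body: ReadableStream<Uint8Array>,
  onText: (text: string) => void,
): Promise<void> {
  const reader = body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const parsed = parseSseBuffer(buffer);
    buffer = parsed.rest;
    if (parsed.text) onText(parsed.text);
    if (parsed.done) {
      await reader.cancel(); // stop reading once the sentinel arrives
      break;
    }
  }
}
```

Keeping the parsing in a pure function makes it easy to test without a network connection. In a UI, `onText` would typically append each piece to the message being rendered.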

Cloud-Only Routing

When LA Router runs on a server without local GPU resources, it operates in cloud-only mode — all requests route to cloud LLM APIs. The routing intelligence still applies:

| Tier | Cloud Model | Use Case |
| --- | --- | --- |
| Heartbeat / Simple | Gemini Flash, GPT-4o-mini | Simple tasks, low cost |
| Moderate | Gemini Pro, Claude Sonnet | Business logic, analysis |
| Complex | GPT-4o, Claude Opus | Complex reasoning |
| Frontier | Gemini Ultra, Claude Opus | Maximum capability |

GPU-Equipped VPS

If your VPS has a GPU (e.g., AWS g5.xlarge, Lambda Labs, RunPod), LA Router can run local models on the server too — giving you the same local-first benefits in the cloud. Download GGUF models via the dashboard and LA Router will use them for lower-tier requests.

Multi-Tenant Web Setup

For SaaS applications serving multiple users or organizations, LA Router provides project-level isolation:

# Create project tokens for each tenant
curl -X POST http://localhost:18790/api/projects \
  -H "Content-Type: application/json" \
  -d '{"name": "Tenant A", "budget": 100.00}'
# Returns: { "token": "lr_tenant_a_xxx" }

curl -X POST http://localhost:18790/api/projects \
  -H "Content-Type: application/json" \
  -d '{"name": "Tenant B", "budget": 50.00}'
# Returns: { "token": "lr_tenant_b_xxx" }

Each tenant's usage is tracked separately with per-project budgets, rate limits, and routing preferences.
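
A SaaS backend will usually create these projects from code rather than by hand. The sketch below mirrors the curl calls in TypeScript. The endpoint and the `name` and `budget` fields come from the example above, and the `{ token }` response shape is taken from its `# Returns` comments; the function names are hypothetical, and any authentication the endpoint may require is not shown because the source does not describe it.

```typescript
// Request fields as used in the curl example.
interface NewProject {
  name: string;
  budget: number;
}

interface ProjectRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: build the POST request without sending it,
// so it can be inspected or unit-tested.
function buildCreateProjectRequest(baseUrl: string, project: NewProject): ProjectRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/api/projects`, // tolerate a trailing slash
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(project),
    },
  };
}

// Hypothetical helper: send the request and return the new project's token.
// Assumes the response body is { "token": "..." } as in the curl comments.
async function createProject(baseUrl: string, project: NewProject): Promise<string> {
  const { url, init } = buildCreateProjectRequest(baseUrl, project);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`Project creation failed: HTTP ${response.status}`);
  }
  const data = (await response.json()) as { token: string };
  return data.token;
}
```

For example, `await createProject("http://localhost:18790", { name: "Tenant A", budget: 100 })` would correspond to the first curl call. Running this on the server during tenant onboarding keeps project creation out of the browser.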

Security Considerations

| Concern | Solution |
| --- | --- |
| API key exposure | Keys stored server-side in .env, never sent to browser |
| Authentication | Project bearer tokens (lr_ prefix) validate each request |
| HTTPS | A reverse proxy terminates TLS: Caddy obtains certificates automatically, NGINX needs them configured |
| Rate limiting | Per-project rate limits prevent abuse |
| Budget caps | Hard spending limits per project prevent runaway costs |
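
Because requests can be rejected by per-project rate limits, a web client benefits from retrying politely rather than failing outright. The source does not say how LA Router signals a rate-limited request, so the sketch below assumes the common HTTP convention of status 429 with an optional `Retry-After` header in seconds. Both function names are hypothetical.

```typescript
// Hypothetical helper: how long to wait before the next attempt.
// Uses Retry-After (seconds) when it is a valid number, otherwise
// exponential backoff from baseMs, capped at maxMs.
function retryDelayMs(
  attempt: number,
  retryAfter: string | null,
  baseMs = 500,
  maxMs = 30_000,
): number {
  if (retryAfter !== null && retryAfter.trim() !== "") {
    const seconds = Number(retryAfter);
    if (Number.isFinite(seconds) && seconds >= 0) {
      return Math.min(seconds * 1000, maxMs);
    }
  }
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Hypothetical helper: retry only on 429 (assumed rate-limit status),
// returning the last response once maxAttempts is reached.
// The request body must be re-sendable (e.g., a JSON string, not a stream).
async function fetchWithRetry(
  url: string,
  init: RequestInit,
  maxAttempts = 3,
): Promise<Response> {
  for (let attempt = 0; ; attempt++) {
    const response = await fetch(url, init);
    if (response.status !== 429 || attempt + 1 >= maxAttempts) return response;
    const delay = retryDelayMs(attempt, response.headers.get("Retry-After"));
    await new Promise((resolve) => setTimeout(resolve, delay));
  }
}
```

Retrying makes sense for rate limits, which clear over time. A request rejected because a budget cap was reached will not succeed on retry, so the client should surface that error to the user instead.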