Web Clients
LA Router can run on a cloud server to provide AI routing for web applications. In this deployment, LA Router acts as a server-side proxy — web clients send requests to LA Router's API, and it routes them to the appropriate cloud or privately hosted models.
Architecture
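In a web deployment, LA Router runs on a cloud VPS (e.g., Google Cloud, DigitalOcean, AWS EC2, Hetzner) inside a Docker container. The web client communicates with LA Router through an HTTPS reverse proxy. A request travels from the browser to the reverse proxy, which terminates TLS and forwards it to the LA Router container on port 18790; LA Router then calls the selected cloud or privately hosted model.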
Docker Deployment
LA Router's default development setup uses bun run dev for local use. For server/Docker deployment, you need to build a production bundle:
# Run both builds from the repository root; the subshells keep the working directory unchanged
# Build the backend
(cd backend && bun run build)
# Build the frontend dashboard
(cd frontend && bun run build)
A Dockerfile is provided for containerized deployment (see below).
Dockerfile
FROM oven/bun:1.1 AS builder
WORKDIR /app
# Install dependencies
COPY backend/package.json backend/bun.lock ./backend/
RUN cd backend && bun install --frozen-lockfile
COPY frontend/package.json frontend/bun.lock ./frontend/
RUN cd frontend && bun install --frozen-lockfile
# Build
COPY backend/ ./backend/
COPY frontend/ ./frontend/
RUN cd backend && bun run build
RUN cd frontend && bun run build
# Production image
FROM oven/bun:1.1-slim
WORKDIR /app
COPY --from=builder /app/backend/dist ./backend/dist
COPY --from=builder /app/backend/package.json ./backend/
COPY --from=builder /app/backend/node_modules ./backend/node_modules
COPY --from=builder /app/frontend/dist ./frontend/dist
# LA Router serves the dashboard from frontend/dist
ENV NODE_ENV=production
ENV PORT=18790
EXPOSE 18790
CMD ["bun", "run", "backend/dist/server.js"]
Docker Compose
version: "3.8"
services:
  larouter:
    build: .
    ports:
      - "18790:18790"
    environment:
      - GOOGLE_API_KEY=${GOOGLE_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - NODE_ENV=production
    volumes:
      - larouter-data:/app/data
    restart: unless-stopped
  # Optional: reverse proxy with automatic TLS
  caddy:
    image: caddy:2
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile
      - caddy-data:/data
    depends_on:
      - larouter
volumes:
  larouter-data:
  caddy-data:
Caddyfile
api.yourapp.com {
    reverse_proxy larouter:18790
}
Web Client Integration
The web client calls LA Router exactly like any OpenAI-compatible API:
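// lib/ai.ts: web client AI helper
const AI_BASE_URL = process.env.NEXT_PUBLIC_AI_URL || "https://api.yourapp.com";

// Chat message in the OpenAI-compatible format
export interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

export async function chat(messages: Message[], projectToken: string) {
  const response = await fetch(`${AI_BASE_URL}/v1/chat/completions`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // Assumes projectToken is passed without its lr_ prefix
      Authorization: `Bearer lr_${projectToken}`,
    },
    body: JSON.stringify({
      model: "auto", // LA Router selects the model
      messages,
      stream: true,
    }),
  });
  // Surface HTTP errors instead of returning an error body as a stream
  if (!response.ok) {
    throw new Error(`LA Router request failed: HTTP ${response.status}`);
  }
  return response.body; // ReadableStream for SSE
}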
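The stream returned by chat still has to be decoded on the client. The following is a minimal sketch of one way to do that, assuming LA Router emits events in the OpenAI streaming format (lines of the form data: {json}, terminated by data: [DONE]). The helper name readDeltas is hypothetical and is not part of LA Router.

```typescript
// Hypothetical helper: yields text deltas from an OpenAI-style SSE stream.
// Assumption: each event is a line "data: {json}" and the stream ends
// with "data: [DONE]".
async function* readDeltas(
  body: ReadableStream<Uint8Array>,
): AsyncGenerator<string> {
  const reader = body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // Process complete lines; keep a trailing partial line for the next chunk.
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? "";

    for (const line of lines) {
      const trimmed = line.trim();
      if (!trimmed.startsWith("data:")) continue; // skip blank lines and comments
      const payload = trimmed.slice(5).trim();
      if (payload === "[DONE]") return;
      const delta = JSON.parse(payload).choices?.[0]?.delta?.content;
      if (typeof delta === "string") yield delta;
    }
  }
}
```

A caller could append each delta to the UI as it arrives, for example with for await (const text of readDeltas(stream)) { ... }. Buffering the trailing partial line matters because network chunks do not necessarily end on event boundaries.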
Cloud-Only Routing
When LA Router runs on a server without local GPU resources, it operates in cloud-only mode — all requests route to cloud LLM APIs. The routing intelligence still applies:
| Tier | Cloud Model | Use Case |
|---|---|---|
| Heartbeat / Simple | Gemini Flash, GPT-4o-mini | Simple tasks, low cost |
| Moderate | Gemini Pro, Claude Sonnet | Business logic, analysis |
| Complex | GPT-4o, Claude Opus | Complex reasoning |
| Frontier | Gemini Ultra, Claude Opus | Maximum capability |
If your VPS has a GPU (e.g., AWS g5.xlarge, Lambda Labs, RunPod), LA Router can run local models on the server too — giving you the same local-first benefits in the cloud. Download GGUF models via the dashboard and LA Router will use them for lower-tier requests.
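To make the tier table concrete, here is an illustrative sketch of a tier-to-model lookup. It is not LA Router's actual routing logic, which this page does not describe; the names Tier, CLOUD_MODELS, and pickModel are hypothetical, and the rule of taking the first candidate whose provider key is configured is an assumption chosen for simplicity.

```typescript
type Tier = "simple" | "moderate" | "complex" | "frontier";
type Provider = "google" | "openai" | "anthropic";

interface Candidate {
  model: string;
  provider: Provider;
}

// Candidate models per tier, copied from the table above.
const CLOUD_MODELS: Record<Tier, Candidate[]> = {
  simple: [
    { model: "Gemini Flash", provider: "google" },
    { model: "GPT-4o-mini", provider: "openai" },
  ],
  moderate: [
    { model: "Gemini Pro", provider: "google" },
    { model: "Claude Sonnet", provider: "anthropic" },
  ],
  complex: [
    { model: "GPT-4o", provider: "openai" },
    { model: "Claude Opus", provider: "anthropic" },
  ],
  frontier: [
    { model: "Gemini Ultra", provider: "google" },
    { model: "Claude Opus", provider: "anthropic" },
  ],
};

// Assumed selection rule: first candidate whose provider has an API key.
function pickModel(tier: Tier, available: Set<Provider>): string | undefined {
  return CLOUD_MODELS[tier].find((c) => available.has(c.provider))?.model;
}
```

With only OPENAI_API_KEY configured, this sketch would resolve the simple tier to GPT-4o-mini and return undefined for the moderate tier, which illustrates why setting more than one provider key gives the router more options.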
Multi-Tenant Web Setup
For SaaS applications serving multiple users or organizations, LA Router provides project-level isolation:
# Create project tokens for each tenant
curl -X POST http://localhost:18790/api/projects \
  -H "Content-Type: application/json" \
  -d '{"name": "Tenant A", "budget": 100.00}'
# Returns: { "token": "lr_tenant_a_xxx" }

curl -X POST http://localhost:18790/api/projects \
  -H "Content-Type: application/json" \
  -d '{"name": "Tenant B", "budget": 50.00}'
# Returns: { "token": "lr_tenant_b_xxx" }
Each tenant's usage is tracked separately with per-project budgets, rate limits, and routing preferences.
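When tenants sign up through your own backend, the same call can be made from server-side TypeScript. The sketch below mirrors the curl requests above and assumes the endpoint accepts a name and budget and returns a JSON object with a token field, as shown there. The createProject helper and its fetchImpl parameter are hypothetical and exist only to make the example testable without a running server.

```typescript
interface ProjectRequest {
  name: string;
  budget: number;
}

interface ProjectResponse {
  token: string;
}

// Creates a project and returns its token.
// fetchImpl defaults to the global fetch and can be replaced in tests.
async function createProject(
  baseUrl: string,
  project: ProjectRequest,
  fetchImpl: typeof fetch = fetch,
): Promise<string> {
  const res = await fetchImpl(`${baseUrl}/api/projects`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(project),
  });
  if (!res.ok) {
    throw new Error(`Project creation failed: HTTP ${res.status}`);
  }
  const data = (await res.json()) as ProjectResponse;
  return data.token;
}
```

For example, createProject("http://localhost:18790", { name: "Tenant A", budget: 100 }) would be called from the signup handler, and the returned token stored alongside the tenant record.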
Security Considerations
| Concern | Solution |
|---|---|
| API key exposure | Keys stored server-side in .env, never sent to browser |
| Authentication | Project bearer tokens (lr_ prefix) validate each request |
| HTTPS | Caddy or NGINX terminates TLS in front of LA Router; Caddy provisions certificates automatically |
| Rate limiting | Per-project rate limits prevent abuse |
| Budget caps | Hard spending limits per project prevent runaway costs |
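
Because requests can be rejected for an invalid token or an exceeded rate limit, a web client benefits from handling those cases explicitly. The sketch below assumes conventional HTTP semantics (401 or 403 for authentication failures, 429 for rate limiting, an optional Retry-After header in seconds); this page does not specify which status codes LA Router returns, so treat the mapping as an assumption to verify. The names classifyFailure and retryDelayMs are hypothetical.

```typescript
type FailureKind = "auth" | "rate_limited" | "other";

// Maps an HTTP status to a coarse failure category (assumed conventions).
function classifyFailure(status: number): FailureKind {
  if (status === 401 || status === 403) return "auth";
  if (status === 429) return "rate_limited";
  return "other";
}

// Delay before retrying a rate-limited request.
// Uses Retry-After when it is a number of seconds; otherwise falls back to
// exponential backoff starting at 500 ms and capped at 30 s.
// A Retry-After value in HTTP-date form is not parsed and uses the fallback.
function retryDelayMs(attempt: number, retryAfter: string | null): number {
  const seconds = retryAfter ? Number(retryAfter) : NaN;
  if (Number.isFinite(seconds) && seconds >= 0) return seconds * 1000;
  return Math.min(30_000, 500 * 2 ** attempt);
}
```

Only the rate_limited category is worth retrying automatically; an auth failure usually means the project token is wrong or revoked, and retrying will not help.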