How Echo Works

Technical architecture guide for Echo's proxy layer, authentication, and billing system

Echo is a proxy layer that handles authentication, metering, and billing for LLM applications. This guide explains the technical architecture and integration patterns.

System Overview & Architecture

Echo operates as a three-layer stack that sits between your application and LLM providers:

graph TB
    A[Your Application] --> B[Echo SDK]
    B --> C[Echo Proxy Layer]
    C --> D[OpenAI / Anthropic / Other LLM Providers]
    C --> E[Billing & Analytics]
    C --> F[User Management]

Core Components

Echo Control (echo-control)

  • Web dashboard for app management, user analytics, billing
  • PostgreSQL database with Prisma ORM
  • NextAuth sessions for web UI authentication

Echo Server (echo-server)

  • API proxy that intercepts LLM requests
  • Handles authentication validation and usage metering
  • Routes requests to appropriate LLM providers
  • Records transactions and calculates costs in real-time

SDK Ecosystem

  • echo-typescript-sdk: Universal client for all platforms
  • echo-react-sdk: OAuth2+PKCE for client-side LLM calls
  • echo-next-sdk: Server-side integration patterns

Request Flow

Every LLM request follows this path:

  1. Authentication: SDK includes API key or JWT token
  2. Validation: Echo proxy validates credentials and permissions
  3. Routing: Request forwarded to appropriate LLM provider
  4. Metering: Response tokens counted and costs calculated
  5. Billing: Transaction recorded with user/app attribution
  6. Response: LLM response returned to your application

Echo acts as a transparent proxy. Your application uses standard OpenAI SDK patterns but gains billing, analytics, and user management automatically.
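Because the proxy preserves the OpenAI API contract, pointing a stock OpenAI client at Echo's router is typically all the wiring a backend needs. A minimal sketch, assuming the `openai` npm package and the router base URL used elsewhere in this guide:

```typescript
// Standard OpenAI SDK, redirected through Echo's proxy.
// The base URL is Echo's router; the key is an Echo API key, not an OpenAI key.
import OpenAI from "openai";

const openai = new OpenAI({
  baseURL: "https://echo.router.merit.systems/v1",
  apiKey: process.env.ECHO_API_KEY,
});

// Identical call shape to talking to OpenAI directly; Echo handles
// auth validation, metering, and billing before forwarding.
const response = await openai.chat.completions.create({
  model: "gpt-4",
  messages: [{ role: "user", content: "Hello" }],
});
```

Everything downstream of the client constructor is unchanged application code, which is what makes the proxy "transparent".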

The Proxy Layer

Echo Server implements the middleware layer between your app and LLM providers. Here's how each component works:

Request Interception

All LLM requests route through Echo's proxy endpoints:

// Your app calls this
const response = await openai.chat.completions.create({
  model: "gpt-4",
  messages: [{ role: "user", content: "Hello" }]
});

// Echo intercepts at https://echo.router.merit.systems/v1/chat/completions
// Validates auth, meters usage, forwards to OpenAI, records billing

The proxy preserves OpenAI's API contract while injecting billing logic:

// echo-server/src/routes/chat/completions.ts
export async function handleChatCompletion(request: AuthenticatedRequest) {
  // 1. Extract and validate authentication
  const { user, echoApp } = request.auth;
  
  // 2. Forward to LLM provider
  const llmResponse = await provider.chat.completions.create(request.body);
  
  // 3. Calculate costs from token usage
  const cost = calculateTokenCost(llmResponse.usage, request.body.model);
  
  // 4. Record transaction
  await recordTransaction({
    userId: user.id,
    echoAppId: echoApp.id,
    cost,
    tokens: llmResponse.usage
  });
  
  return llmResponse;
}
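The `calculateTokenCost` step above can be sketched as a pure lookup against per-model rates. The rates and shapes below are placeholders for illustration, not Echo's real pricing:

```typescript
// Token usage shape, matching what LLM providers return in `usage`.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
}

// Hypothetical per-million-token rates (USD); Echo's real rates live
// in its own pricing tables.
const PRICING: Record<string, { input: number; output: number }> = {
  "gpt-4": { input: 30, output: 60 }, // placeholder numbers
};

function calculateTokenCost(usage: Usage, model: string): number {
  const rates = PRICING[model];
  if (!rates) throw new Error(`No pricing configured for model: ${model}`);
  // Input and output tokens are billed at different rates.
  return (
    (usage.prompt_tokens * rates.input +
      usage.completion_tokens * rates.output) /
    1_000_000
  );
}
```

Keeping this a pure function of `(usage, model)` is what lets the proxy compute costs inline on every response without extra round trips.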

Authentication & Security Architecture

Echo supports two distinct authentication patterns optimized for different use cases:

API Key Authentication (Backend/Server-Side)

Traditional server-side pattern for backend applications:

// Your server code
import { EchoClient } from '@merit-systems/echo-typescript-sdk';

const client = new EchoClient({
  apiKey: process.env.ECHO_API_KEY // Scoped to specific Echo app
});

const response = await client.models.chatCompletion({
  model: "gpt-4",
  messages: [{ role: "user", content: "Hello" }]
});

API key validation flow:

  1. Extract key from Authorization: Bearer <key> header
  2. Hash and lookup in database with app association
  3. Validate key is active and app is not archived
  4. Attach user and app context to request
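The four validation steps above might look like the following sketch. The record shape and in-memory store are stand-ins for echo-control's Prisma models and PostgreSQL lookup:

```typescript
import { createHash } from "node:crypto";

// Hypothetical record shape; the real schema lives in echo-control's Prisma models.
interface ApiKeyRecord {
  userId: string;
  echoAppId: string;
  isActive: boolean;
  appArchived: boolean;
}

// Stand-in for the database: keyed by the hash of the raw API key,
// so raw keys are never stored.
const keyStore = new Map<string, ApiKeyRecord>();

function hashKey(rawKey: string): string {
  return createHash("sha256").update(rawKey).digest("hex");
}

function authenticate(authorizationHeader: string): ApiKeyRecord {
  // 1. Extract key from `Authorization: Bearer <key>`
  const match = /^Bearer (.+)$/.exec(authorizationHeader);
  if (!match) throw new Error("Missing bearer token");

  // 2. Hash and look up with app association
  const record = keyStore.get(hashKey(match[1]));
  if (!record) throw new Error("Unknown API key");

  // 3. Validate key is active and app is not archived
  if (!record.isActive || record.appArchived) {
    throw new Error("Key inactive or app archived");
  }

  // 4. Return user/app context to attach to the request
  return record;
}
```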

OAuth2 + PKCE Authentication (Frontend/Client-Side)

This pattern enables secure LLM calls directly from the browser:

// React component - no API keys needed
import { useEchoOpenAI } from '@merit-systems/echo-react-sdk';

function ChatComponent() {
  const { openai, isReady } = useEchoOpenAI();

  // Runs in the browser with user-scoped JWT tokens
  async function sendMessage() {
    if (!isReady) return;
    const response = await openai.chat.completions.create({
      model: "gpt-4",
      messages: [{ role: "user", content: "Hello" }]
    });
  }
}

The OAuth2 + PKCE flow eliminates API key exposure:

  1. Authorization Request: User redirects to Echo OAuth endpoint
  2. User Consent: User authorizes your app to access their Echo balance
  3. Code Exchange: Your app exchanges authorization code for JWT tokens
  4. Token Usage: Short-lived JWTs authenticate LLM requests
  5. Automatic Refresh: SDK handles token renewal transparently
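Steps 1 and 3 hinge on the PKCE verifier/challenge pair from RFC 7636, which the React SDK generates for you. A minimal sketch of how such a pair is derived (the helper names here are illustrative, not the SDK's API):

```typescript
import { createHash, randomBytes } from "node:crypto";

// Base64url encoding without padding, as RFC 7636 requires.
function base64url(buf: Buffer): string {
  return buf
    .toString("base64")
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, "");
}

function createPkcePair() {
  // The verifier stays on the client and is sent only during code exchange.
  const verifier = base64url(randomBytes(32));
  // The SHA-256 challenge goes in the authorization request (method S256),
  // so an intercepted authorization code is useless without the verifier.
  const challenge = base64url(createHash("sha256").update(verifier).digest());
  return { verifier, challenge };
}
```

Because the server re-derives the challenge from the verifier at code exchange, no long-lived secret ever ships to the browser.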

This dual authentication model supports both traditional backend patterns and modern frontend architectures while maintaining security and proper billing attribution.