How Echo Works
Technical architecture guide for Echo's proxy layer, authentication, and billing system
Echo is a proxy layer that handles authentication, metering, and billing for LLM applications. This guide explains the technical architecture and integration patterns.
System Overview & Architecture
Echo operates as a three-layer stack that sits between your application and LLM providers:
graph TB
    A[Your Application] --> B[Echo SDK]
    B --> C[Echo Proxy Layer]
    C --> D["LLM Providers (OpenAI, Anthropic, ...)"]
    C --> E[Billing & Analytics]
    C --> F[User Management]
Core Components
Echo Control (echo-control)
- Web dashboard for app management, user analytics, billing
- PostgreSQL database with Prisma ORM
- NextAuth sessions for web UI authentication
Echo Server (echo-server)
- API proxy that intercepts LLM requests
- Handles authentication validation and usage metering
- Routes requests to appropriate LLM providers
- Records transactions and calculates costs in real-time
SDK Ecosystem
- echo-typescript-sdk: Universal client for all platforms
- echo-react-sdk: OAuth2 + PKCE for client-side LLM calls
- echo-next-sdk: Server-side integration patterns
Request Flow
Every LLM request follows this path:
- Authentication: SDK includes API key or JWT token
- Validation: Echo proxy validates credentials and permissions
- Routing: Request forwarded to appropriate LLM provider
- Metering: Response tokens counted and costs calculated
- Billing: Transaction recorded with user/app attribution
- Response: LLM response returned to your application
Echo acts as a transparent proxy. Your application uses standard OpenAI SDK patterns but gains billing, analytics, and user management automatically.
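Concretely, each step above happens around a single HTTPS call to Echo's router. Here is a sketch of roughly what the SDK sends on your behalf; the endpoint is the one shown in the proxy section below, and the credential plumbing is simplified:

// What a metered LLM call looks like on the wire (sketch)
declare const credential: string; // Echo API key (backend) or user-scoped JWT (frontend)

const res = await fetch("https://echo.router.merit.systems/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${credential}`, // step 1: authentication
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: "gpt-4",
    messages: [{ role: "user", content: "Hello" }]
  })
});
// Echo validates, routes, meters, and bills (steps 2-5), then returns the
// provider's response body unchanged (step 6)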
The Proxy Layer
Echo Server provides the middleware layer between your app and LLM providers. Here's how each component works:
Request Interception
All LLM requests route through Echo's proxy endpoints:
// Your app calls this
const response = await openai.chat.completions.create({
  model: "gpt-4",
  messages: [{ role: "user", content: "Hello" }]
});
// Echo intercepts at https://echo.router.merit.systems/v1/chat/completions
// Validates auth, meters usage, forwards to OpenAI, records billing
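For that interception to happen, the standard OpenAI client is simply pointed at Echo's router instead of api.openai.com. A minimal sketch, assuming the official openai package and that the router accepts your Echo API key as the bearer credential (see the authentication section below); the Echo SDKs handle this wiring for you:

import OpenAI from "openai";

const openai = new OpenAI({
  baseURL: "https://echo.router.merit.systems/v1", // Echo's proxy endpoint
  apiKey: process.env.ECHO_API_KEY! // an Echo key, not an OpenAI key
});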
The proxy preserves OpenAI's API contract while injecting billing logic:
// echo-server/src/routes/chat/completions.ts
export async function handleChatCompletion(request: AuthenticatedRequest) {
  // 1. Extract and validate authentication
  const { user, echoApp } = request.auth;

  // 2. Forward to LLM provider
  const llmResponse = await provider.chat.completions.create(request.body);

  // 3. Calculate costs from token usage
  const cost = calculateTokenCost(llmResponse.usage, request.body.model);

  // 4. Record transaction
  await recordTransaction({
    userId: user.id,
    echoAppId: echoApp.id,
    cost,
    tokens: llmResponse.usage
  });

  return llmResponse;
}
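The cost step is straightforward token arithmetic. A minimal sketch of what a helper like calculateTokenCost could look like; the price table is illustrative only, not Echo's actual rates:

// Illustrative per-million-token prices in USD (not Echo's actual rates)
const PRICES: Record<string, { input: number; output: number }> = {
  "gpt-4": { input: 30, output: 60 }
};

interface Usage { prompt_tokens: number; completion_tokens: number }

function calculateTokenCost(usage: Usage, model: string): number {
  const price = PRICES[model];
  if (!price) throw new Error(`No pricing configured for model: ${model}`);
  return (
    (usage.prompt_tokens / 1_000_000) * price.input +
    (usage.completion_tokens / 1_000_000) * price.output
  );
}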
Authentication & Security Architecture
Echo supports two distinct authentication patterns optimized for different use cases:
API Key Authentication (Backend/Server-Side)
Traditional server-side pattern for backend applications:
// Your server code
import { EchoClient } from '@merit-systems/echo-typescript-sdk';
const client = new EchoClient({
  apiKey: process.env.ECHO_API_KEY // Scoped to specific Echo app
});

const response = await client.models.chatCompletion({
  model: "gpt-4",
  messages: [{ role: "user", content: "Hello" }]
});
API key validation flow:
- Extract key from the Authorization: Bearer <key> header
- Hash and look up the key in the database with its app association
- Validate key is active and app is not archived
- Attach user and app context to request
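Sketched as middleware (the helper names and Prisma model fields here are assumptions for illustration, not echo-server's actual schema):

import { createHash } from "node:crypto";

declare const db: any; // assumed Prisma client (Echo Control uses Prisma, per Core Components)

async function authenticateApiKey(authorization: string | undefined) {
  // 1. Extract key from the Authorization header
  const key = authorization?.match(/^Bearer (.+)$/)?.[1];
  if (!key) throw new Error("Missing API key");

  // 2. Hash and look up the key with its app association
  const keyHash = createHash("sha256").update(key).digest("hex");
  const record = await db.apiKey.findUnique({
    where: { keyHash },
    include: { user: true, echoApp: true }
  });

  // 3. Validate the key is active and the app is not archived
  if (!record || !record.isActive || record.echoApp.isArchived) {
    throw new Error("Invalid or revoked API key");
  }

  // 4. Attach user and app context to the request
  return { user: record.user, echoApp: record.echoApp };
}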
OAuth2 + PKCE Authentication (Frontend/Client-Side)
Pattern that enables secure LLM calls directly from browsers:
// React component - no API keys needed
import { useEchoOpenAI } from '@merit-systems/echo-react-sdk';
function ChatComponent() {
  const { openai, isReady } = useEchoOpenAI();

  // This runs in the browser with user-scoped JWT tokens; call the LLM from an
  // event handler once the hook reports it is ready
  async function sendMessage() {
    if (!isReady) return;
    const response = await openai.chat.completions.create({
      model: "gpt-4",
      messages: [{ role: "user", content: "Hello" }]
    });
    // render the response in your UI
  }
}
OAuth2 + PKCE flow eliminates API key exposure:
- Authorization Request: User redirects to Echo OAuth endpoint
- User Consent: User authorizes your app to access their Echo balance
- Code Exchange: Your app exchanges authorization code for JWT tokens
- Token Usage: Short-lived JWTs authenticate LLM requests
- Automatic Refresh: SDK handles token renewal transparently
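Under the hood, steps 1-4 rest on standard PKCE mechanics, which the React SDK performs for you. A minimal browser-side sketch; the authorize URL and client_id are placeholders, not Echo's actual endpoints:

// Browser-side PKCE setup (sketch; the SDK does the equivalent internally)
function base64url(bytes: Uint8Array): string {
  return btoa(String.fromCharCode(...Array.from(bytes)))
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, "");
}

async function startAuthorization() {
  // 1. Generate a one-time code verifier and its SHA-256 challenge
  const verifier = base64url(crypto.getRandomValues(new Uint8Array(32)));
  const digest = await crypto.subtle.digest("SHA-256", new TextEncoder().encode(verifier));
  sessionStorage.setItem("pkce_verifier", verifier);

  // 2. Send the user to Echo's OAuth authorize endpoint (placeholder URL and client_id)
  const url = new URL("https://echo.example.com/oauth/authorize");
  url.searchParams.set("client_id", "your-echo-app-id");
  url.searchParams.set("redirect_uri", `${window.location.origin}/callback`);
  url.searchParams.set("response_type", "code");
  url.searchParams.set("code_challenge", base64url(new Uint8Array(digest)));
  url.searchParams.set("code_challenge_method", "S256");
  window.location.assign(url.toString());
}

// 3-5. On the redirect back, the authorization code plus the stored verifier are
// exchanged for short-lived JWTs, which the SDK attaches to LLM requests and
// refreshes automatically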
This dual authentication model supports both traditional backend patterns and modern frontend architectures while maintaining security and proper billing attribution.