71 lines
2.7 KiB
Markdown
71 lines
2.7 KiB
Markdown
|
|
# Pass-Through Endpoints Architecture
|
||
|
|
|
||
|
|
## Why Pass-Through Endpoints Transform Requests
|
||
|
|
|
||
|
|
Even "pass-through" endpoints must perform essential transformations. The request **body** passes through unchanged, but:
|
||
|
|
|
||
|
|
```mermaid
|
||
|
|
sequenceDiagram
|
||
|
|
participant Client
|
||
|
|
participant Proxy as LiteLLM Proxy
|
||
|
|
participant Provider as LLM Provider
|
||
|
|
|
||
|
|
Client->>Proxy: POST /vertex_ai/v1/projects/.../generateContent
|
||
|
|
Note over Client,Proxy: Headers: Authorization: Bearer sk-litellm-key
|
||
|
|
Note over Client,Proxy: Body: { "contents": [...] }
|
||
|
|
|
||
|
|
rect rgb(240, 240, 240)
|
||
|
|
Note over Proxy: 1. URL Construction
|
||
|
|
Note over Proxy: Build regional/provider-specific URL
|
||
|
|
end
|
||
|
|
|
||
|
|
rect rgb(240, 240, 240)
|
||
|
|
Note over Proxy: 2. Auth Header Replacement
|
||
|
|
Note over Proxy: LiteLLM key → provider credentials
|
||
|
|
end
|
||
|
|
|
||
|
|
rect rgb(240, 240, 240)
|
||
|
|
Note over Proxy: 3. Extra Operations
|
||
|
|
Note over Proxy: • x-pass-* headers (strip prefix, forward)
|
||
|
|
Note over Proxy: • x-litellm-tags → metadata
|
||
|
|
Note over Proxy: • Guardrails (opt-in)
|
||
|
|
Note over Proxy: • Multipart form reconstruction
|
||
|
|
end
|
||
|
|
|
||
|
|
Proxy->>Provider: POST https://us-central1-aiplatform.googleapis.com/...
|
||
|
|
Note over Proxy,Provider: Headers: Authorization: Bearer ya29.google-oauth...
|
||
|
|
Note over Proxy,Provider: Body: { "contents": [...] } ← UNCHANGED
|
||
|
|
|
||
|
|
Provider-->>Proxy: Response (streaming or non-streaming)
|
||
|
|
|
||
|
|
rect rgb(240, 240, 240)
|
||
|
|
Note over Proxy: 4. Response Handling (async)
|
||
|
|
Note over Proxy: • Collect streaming chunks for logging
|
||
|
|
Note over Proxy: • Cost injection (if enabled)
|
||
|
|
Note over Proxy: • Parse response → calculate cost → log
|
||
|
|
end
|
||
|
|
|
||
|
|
Proxy-->>Client: Response (unchanged)
|
||
|
|
```
|
||
|
|
|
||
|
|
## Essential Transformations
|
||
|
|
|
||
|
|
- **URL Construction** - Build correct provider URL (e.g., regional endpoints for Vertex AI, Bedrock)
|
||
|
|
- **Auth Header Replacement** - Swap LiteLLM virtual key for actual provider credentials
|
||
|
|
|
||
|
|
## Extra Operations
|
||
|
|
|
||
|
|
| Operation | Description |
|
||
|
|
|-----------|-------------|
|
||
|
|
| `x-pass-*` headers | Strip prefix and forward (e.g., `x-pass-anthropic-beta` → `anthropic-beta`) |
|
||
|
|
| `x-litellm-tags` header | Extract tags and add to request metadata for logging |
|
||
|
|
| Streaming chunk collection | Collect chunks async for logging after stream completes |
|
||
|
|
| Multipart form handling | Reconstruct multipart/form-data requests for file uploads |
|
||
|
|
| Guardrails (opt-in) | Run content filtering when explicitly configured |
|
||
|
|
| Cost injection | Inject cost into streaming chunks when `include_cost_in_streaming_usage` enabled |
|
||
|
|
|
||
|
|
## What Does NOT Change
|
||
|
|
|
||
|
|
- Request body
|
||
|
|
- Response body
|
||
|
|
- Provider-specific parameters
|