Cohere Rerank Guardrail Translation Handler

Handler that applies guardrails to requests sent to the rerank endpoint (/v1/rerank).

Overview

This handler processes rerank requests by:

  1. Extracting the query text from the request
  2. Applying guardrails to the query
  3. Updating the request with the guardrailed query
  4. Returning the output unchanged (rankings are not text)

Note: Documents are not processed by guardrails as they represent the corpus being searched, not user input. Only the query is guardrailed.
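
The four steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the handler's actual code: `apply_guardrail_to_query` and `mask_emails` are hypothetical names, and a simple regex stands in for a real guardrail.

```python
import asyncio
import copy
import re
from typing import Awaitable, Callable

# Hypothetical stand-in for a guardrail: masks email addresses in a string.
async def mask_emails(text: str) -> str:
    return re.sub(r"[\w.+-]+@[\w-]+\.[\w.-]+", "[EMAIL_REDACTED]", text)

async def apply_guardrail_to_query(
    data: dict, guardrail: Callable[[str], Awaitable[str]]
) -> dict:
    """Return a copy of the rerank request with only `query` guardrailed."""
    updated = copy.deepcopy(data)
    query = updated.get("query")             # step 1: extract the query text
    if isinstance(query, str) and query:
        guarded = await guardrail(query)     # step 2: apply the guardrail
        updated["query"] = guarded           # step 3: update the request
    return updated                           # documents are left untouched

request = {
    "model": "rerank-english-v3.0",
    "query": "Find info about john@example.com",
    "documents": ["Contact: jane@example.org", "Unrelated text."],
}

guarded_request = asyncio.run(apply_guardrail_to_query(request, mask_emails))
print(guarded_request["query"])
print(guarded_request["documents"])
```

Step 4 (returning the output unchanged) needs no code here: the provider's response is passed through as-is. Note that the email address inside the first document is not masked, which matches the query-only behavior described above.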

Data Format

Input Format

With String Documents:

{
  "model": "rerank-english-v3.0",
  "query": "What is the capital of France?",
  "documents": [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
    "Madrid is the capital of Spain."
  ],
  "top_n": 2
}

With Dict Documents:

{
  "model": "rerank-english-v3.0",
  "query": "What is the capital of France?",
  "documents": [
    {"text": "Paris is the capital of France.", "id": "doc1"},
    {"text": "Berlin is the capital of Germany.", "id": "doc2"},
    {"text": "Madrid is the capital of Spain.", "id": "doc3"}
  ],
  "top_n": 2
}

Output Format

{
  "id": "rerank-abc123",
  "results": [
    {"index": 0, "relevance_score": 0.98},
    {"index": 2, "relevance_score": 0.12}
  ],
  "meta": {
    "billed_units": {"search_units": 1}
  }
}
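
Because the output carries only indices and scores, a client typically joins the results back to its own document list. The sketch below shows one way to do that for both string and dict documents; `ranked_documents` is a hypothetical helper name, not part of the handler or of LiteLLM.

```python
from typing import Union

Document = Union[str, dict]

def ranked_documents(documents: list, response: dict) -> list:
    """Pair each result with the text of the document it points to."""
    ranked = []
    for result in response["results"]:
        doc = documents[result["index"]]
        # Dict documents carry their text under the "text" key.
        text = doc["text"] if isinstance(doc, dict) else doc
        ranked.append((text, result["relevance_score"]))
    return ranked

documents = [
    "Paris is the capital of France.",
    {"text": "Berlin is the capital of Germany.", "id": "doc2"},
    "Madrid is the capital of Spain.",
]
response = {
    "id": "rerank-abc123",
    "results": [
        {"index": 0, "relevance_score": 0.98},
        {"index": 2, "relevance_score": 0.12},
    ],
    "meta": {"billed_units": {"search_units": 1}},
}

ranked = ranked_documents(documents, response)
for text, score in ranked:
    print(f"{score:.2f}  {text}")
```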

Usage

The handler is automatically discovered and applied when guardrails are used with the rerank endpoint.

Example: Using Guardrails with Rerank

curl -X POST 'http://localhost:4000/v1/rerank' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer your-api-key' \
-d '{
    "model": "rerank-english-v3.0",
    "query": "What is machine learning?",
    "documents": [
        "Machine learning is a subset of AI.",
        "Deep learning uses neural networks.",
        "Python is a programming language."
    ],
    "guardrails": ["content_filter"],
    "top_n": 2
}'

The guardrail will be applied to the query only (not the documents).

Example: PII Masking in Query

curl -X POST 'http://localhost:4000/v1/rerank' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer your-api-key' \
-d '{
    "model": "rerank-english-v3.0",
    "query": "Find documents about John Doe from john@example.com",
    "documents": [
        "Document 1 content here.",
        "Document 2 content here.",
        "Document 3 content here."
    ],
    "guardrails": ["mask_pii"],
    "top_n": 3
}'

The query is masked before it reaches the rerank provider. The exact placeholders depend on the guardrail; for example: "Find documents about [NAME_REDACTED] from [EMAIL_REDACTED]"

Example: Mixed Document Types

curl -X POST 'http://localhost:4000/v1/rerank' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer your-api-key' \
-d '{
    "model": "rerank-english-v3.0",
    "query": "Technical documentation",
    "documents": [
        {"text": "This is document 1", "metadata": {"source": "wiki"}},
        {"text": "This is document 2", "metadata": {"source": "docs"}},
        "This is document 3 as a plain string"
    ],
    "guardrails": ["content_moderation"]
}'

Implementation Details

Input Processing

  • Query Field: query (string)

    • Processing: Apply guardrail to query text
    • Result: Updated query
  • Documents Field: documents (list)

    • Processing: Not processed (corpus being searched, not user input)
    • Result: Unchanged

Output Processing

  • Processing: Not applicable (output contains relevance scores, not text)
  • Result: Response returned unchanged

Use Cases

  1. PII Protection: Remove PII from queries before reranking
  2. Content Filtering: Filter inappropriate content from search queries
  3. Compliance: Ensure queries meet regulatory or organizational policy requirements before they reach the rerank provider
  4. Data Sanitization: Clean up query text before semantic search operations

Extension

Override these methods to customize behavior:

  • process_input_messages(): Customize how query is processed
  • process_output_response(): Currently a no-op, but can be overridden if needed
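
A rough sketch of what such an override might look like is shown below. The base class here, `RerankHandlerSketch`, is a self-contained stand-in written for this example, and its method signatures are simplified assumptions; the real handler's signatures may take additional arguments. `redact_secret` is likewise a hypothetical guardrail.

```python
import asyncio

class RerankHandlerSketch:
    """Stand-in for the real handler (simplified, assumed signatures)."""

    async def process_input_messages(self, data: dict, guardrail) -> dict:
        query = data.get("query")
        if isinstance(query, str):
            data["query"] = await guardrail(query)
        return data

    async def process_output_response(self, response, guardrail):
        # No-op: rankings contain scores and indices, not text.
        return response

class WhitespaceNormalizingHandler(RerankHandlerSketch):
    """Collapses runs of whitespace in the query before guardrailing it."""

    async def process_input_messages(self, data: dict, guardrail) -> dict:
        query = data.get("query")
        if isinstance(query, str):
            data["query"] = " ".join(query.split())
        return await super().process_input_messages(data, guardrail)

# Hypothetical guardrail used only for this demonstration.
async def redact_secret(text: str) -> str:
    return text.replace("secret", "[REDACTED]")

data = {"query": "  find   the secret  plan ", "documents": ["Doc 1"]}
handler = WhitespaceNormalizingHandler()
processed = asyncio.run(handler.process_input_messages(data, redact_secret))
print(processed["query"])
```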

Supported Call Types

  • CallTypes.rerank - Synchronous rerank
  • CallTypes.arerank - Asynchronous rerank
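
One way to picture both call types sharing a handler is a lookup table keyed by call type. The sketch below uses a local `CallTypesSketch` enum and a `get_handler` function, both hypothetical names made up for this example; it does not show how LiteLLM performs handler discovery internally.

```python
from enum import Enum
from typing import Optional

class CallTypesSketch(str, Enum):
    """Local stand-in for the two rerank call types listed above."""
    rerank = "rerank"
    arerank = "arerank"

class RerankHandlerSketch:
    """Placeholder for the rerank guardrail handler."""

# A single handler instance serves both the sync and async call types.
_RERANK_HANDLER = RerankHandlerSketch()
HANDLERS = {
    CallTypesSketch.rerank: _RERANK_HANDLER,
    CallTypesSketch.arerank: _RERANK_HANDLER,
}

def get_handler(call_type: str) -> Optional[RerankHandlerSketch]:
    """Return the handler for a call type, or None if it is unsupported."""
    try:
        return HANDLERS.get(CallTypesSketch(call_type))
    except ValueError:
        return None

print(get_handler("rerank") is get_handler("arerank"))
print(get_handler("completion"))
```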

Notes

  • Only the query is processed by guardrails
  • Documents are not processed (they represent the corpus, not user input)
  • Output processing is a no-op since rankings don't contain text
  • Both sync and async call types use the same handler
  • Works with all rerank providers (Cohere, Together AI, etc.)

Common Patterns

PII Masking

import litellm

response = litellm.rerank(
    model="rerank-english-v3.0",
    query="Find info about john@example.com",
    documents=[
        "Document 1 content.",
        "Document 2 content.",
        "Document 3 content."
    ],
    guardrails=["mask_pii"],
    top_n=2
)

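# The guardrail masks PII in the query before reranking.
# With a masking guardrail, the query becomes something like:
#   "Find info about [EMAIL_REDACTED]"
# The response itself contains only rankings (indices and scores).
print(response.results)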

Content Filtering

import litellm

response = litellm.rerank(
    model="rerank-english-v3.0",
    query="Search query here",
    documents=[
        {"text": "Document 1 content", "id": "doc1"},
        {"text": "Document 2 content", "id": "doc2"},
    ],
    guardrails=["content_filter"],
)

Async Rerank with Guardrails

import asyncio

import litellm

async def rerank_with_guardrails():
    response = await litellm.arerank(
        model="rerank-english-v3.0",
        query="Technical query",
        documents=["Doc 1", "Doc 2", "Doc 3"],
        guardrails=["sanitize"],
        top_n=2
    )
    return response

result = asyncio.run(rerank_with_guardrails())