Cohere Rerank Guardrail Translation Handler
Handler for processing the rerank endpoint (/v1/rerank) with guardrails.
Overview
This handler processes rerank requests by:
- Extracting the query text from the request
- Applying guardrails to the query
- Updating the request with the guardrailed query
- Returning the output unchanged (rankings are not text)
Note: Documents are not processed by guardrails as they represent the corpus being searched, not user input. Only the query is guardrailed.
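The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the handler's actual implementation: `apply_guardrail_to_rerank_request` and `mask_emails` are hypothetical names, and a guardrail is modeled as a plain function from text to text.

```python
import copy
import re
from typing import Any, Callable, Dict

# Assumption: a guardrail is modeled as any callable mapping text to text.
Guardrail = Callable[[str], str]

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def mask_emails(text: str) -> str:
    """Toy guardrail (hypothetical): replace email addresses with a placeholder."""
    return EMAIL_RE.sub("[EMAIL_REDACTED]", text)


def apply_guardrail_to_rerank_request(
    request: Dict[str, Any], guardrail: Guardrail
) -> Dict[str, Any]:
    """Return a copy of the request in which only the query is guardrailed."""
    updated = copy.deepcopy(request)
    query = updated.get("query")
    if isinstance(query, str) and query:
        updated["query"] = guardrail(query)
    # "documents" is deliberately left untouched: it is the corpus, not user input.
    return updated


request = {
    "model": "rerank-english-v3.0",
    "query": "Find info about john@example.com",
    "documents": ["Document 1 content.", "Document 2 content."],
    "top_n": 2,
}
guarded = apply_guardrail_to_rerank_request(request, mask_emails)
print(guarded["query"])  # Find info about [EMAIL_REDACTED]
```

The sketch copies the request rather than mutating it, which makes it easy to see that the documents and all other fields pass through unchanged.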
Data Format
Input Format
With String Documents:
{
  "model": "rerank-english-v3.0",
  "query": "What is the capital of France?",
  "documents": [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
    "Madrid is the capital of Spain."
  ],
  "top_n": 2
}
With Dict Documents:
{
  "model": "rerank-english-v3.0",
  "query": "What is the capital of France?",
  "documents": [
    {"text": "Paris is the capital of France.", "id": "doc1"},
    {"text": "Berlin is the capital of Germany.", "id": "doc2"},
    {"text": "Madrid is the capital of Spain.", "id": "doc3"}
  ],
  "top_n": 2
}
Output Format
{
  "id": "rerank-abc123",
  "results": [
    {"index": 0, "relevance_score": 0.98},
    {"index": 2, "relevance_score": 0.12}
  ],
  "meta": {
    "billed_units": {"search_units": 1}
  }
}
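Because the output carries only indices and scores, a caller typically maps each `index` back to the submitted documents. The following is a small sketch of that lookup; `attach_documents` is a hypothetical helper, and it assumes dict documents carry their content in a `text` field as in the input examples above.

```python
from typing import Any, Dict, List, Tuple, Union

Document = Union[str, Dict[str, Any]]


def attach_documents(
    documents: List[Document], response: Dict[str, Any]
) -> List[Tuple[float, str]]:
    """Pair each result's relevance score with the text of the document it refers to."""
    ranked = []
    for result in response["results"]:
        doc = documents[result["index"]]
        # Documents may be plain strings or dicts with a "text" field.
        text = doc if isinstance(doc, str) else doc.get("text", "")
        ranked.append((result["relevance_score"], text))
    return ranked


documents = [
    "Paris is the capital of France.",
    {"text": "Berlin is the capital of Germany.", "id": "doc2"},
    "Madrid is the capital of Spain.",
]
response = {
    "id": "rerank-abc123",
    "results": [
        {"index": 0, "relevance_score": 0.98},
        {"index": 2, "relevance_score": 0.12},
    ],
}
for score, text in attach_documents(documents, response):
    print(f"{score:.2f}  {text}")
```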
Usage
The handler is automatically discovered and applied when guardrails are used with the rerank endpoint.
Example: Using Guardrails with Rerank
curl -X POST 'http://localhost:4000/v1/rerank' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer your-api-key' \
  -d '{
    "model": "rerank-english-v3.0",
    "query": "What is machine learning?",
    "documents": [
      "Machine learning is a subset of AI.",
      "Deep learning uses neural networks.",
      "Python is a programming language."
    ],
    "guardrails": ["content_filter"],
    "top_n": 2
  }'
The guardrail will be applied to the query only (not the documents).
Example: PII Masking in Query
curl -X POST 'http://localhost:4000/v1/rerank' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer your-api-key' \
  -d '{
    "model": "rerank-english-v3.0",
    "query": "Find documents about John Doe from john@example.com",
    "documents": [
      "Document 1 content here.",
      "Document 2 content here.",
      "Document 3 content here."
    ],
    "guardrails": ["mask_pii"],
    "top_n": 3
  }'
The query is masked before it is sent for reranking, for example to: "Find documents about [NAME_REDACTED] from [EMAIL_REDACTED]" (the exact placeholder text depends on the guardrail).
Example: Mixed Document Types
curl -X POST 'http://localhost:4000/v1/rerank' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer your-api-key' \
  -d '{
    "model": "rerank-english-v3.0",
    "query": "Technical documentation",
    "documents": [
      {"text": "This is document 1", "metadata": {"source": "wiki"}},
      {"text": "This is document 2", "metadata": {"source": "docs"}},
      "This is document 3 as a plain string"
    ],
    "guardrails": ["content_moderation"]
  }'
Implementation Details
Input Processing
- Query Field: query (string)
  - Processing: Apply guardrail to query text
  - Result: Updated query
- Documents Field: documents (list)
  - Processing: Not processed (corpus being searched, not user input)
  - Result: Unchanged
Output Processing
- Processing: Not applicable (output contains relevance scores, not text)
- Result: Response returned unchanged
Use Cases
- PII Protection: Remove PII from queries before reranking
- Content Filtering: Filter inappropriate content from search queries
- Compliance: Ensure queries meet regulatory or organizational policy requirements before they reach the rerank provider
- Data Sanitization: Clean up query text before semantic search operations
Extension
Override these methods to customize behavior:
- process_input_messages(): Customize how the query is processed
- process_output_response(): Currently a no-op, but can be overridden if needed
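As a rough illustration of the override pattern, the sketch below subclasses a stand-in handler. Everything except the two method names is an assumption: `RerankHandlerSketch` is not the real class, and the actual signatures in the library (arguments, sync vs. async) may differ, so check the handler source before copying this.

```python
from typing import Any, Callable, Dict

# Assumption: a guardrail is modeled as any callable mapping text to text.
Guardrail = Callable[[str], str]


class RerankHandlerSketch:
    """Hypothetical stand-in for the rerank handler; not the real class."""

    def process_input_messages(
        self, data: Dict[str, Any], guardrail: Guardrail
    ) -> Dict[str, Any]:
        query = data.get("query")
        if isinstance(query, str) and query:
            data["query"] = guardrail(query)
        return data

    def process_output_response(self, response: Any, guardrail: Guardrail) -> Any:
        # No-op: rankings contain indices and scores, not text.
        return response


class NormalizingRerankHandler(RerankHandlerSketch):
    """Example override: collapse whitespace in the query before guardrailing."""

    def process_input_messages(
        self, data: Dict[str, Any], guardrail: Guardrail
    ) -> Dict[str, Any]:
        query = data.get("query")
        if isinstance(query, str):
            data["query"] = " ".join(query.split())
        return super().process_input_messages(data, guardrail)


handler = NormalizingRerankHandler()
data = handler.process_input_messages(
    {"query": "  what   is ML? ", "documents": ["a", "b"]},
    guardrail=str.lower,  # toy guardrail for demonstration only
)
print(data["query"])  # what is ml?
```

Calling `super()` after the custom step keeps the default query guardrailing in place, so the override only adds behavior rather than replacing it.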
Supported Call Types
- CallTypes.rerank: Synchronous rerank
- CallTypes.arerank: Asynchronous rerank
Notes
- Only the query is processed by guardrails
- Documents are not processed (they represent the corpus, not user input)
- Output processing is a no-op since rankings don't contain text
- Both sync and async call types use the same handler
- Works with all rerank providers (Cohere, Together AI, etc.)
Common Patterns
PII Masking in Search
import litellm

response = litellm.rerank(
    model="rerank-english-v3.0",
    query="Find info about john@example.com",
    documents=[
        "Document 1 content.",
        "Document 2 content.",
        "Document 3 content.",
    ],
    guardrails=["mask_pii"],
    top_n=2,
)
# Query will have PII masked
# query becomes: "Find info about [EMAIL_REDACTED]"
print(response.results)
Content Filtering
import litellm

response = litellm.rerank(
    model="rerank-english-v3.0",
    query="Search query here",
    documents=[
        {"text": "Document 1 content", "id": "doc1"},
        {"text": "Document 2 content", "id": "doc2"},
    ],
    guardrails=["content_filter"],
)
Async Rerank with Guardrails
import asyncio

import litellm


async def rerank_with_guardrails():
    # The guardrail is applied to the query before the async rerank call.
    response = await litellm.arerank(
        model="rerank-english-v3.0",
        query="Technical query",
        documents=["Doc 1", "Doc 2", "Doc 3"],
        guardrails=["sanitize"],
        top_n=2,
    )
    return response


result = asyncio.run(rerank_with_guardrails())