Reranking
Reorder documents by relevance to improve search quality
Document Reranking
Improve search quality by reranking documents based on their relevance to a query. Use this as a second-stage ranker after initial retrieval.
Endpoint
POST https://api.assisters.dev/v1/rerank
Request Body
model (string, required): The reranking model to use. See available models.
Example: assisters-rerank-v1
query (string, required): The search query to rank documents against.
documents (array, required): Array of documents to rerank. Each can be a string or an object with a text field.
Maximum: 1000 documents per request.
top_n (integer): Return only the top N results. Defaults to returning all documents.
return_documents (boolean, default: false): Whether to include the document text in the response.
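As a rough illustration of how these fields fit together, the sketch below builds a request body and checks the documented 1000-document limit before anything is sent. The helper name `build_rerank_payload` is hypothetical and not part of any official client. The field names follow the request examples (`model`, `query`, `documents`, `top_n`) and the `return_documents` flag mentioned under Response Fields.

```python
from typing import Optional, Sequence, Union

MAX_DOCUMENTS = 1000  # documented per-request limit

Document = Union[str, dict]


def build_rerank_payload(
    model: str,
    query: str,
    documents: Sequence[Document],
    top_n: Optional[int] = None,
    return_documents: bool = False,
) -> dict:
    """Hypothetical helper: validate arguments client-side and build the JSON body."""
    if len(documents) > MAX_DOCUMENTS:
        raise ValueError(f"at most {MAX_DOCUMENTS} documents per request")
    for i, doc in enumerate(documents):
        # Each document is either a plain string or an object with a "text" field.
        is_text_object = isinstance(doc, dict) and "text" in doc
        if not isinstance(doc, str) and not is_text_object:
            raise ValueError(f"document {i} must be a string or have a 'text' field")

    payload = {
        "model": model,
        "query": query,
        "documents": list(documents),
        "return_documents": return_documents,
    }
    if top_n is not None:
        payload["top_n"] = top_n  # leave out to get every document back
    return payload


payload = build_rerank_payload(
    model="assisters-rerank-v1",
    query="What is machine learning?",
    documents=[
        "Machine learning is a subset of artificial intelligence.",
        {"text": "Deep learning uses neural networks."},
    ],
    top_n=1,
)
```

The resulting dictionary can be passed as the `json=` argument of `requests.post`, as in the request examples.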
Request Examples
Basic Reranking
Python
import requests

response = requests.post(
    "https://api.assisters.dev/v1/rerank",
    headers={
        "Authorization": "Bearer ask_your_api_key",
        "Content-Type": "application/json"
    },
    json={
        "model": "assisters-rerank-v1",
        "query": "What is machine learning?",
        "documents": [
            "Machine learning is a subset of artificial intelligence.",
            "The weather today is sunny and warm.",
            "Deep learning uses neural networks.",
            "Cats are popular pets worldwide."
        ],
        # Request the document text so it can be printed below
        "return_documents": True
    }
)

results = response.json()["results"]
for r in results:
    print(f"Score: {r['relevance_score']:.4f} - {r['document'][:50]}...")

JavaScript
const response = await fetch('https://api.assisters.dev/v1/rerank', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer ask_your_api_key',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'assisters-rerank-v1',
    query: 'What is machine learning?',
    documents: [
      'Machine learning is a subset of artificial intelligence.',
      'The weather today is sunny and warm.',
      'Deep learning uses neural networks.',
      'Cats are popular pets worldwide.'
    ],
    // Request the document text so it can be logged below
    return_documents: true
  })
});

const data = await response.json();
data.results.forEach(r => {
  console.log(`Score: ${r.relevance_score.toFixed(4)} - ${r.document.slice(0, 50)}...`);
});

cURL
curl https://api.assisters.dev/v1/rerank \
  -H "Authorization: Bearer ask_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "assisters-rerank-v1",
    "query": "What is machine learning?",
    "documents": [
      "Machine learning is a subset of artificial intelligence.",
      "The weather today is sunny and warm.",
      "Deep learning uses neural networks.",
      "Cats are popular pets worldwide."
    ]
  }'

With Top N
import requests

# `documents` is your own list of candidate texts, defined elsewhere
response = requests.post(
    "https://api.assisters.dev/v1/rerank",
    headers={"Authorization": "Bearer ask_your_api_key"},
    json={
        "model": "assisters-rerank-v1",
        "query": "Python programming",
        "documents": documents,
        "top_n": 3  # Only return top 3 results
    }
)

Two-Stage Retrieval
# embed(), vector_db, and rerank() are placeholders for your own embedding
# function, vector store client, and a wrapper around the /v1/rerank call.

# Stage 1: Fast retrieval with embeddings
query_embedding = embed(query)
candidates = vector_db.search(query_embedding, limit=100)

# Stage 2: Precise reranking
reranked = rerank(
    model="assisters-rerank-v1",
    query=query,
    documents=[c.text for c in candidates],
    top_n=10
)

# Return the top reranked results
final_results = reranked["results"]

Response
{
  "model": "assisters-rerank-v1",
  "results": [
    {
      "index": 0,
      "relevance_score": 0.9823,
      "document": "Machine learning is a subset of artificial intelligence."
    },
    {
      "index": 2,
      "relevance_score": 0.8156,
      "document": "Deep learning uses neural networks."
    },
    {
      "index": 1,
      "relevance_score": 0.0234,
      "document": "The weather today is sunny and warm."
    },
    {
      "index": 3,
      "relevance_score": 0.0089,
      "document": "Cats are popular pets worldwide."
    }
  ],
  "usage": {
    "total_tokens": 45
  }
}

Response Fields
model (string): The model used for reranking.
results (array): Array of reranked documents, sorted by relevance (highest first):
  index: Original position in the input array
  relevance_score: Relevance score between 0 and 1
  document: The document text as a string (if return_documents is true)
usage (object): Token usage for billing: { total_tokens }
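Because `index` refers to the position in the submitted `documents` array, results can be joined back to whatever records the texts came from, even when the document text is not returned. The sketch below is one way to do that. The helper name `attach_originals` and the `id` metadata field are illustrative assumptions, not part of the API.

```python
def attach_originals(candidates, results):
    """Pair each reranked result with the candidate it refers to.

    `results` is the "results" array of a rerank response; each entry's
    "index" points into the list that was sent as "documents".
    """
    return [
        {"candidate": candidates[r["index"]], "score": r["relevance_score"]}
        for r in results
    ]


# Hypothetical candidates that carry metadata next to the text that was sent.
candidates = [
    {"id": "doc-a", "text": "Machine learning is a subset of artificial intelligence."},
    {"id": "doc-b", "text": "The weather today is sunny and warm."},
    {"id": "doc-c", "text": "Deep learning uses neural networks."},
]

# Shaped like the example response, without the optional "document" field.
results = [
    {"index": 0, "relevance_score": 0.9823},
    {"index": 2, "relevance_score": 0.8156},
    {"index": 1, "relevance_score": 0.0234},
]

ranked = attach_originals(candidates, results)
print([item["candidate"]["id"] for item in ranked])  # ['doc-a', 'doc-c', 'doc-b']
```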
Available Models
| Model | Description | Max Tokens | Price |
|---|---|---|---|
| assisters-rerank-v1 | High-quality document reranker | 8192 | $0.02/M tokens |
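For a rough sense of cost, the listed price can be combined with the `usage.total_tokens` value from a response. The sketch below assumes billing is strictly linear in tokens with no minimum charge or rounding, which this page does not confirm. The function name is hypothetical.

```python
PRICE_PER_MILLION_TOKENS = 0.02  # USD, listed price for assisters-rerank-v1


def estimate_cost_usd(total_tokens: int) -> float:
    """Convert a usage.total_tokens value into an approximate USD cost.

    Assumes linear per-token billing; actual invoicing rules may differ.
    """
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS


# The example response reports 45 tokens.
print(f"${estimate_cost_usd(45):.7f}")  # $0.0000009
```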
Reranking Model Details
See detailed model specifications
Best Practices
Two-Stage Retrieval: Use fast retrieval first, then rerank only the top candidates.
Limit Candidate Size: Rerank 50-100 candidates for the best speed/quality tradeoff.
Use for RAG: Rerank retrieved chunks before passing them to the LLM.
Score Thresholds: Drop results whose relevance score falls below a threshold you choose.
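The RAG and score-threshold practices can be combined in a few lines. The sketch below assumes the results were requested with `return_documents` set to true and arrive sorted highest score first. The helper name `select_context` and the 0.5 cutoff are illustrative assumptions, and a suitable threshold depends on your data.

```python
def select_context(results, min_score=0.5, max_chunks=5):
    """Keep the best-scoring chunks above a threshold for use in an LLM prompt.

    `results` is the "results" array of a rerank response that includes the
    "document" text, already sorted by relevance (highest first).
    """
    kept = [r for r in results if r["relevance_score"] >= min_score]
    return [r["document"] for r in kept[:max_chunks]]


# Entries copied from the example response.
results = [
    {"index": 0, "relevance_score": 0.9823,
     "document": "Machine learning is a subset of artificial intelligence."},
    {"index": 2, "relevance_score": 0.8156,
     "document": "Deep learning uses neural networks."},
    {"index": 1, "relevance_score": 0.0234,
     "document": "The weather today is sunny and warm."},
]

# Only the two relevant chunks survive the 0.5 cutoff.
context = "\n\n".join(select_context(results, min_score=0.5))
print(context)
```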
Performance Tips
| Candidates | Latency | Recommendation |
|---|---|---|
| 10-20 | ~100ms | Good for real-time |
| 50-100 | ~300ms | Best quality/speed |
| 100+ | 500ms+ | Consider batching |
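One possible reading of "batching" for large candidate sets is to split the list into several smaller requests and merge the results by score. The sketch below takes that approach. `rerank_fn` is a hypothetical callable standing in for a single rerank request, and the merge assumes scores from separate requests are directly comparable, which this page does not state.

```python
from typing import Callable, List, Sequence


def rerank_in_batches(
    query: str,
    documents: Sequence[str],
    rerank_fn: Callable[[str, Sequence[str]], List[dict]],
    batch_size: int = 100,
) -> List[dict]:
    """Rerank a long candidate list in fixed-size batches and merge by score.

    `rerank_fn(query, docs)` performs one rerank request and returns its
    "results" array (entries with "index" and "relevance_score").
    """
    merged = []
    for start in range(0, len(documents), batch_size):
        batch = documents[start:start + batch_size]
        for r in rerank_fn(query, batch):
            # Shift the batch-local index back to the position in `documents`.
            merged.append({**r, "index": r["index"] + start})
    # Assumption: scores from different requests can be compared directly.
    merged.sort(key=lambda r: r["relevance_score"], reverse=True)
    return merged


def fake_rerank(query, docs):
    """Offline stand-in for the API call: longer documents score higher."""
    return [{"index": i, "relevance_score": len(d) / 10} for i, d in enumerate(docs)]


docs = ["a", "bbb", "bb", "bbbb", "bbbbb"]
ordered = rerank_in_batches("q", docs, fake_rerank, batch_size=2)
print([r["index"] for r in ordered])  # [4, 3, 1, 2, 0]
```

In real use, `rerank_fn` would wrap the POST request from the examples above. The batches are independent, so they could also be sent concurrently.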