Document Reranking

Improve search quality by reranking documents based on their relevance to a query. Use this as a second-stage ranker after initial retrieval.

Endpoint

POST https://api.assisters.dev/v1/rerank

Request Body

model (string, required)

The reranking model to use. See available models.

Example: assisters-rerank-v1

query (string, required)

The search query to rank documents against.

documents (array, required)

Array of documents to rerank. Each item can be a string or an object with a text field.

Maximum: 1000 documents per request.

top_n (integer)

Return only the top N results. Defaults to returning all documents.

return_documents (boolean, default: false)

Whether to include the document text in the response.
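The request body can be assembled and checked client-side before it is sent. The following is a minimal sketch, not part of any official SDK: build_rerank_payload is a hypothetical helper name, and the only rules it enforces are the ones stated on this page (a non-empty query and at most 1000 documents).

```python
MAX_DOCUMENTS = 1000  # per-request limit stated on this page


def build_rerank_payload(model, query, documents, top_n=None, return_documents=False):
    """Build a request body for POST /v1/rerank (hypothetical helper)."""
    if not query:
        raise ValueError("query must be a non-empty string")
    if len(documents) > MAX_DOCUMENTS:
        raise ValueError(f"at most {MAX_DOCUMENTS} documents per request")

    payload = {
        "model": model,
        "query": query,
        "documents": list(documents),
        "return_documents": return_documents,
    }
    if top_n is not None:
        payload["top_n"] = top_n  # leave out to get every document back
    return payload


payload = build_rerank_payload(
    "assisters-rerank-v1",
    "What is machine learning?",
    # Documents given as objects with a "text" field
    [
        {"text": "Machine learning is a subset of artificial intelligence."},
        {"text": "Deep learning uses neural networks."},
    ],
    top_n=1,
)
```

The resulting dictionary can be passed as the json argument of requests.post, as in the request examples.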

Request Examples

Basic Reranking

import requests

response = requests.post(
    "https://api.assisters.dev/v1/rerank",
    headers={
        "Authorization": "Bearer ask_your_api_key",
        "Content-Type": "application/json"
    },
    json={
        "model": "assisters-rerank-v1",
        "query": "What is machine learning?",
        "documents": [
            "Machine learning is a subset of artificial intelligence.",
            "The weather today is sunny and warm.",
            "Deep learning uses neural networks.",
            "Cats are popular pets worldwide."
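        ],
        "return_documents": True  # Needed to read r["document"] below
    }
)
response.raise_for_status()  # Raise on HTTP errors such as 400 Bad Request

results = response.json()["results"]
for r in results:
    print(f"Score: {r['relevance_score']:.4f} - {r['document'][:50]}...")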

const response = await fetch('https://api.assisters.dev/v1/rerank', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer ask_your_api_key',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'assisters-rerank-v1',
    query: 'What is machine learning?',
    documents: [
      'Machine learning is a subset of artificial intelligence.',
      'The weather today is sunny and warm.',
      'Deep learning uses neural networks.',
      'Cats are popular pets worldwide.'
    ],
    return_documents: true // Needed to read r.document below
  })
});

const data = await response.json();
data.results.forEach(r => {
  console.log(`Score: ${r.relevance_score.toFixed(4)} - ${r.document.slice(0, 50)}...`);
});

curl https://api.assisters.dev/v1/rerank \
  -H "Authorization: Bearer ask_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "assisters-rerank-v1",
    "query": "What is machine learning?",
    "documents": [
      "Machine learning is a subset of artificial intelligence.",
      "The weather today is sunny and warm.",
      "Deep learning uses neural networks.",
      "Cats are popular pets worldwide."
    ],
    "return_documents": true
  }'

With Top N

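import requests

# Example input; replace with your own candidate documents
documents = [
    "Python is a popular programming language.",
    "Pythons are large non-venomous snakes.",
    "Java is a statically typed language.",
    "Python supports multiple programming paradigms."
]

response = requests.post(
    "https://api.assisters.dev/v1/rerank",
    headers={"Authorization": "Bearer ask_your_api_key"},
    json={
        "model": "assisters-rerank-v1",
        "query": "Python programming",
        "documents": documents,
        "top_n": 3  # Only return top 3 results
    }
)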

Two-Stage Retrieval

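import requests

def rerank(model, query, documents, top_n):
    """Call the rerank endpoint and return the parsed JSON response."""
    response = requests.post(
        "https://api.assisters.dev/v1/rerank",
        headers={"Authorization": "Bearer ask_your_api_key"},
        json={"model": model, "query": query, "documents": documents, "top_n": top_n},
    )
    response.raise_for_status()
    return response.json()

# Stage 1: Fast retrieval with embeddings
# (query, embed, and vector_db are placeholders for your own query string, embedding function, and vector store)
query_embedding = embed(query)
candidates = vector_db.search(query_embedding, limit=100)

# Stage 2: Precise reranking
reranked = rerank(
    model="assisters-rerank-v1",
    query=query,
    documents=[c.text for c in candidates],
    top_n=10
)

# Return the top reranked results; each result's index points into candidates
final_results = reranked["results"]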

Response

{
  "model": "assisters-rerank-v1",
  "results": [
    {
      "index": 0,
      "relevance_score": 0.9823,
      "document": "Machine learning is a subset of artificial intelligence."
    },
    {
      "index": 2,
      "relevance_score": 0.8156,
      "document": "Deep learning uses neural networks."
    },
    {
      "index": 1,
      "relevance_score": 0.0234,
      "document": "The weather today is sunny and warm."
    },
    {
      "index": 3,
      "relevance_score": 0.0089,
      "document": "Cats are popular pets worldwide."
    }
  ],
  "usage": {
    "total_tokens": 45
  }
}

Response Fields

model (string)

The model used for reranking.

results (array)

Array of reranked documents, sorted by relevance (highest first). Each result contains:

index: Original position in the input array
relevance_score: Relevance score between 0 and 1
document: The document text as a string (present only if return_documents is true)

usage (object)

Token usage for billing: { total_tokens }
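When return_documents is left at its default, each result carries only index and relevance_score, so the caller looks up the text in its own input list. A minimal sketch of that lookup follows; attach_originals is a hypothetical helper name, and the sample values are taken from the example response.

```python
def attach_originals(results, documents):
    """Pair each rerank result with the input document it refers to."""
    return [
        {"score": r["relevance_score"], "document": documents[r["index"]]}
        for r in results
    ]


documents = [
    "Machine learning is a subset of artificial intelligence.",
    "The weather today is sunny and warm.",
    "Deep learning uses neural networks.",
]
# Results shaped like the example response, without the optional "document" field
results = [
    {"index": 0, "relevance_score": 0.9823},
    {"index": 2, "relevance_score": 0.8156},
    {"index": 1, "relevance_score": 0.0234},
]

ranked = attach_originals(results, documents)
print(ranked[1]["document"])  # Deep learning uses neural networks.
```

The same lookup works when the inputs are richer records (for example, database rows), since index always refers to the position in the documents array that was sent.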

Available Models

Model	Description	Max Tokens	Price
`assisters-rerank-v1`	High-quality document reranker	8192	$0.02/M tokens
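As a rough guide to cost, the listed price can be applied to the usage.total_tokens value returned with each response. This sketch assumes billing is strictly linear in total_tokens with no minimum charge, which this page does not confirm; estimate_cost is a hypothetical helper name.

```python
PRICE_PER_MILLION_TOKENS = 0.02  # USD, from the Available Models table


def estimate_cost(total_tokens):
    """Return the estimated cost in USD for a given token count."""
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS


print(f"${estimate_cost(1_000_000):.2f}")  # $0.02
```

Under the same assumption, the 45-token example response would cost well under a thousandth of a cent.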

Reranking Model Details

See detailed model specifications

Use Cases

Search Quality Improvement

RAG Pipeline Enhancement

Cross-Encoder Scoring

Hybrid Search

Best Practices

Two-Stage Retrieval

Use fast retrieval first, then rerank the top candidates

Limit Candidate Size

Rerank 50-100 candidates for best speed/quality tradeoff

Use for RAG

Rerank retrieved chunks before passing them to the LLM

Score Thresholds

Discard results whose relevance_score falls below a threshold so that only high-quality matches remain
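Score thresholding can be done on the client after the response arrives. The sketch below is illustrative only: filter_by_score is a hypothetical helper, and the 0.5 cutoff is an arbitrary assumption that should be tuned against your own data.

```python
def filter_by_score(results, min_score=0.5):
    """Keep results whose relevance_score is at least min_score."""
    return [r for r in results if r["relevance_score"] >= min_score]


# Scores taken from the example response
results = [
    {"index": 0, "relevance_score": 0.9823},
    {"index": 2, "relevance_score": 0.8156},
    {"index": 1, "relevance_score": 0.0234},
    {"index": 3, "relevance_score": 0.0089},
]

relevant = filter_by_score(results, min_score=0.5)
print([r["index"] for r in relevant])  # [0, 2]
```

Because results arrive sorted by relevance, the filtered list keeps that order.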

Performance Tips

Candidates	Latency	Recommendation
10-20	~100ms	Good for real-time
50-100	~300ms	Best quality/speed
100+	500ms+	Consider batching
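One way to read "consider batching" is to split a large candidate list into several smaller requests and merge the results. The sketch below assumes that relevance scores from separate requests are comparable, which this page does not state. rerank_in_batches and fake_rerank are hypothetical names, and fake_rerank is a stand-in scorer so the example runs without network access.

```python
def rerank_in_batches(rerank_fn, query, documents, batch_size=100):
    """Rerank documents in chunks and merge the results by score.

    rerank_fn(query, docs) must return a list of results, each with
    "index" (position within docs) and "relevance_score".
    """
    merged = []
    for start in range(0, len(documents), batch_size):
        batch = documents[start:start + batch_size]
        for r in rerank_fn(query, batch):
            # Shift the batch-local index back to a position in `documents`
            merged.append({**r, "index": r["index"] + start})
    merged.sort(key=lambda r: r["relevance_score"], reverse=True)
    return merged


# Stand-in scorer for demonstration only; a real rerank_fn would call the API
def fake_rerank(query, docs):
    return [
        {"index": i, "relevance_score": 1.0 if query in d else 0.0}
        for i, d in enumerate(docs)
    ]


merged = rerank_in_batches(
    fake_rerank, "learning", ["cats", "deep learning", "weather"], batch_size=2
)
print(merged[0]["index"])  # 1
```

Batches could also be sent concurrently to reduce wall-clock latency; that is left out here to keep the sketch short.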

Error Responses

400 Bad Request - Too Many Documents

Returned when documents contains more than 1000 items.

400 Bad Request - Empty Query

Returned when query is empty.
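Client code can turn these 400 responses into a dedicated exception. The format of the error body is not documented on this page, so the sketch below passes it through verbatim rather than parsing it. RerankRequestError and check_response are hypothetical names.

```python
class RerankRequestError(Exception):
    """Raised when the API rejects a request with 400 Bad Request."""


def check_response(status_code, body_text):
    """Raise RerankRequestError on a 400 response; otherwise do nothing."""
    if status_code == 400:
        # The body format is an unknown here, so it is included as-is
        raise RerankRequestError(f"400 Bad Request: {body_text}")


# With the requests library this would be called as:
#   check_response(response.status_code, response.text)
check_response(200, "")  # no exception for a successful response
```

Checking the document count and query on the client before sending avoids both errors in the first place.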
