Reranking

Improve RAG accuracy by reranking retrieved documents

Reranking improves the accuracy of RAG (Retrieval-Augmented Generation) pipelines by re-scoring retrieved documents according to their relevance to the query. After an initial vector search retrieves candidates, reranking puts them in a more precise order.

Why Rerank?

Vector search compares a query embedding against document embeddings that were computed independently of the query. A reranker instead scores each query-document pair directly, using cross-attention over both texts, which often produces a more accurate ranking for the final generation step.

Typical RAG pipeline:

  1. Retrieve — Get 20-50 candidates via vector search (fast, approximate)
  2. Rerank — Score and reorder candidates (precise, slower)
  3. Generate — Use top 5-10 documents for LLM context
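
The three stages above can be sketched end to end with toy stand-ins. This is only an illustration of the data flow, not LocalMode code: `retrieve`, `pairScore`, `rerankCandidates`, and `buildContext` are hypothetical helpers, the two-dimensional vectors are made up, and word overlap stands in for a real cross-encoder score.

```typescript
// Toy stand-ins for the three pipeline stages; none of these helpers come from LocalMode.
type Candidate = { id: number; text: string; vector: number[] };

// Stage 1 (retrieve): cosine similarity between precomputed vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

function retrieve(queryVector: number[], corpus: Candidate[], k: number): Candidate[] {
  return corpus
    .slice()
    .sort((x, y) => cosine(queryVector, y.vector) - cosine(queryVector, x.vector))
    .slice(0, k);
}

// Stage 2 (rerank): score each (query, document) pair directly.
// Word overlap is a placeholder for a real cross-encoder score.
function words(text: string): string[] {
  const tokens: string[] = text.toLowerCase().match(/[a-z0-9]+/g) || [];
  return tokens.filter((t, i) => tokens.indexOf(t) === i);
}

function pairScore(query: string, text: string): number {
  const q = words(query);
  const d = words(text);
  const shared = q.filter((w) => d.indexOf(w) !== -1).length;
  return q.length === 0 ? 0 : shared / q.length;
}

function rerankCandidates(query: string, candidates: Candidate[], topK: number): Candidate[] {
  return candidates
    .map((candidate) => ({ candidate, score: pairScore(query, candidate.text) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((entry) => entry.candidate);
}

// Stage 3 (generate): build the context that would be sent to the LLM.
function buildContext(docs: Candidate[]): string {
  return docs.map((d, i) => `[${i + 1}] ${d.text}`).join('\n');
}

const corpus: Candidate[] = [
  { id: 0, text: 'Machine learning is a type of artificial intelligence.', vector: [0.9, 0.1] },
  { id: 1, text: 'Cooking pasta requires boiling water.', vector: [0.1, 0.9] },
  { id: 2, text: 'Deep learning is a subset of machine learning.', vector: [0.8, 0.3] },
];

const pipelineQuery = 'What is machine learning?';
const candidates = retrieve([1, 0], corpus, 3); // wide, cheap first pass
const top = rerankCandidates(pipelineQuery, candidates, 2); // narrow, more precise second pass
const context = buildContext(top);
```

The first pass is deliberately wide so that the second pass has room to promote documents that the vector search ranked too low.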

Basic Usage

import { rerank } from '@localmode/core';
import { transformers } from '@localmode/transformers';

// Create reranker model
const rerankerModel = transformers.reranker('Xenova/ms-marco-MiniLM-L-6-v2');

const { results } = await rerank({
  model: rerankerModel,
  query: 'What is machine learning?',
  documents: [
    'Machine learning is a type of artificial intelligence...',
    'Cooking pasta requires boiling water...',
    'Deep learning is a subset of machine learning...',
  ],
  topK: 2,
});

// results (scores are illustrative): [
//   { index: 0, score: 0.95, text: 'Machine learning is a type of...' },
//   { index: 2, score: 0.88, text: 'Deep learning is a subset of...' }
// ]
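
In the example output, each entry's index matches the document's position in the input array, so it can be used to recover anything stored alongside the documents. A minimal sketch, assuming a hypothetical `sources` array kept in the same order as `documents` and a hard-coded copy of the results shown above; the 0.9 cutoff is arbitrary.

```typescript
// Shape of one reranked entry, copied from the example output.
type Ranked = { index: number; score: number; text: string };

// Hypothetical metadata kept in the same order as the `documents` array.
const sources = ['ml-intro.md', 'pasta.md', 'deep-learning.md'];

// Hard-coded stand-in for the `results` array returned by rerank().
const rankedResults: Ranked[] = [
  { index: 0, score: 0.95, text: 'Machine learning is a type of...' },
  { index: 2, score: 0.88, text: 'Deep learning is a subset of...' },
];

// `index` refers to the position in the original `documents` array,
// so it doubles as a lookup key for the parallel `sources` array.
const cited = rankedResults.map((r) => ({ source: sources[r.index], score: r.score }));

// Optionally drop weak matches. The 0.9 cutoff is arbitrary, chosen for the demo.
const strong = cited.filter((c) => c.score >= 0.9);
```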

API Reference

rerank(options)

Reranks documents by relevance to a query.

Return Type: RerankResult

RankedDocument

For recommended models, provider-specific options, and practical recipes, see the Transformers Reranking guide.
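
As a rough orientation, the shapes used by the examples on this page can be written down as TypeScript interfaces. This is a sketch inferred from those examples only: the names ending in `Sketch` are hypothetical, optionality is guessed from which fields the examples omit, and the actual definitions in @localmode/core may contain more fields.

```typescript
// Shapes inferred from the examples on this page only.
// The "Sketch" names are hypothetical; the real types live in @localmode/core.
interface RankedDocumentSketch {
  index: number; // position in the input `documents` array
  score: number; // relevance score; results are ordered from highest to lowest
  text: string; // the document text
}

interface RerankOptionsSketch<Model> {
  model: Model; // a reranker model, e.g. from transformers.reranker(...)
  query: string;
  documents: string[];
  topK?: number; // omitted in the cancellation example, so assumed optional
  abortSignal?: AbortSignal; // omitted in the basic example, so assumed optional
}

interface RerankResultSketch {
  results: RankedDocumentSketch[];
}

// A value that satisfies the sketched result shape.
const exampleResult: RerankResultSketch = {
  results: [{ index: 0, score: 0.95, text: 'Machine learning is a type of...' }],
};
```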

Cancellation Support

All reranking operations support AbortSignal for cancellation:

const controller = new AbortController();

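// Cancel after 5 seconds
const timeout = setTimeout(() => controller.abort(), 5000);

try {
  const { results } = await rerank({
    model: rerankerModel,
    query: 'What is AI?',
    documents: largeDocumentSet,
    abortSignal: controller.signal,
  });
} catch (error) {
  // Catch variables are `unknown` under strict TypeScript, so narrow first
  if (error instanceof Error && error.name === 'AbortError') {
    console.log('Reranking was cancelled');
  } else {
    throw error; // do not swallow unrelated failures
  }
} finally {
  clearTimeout(timeout); // avoid a stray abort after reranking has finished
}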
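
Code that does its own scoring work usually has to cooperate with the signal by checking it between units of work. The sketch below shows that pattern with a hypothetical `scoreAll` helper built on the standard `AbortSignal.throwIfAborted()` method. Whether LocalMode passes a signal through to custom rerankers is not stated here, so treat the wiring as an assumption.

```typescript
// Hypothetical scoring loop that honours an AbortSignal between documents.
function scoreAll(
  documents: string[],
  score: (doc: string) => number,
  signal?: AbortSignal,
): number[] {
  const scores: number[] = [];
  for (const doc of documents) {
    // Throws signal.reason (an "AbortError" DOMException by default) once aborted.
    signal?.throwIfAborted();
    scores.push(score(doc));
  }
  return scores;
}

// Normal path: no signal, every document is scored.
const completed = scoreAll(['a', 'bb'], (d) => d.length);

// Cancelled path: abort before starting, to show the early exit.
const abortedController = new AbortController();
abortedController.abort();

let wasCancelled = false;
try {
  scoreAll(['a', 'bb'], (d) => d.length, abortedController.signal);
} catch (error) {
  wasCancelled = error instanceof Error && error.name === 'AbortError';
}
```

Checking once per document keeps the overhead negligible while bounding how long a cancelled call keeps running to the cost of scoring a single document.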

Custom Reranker Implementation

Implement the RerankerModel interface to create custom rerankers:

import type { RerankerModel, DoRerankOptions, DoRerankResult } from '@localmode/core';

class MyCustomReranker implements RerankerModel {
  readonly modelId = 'custom:my-reranker';
  readonly provider = 'custom';

  async doRerank(options: DoRerankOptions): Promise<DoRerankResult> {
    const { query, documents, topK } = options;
    const startedAt = Date.now();

    // Score every document against the query, keeping its original position
    const scored = documents.map((doc, index) => ({
      index,
      score: this.scoreDocument(query, doc),
      text: doc,
    }));

    // Sort by score descending
    scored.sort((a, b) => b.score - a.score);

    // Apply topK only when it was provided (an explicit 0 yields no results)
    const results = topK != null ? scored.slice(0, topK) : scored;

    return {
      results,
      usage: {
        // Character count as a rough stand-in for a real token count
        inputTokens: query.length + documents.join('').length,
        durationMs: Date.now() - startedAt,
      },
    };
  }

  private scoreDocument(query: string, document: string): number {
    // Implement your scoring logic
    return 0.5;
  }
}
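
The `scoreDocument` placeholder in the class above returns a constant. As one possible way to fill it in, here is a minimal sketch of a lexical scorer based on Jaccard similarity over word sets. `uniqueTokens` and `jaccardScore` are hypothetical helpers, and this is a simple baseline for illustration rather than a substitute for a trained reranker.

```typescript
// Split text into lowercase alphanumeric tokens, without duplicates.
function uniqueTokens(text: string): string[] {
  const tokens: string[] = text.toLowerCase().match(/[a-z0-9]+/g) || [];
  return tokens.filter((t, i) => tokens.indexOf(t) === i);
}

// Jaccard similarity: shared tokens divided by all distinct tokens, in [0, 1].
function jaccardScore(query: string, document: string): number {
  const q = uniqueTokens(query);
  const d = uniqueTokens(document);
  if (q.length === 0 && d.length === 0) return 0;
  const shared = q.filter((t) => d.indexOf(t) !== -1).length;
  return shared / (q.length + d.length - shared);
}

const relevant = jaccardScore('machine learning', 'Machine learning is a type of AI');
const unrelated = jaccardScore('machine learning', 'Cooking pasta requires boiling water');
```

In the class above, `scoreDocument` could simply return `jaccardScore(query, document)`. Because the score only depends on exact word matches, it misses synonyms and paraphrases.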