Threshold Calibration
Automatically determine optimal similarity thresholds for search and use per-model presets.
Overview
Different embedding models produce different similarity score distributions. A cosine similarity of 0.7 might represent a strong match for one model but a weak match for another. Choosing the right threshold for db.search() or semanticSearch() is critical for relevance filtering.
See it in action
Try Model Evaluator and Product Search for working demos of these APIs.
LocalMode provides two complementary features:
calibrateThreshold() -- Empirically calibrates a threshold from your actual corpus data
MODEL_THRESHOLD_PRESETS -- Known-good defaults for popular models when you need an instant answer
Both are entirely optional. Existing search behavior is unchanged when no threshold is provided.
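To make the "optional threshold" behavior concrete, here is a minimal sketch of top-k selection with an optional cutoff. The `ScoredResult` shape and the `topK` helper are hypothetical and only illustrate the idea; they are not LocalMode's actual types or implementation, and the inclusive `>=` comparison is an assumption.

```typescript
// Hypothetical result shape for illustration; not the library's actual type.
interface ScoredResult {
  id: string;
  score: number;
}

// Return the k highest-scoring results, optionally dropping low scores first.
// With no threshold, this is plain top-k selection.
function topK(results: ScoredResult[], k: number, threshold?: number): ScoredResult[] {
  const sorted = [...results].sort((a, b) => b.score - a.score);
  // Assumption: the cutoff is inclusive (score >= threshold).
  const kept = threshold === undefined ? sorted : sorted.filter((r) => r.score >= threshold);
  return kept.slice(0, k);
}

const candidates: ScoredResult[] = [
  { id: 'a', score: 0.82 },
  { id: 'b', score: 0.55 },
  { id: 'c', score: 0.31 },
];

const unfiltered = topK(candidates, 3); // all three results
const filtered = topK(candidates, 3, 0.5); // only 'a' and 'b'
console.log(unfiltered.length, filtered.map((r) => r.id));
```

Omitting the third argument leaves the result set identical to an unthresholded search, which is the property the paragraph above describes.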
Quick Start
import { calibrateThreshold, semanticSearch } from '@localmode/core';
import { transformers } from '@localmode/transformers';
const model = transformers.embedding('Xenova/bge-small-en-v1.5');
const corpus = ['document 1 text...', 'document 2 text...', /* ... */];
const { threshold } = await calibrateThreshold({
model,
corpus,
percentile: 90, // Use the 90th percentile of pairwise similarity as the threshold
});
// Use the calibrated threshold for search
const { results } = await semanticSearch({
db,
model,
query: 'How to configure auth?',
threshold,
});

// Alternative: look up a per-model preset instead of calibrating
import { getDefaultThreshold, semanticSearch } from '@localmode/core';
import { transformers } from '@localmode/transformers';
const model = transformers.embedding('Xenova/bge-small-en-v1.5');
const threshold = getDefaultThreshold('Xenova/bge-small-en-v1.5');
// 0.5
const { results } = await semanticSearch({
db,
model,
query: 'How to configure auth?',
threshold, // undefined-safe: omitted if model not in presets
});
calibrateThreshold()
Embeds a corpus sample, computes all pairwise similarity scores, and returns the score at a configurable percentile.
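As a rough mental model of the pairwise step, the sketch below computes cosine similarity for every unordered pair of already-embedded vectors and summarizes the scores. The helper names (`cosine`, `pairwiseScores`, `summarize`) are illustrative, and the use of population standard deviation is an assumption; the library's internals may differ. The actual API call follows.

```typescript
// Cosine similarity of two equal-length vectors.
// Note: returns NaN if either vector has zero magnitude.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every unordered pair (i < j): n * (n - 1) / 2 scores in total.
function pairwiseScores(vectors: number[][]): number[] {
  const scores: number[] = [];
  for (let i = 0; i < vectors.length; i++) {
    for (let j = i + 1; j < vectors.length; j++) {
      scores.push(cosine(vectors[i], vectors[j]));
    }
  }
  return scores;
}

// Mean and population standard deviation of the scores (assumed convention).
function summarize(scores: number[]): { mean: number; stdDev: number } {
  const mean = scores.reduce((sum, x) => sum + x, 0) / scores.length;
  const variance = scores.reduce((sum, x) => sum + (x - mean) ** 2, 0) / scores.length;
  return { mean, stdDev: Math.sqrt(variance) };
}

// Toy 2-D "embeddings" standing in for real model output.
const toyVectors = [[1, 0], [0, 1], [1, 1]];
const toyScores = pairwiseScores(toyVectors);
const toyStats = summarize(toyScores);
console.log(toyScores, toyStats);
```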
import { calibrateThreshold } from '@localmode/core';
const calibration = await calibrateThreshold({
model,
corpus: sampleTexts,
percentile: 90,
maxSamples: 200,
});
console.log(calibration.threshold); // 0.52
console.log(calibration.distribution.mean); // 0.38
console.log(calibration.distribution.stdDev); // 0.12
console.log(calibration.sampleSize); // 200
CalibrateThresholdOptions
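Options shown in the examples on this page:
| Prop | Type |
|---|---|
| model | Embedding model, e.g. from transformers.embedding() |
| corpus | string[] |
| percentile | number (default 90) |
| maxSamples | number (default 200) |
| distanceFunction | 'cosine' \| 'euclidean' \| 'dot' (default 'cosine') |
| abortSignal | AbortSignal |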
ThresholdCalibration (Result)
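Fields shown in the examples on this page:
| Prop | Type |
|---|---|
| threshold | number |
| distribution | ThresholdDistributionStats |
| sampleSize | number |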
ThresholdDistributionStats
Fields shown in the examples on this page:
| Prop | Type |
|---|---|
| mean | number |
| stdDev | number |
Percentile Selection
The percentile parameter controls threshold strictness:
| Percentile | Behavior | Use Case |
|---|---|---|
| 70-80 | Permissive, more results | Exploratory search, broad recall |
| 90 (default) | Balanced | General semantic search |
| 95-99 | Strict, fewer but more precise results | High-precision applications |
The threshold is computed using the nearest-rank method: index = ceil(percentile / 100 * count) - 1, clamped to [0, count - 1].
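The nearest-rank formula can be written out directly. This is a standalone sketch of the formula as stated above, not the library's source; `nearestRank` is an illustrative name.

```typescript
// Nearest-rank percentile: index = ceil(percentile / 100 * count) - 1,
// clamped to [0, count - 1], applied to scores sorted in ascending order.
function nearestRank(scores: number[], percentile: number): number {
  if (scores.length === 0) {
    throw new Error('scores must not be empty');
  }
  const sorted = [...scores].sort((a, b) => a - b);
  const rawIndex = Math.ceil((percentile / 100) * sorted.length) - 1;
  const index = Math.min(Math.max(rawIndex, 0), sorted.length - 1);
  return sorted[index];
}

const sample = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0];

// index = ceil(0.9 * 10) - 1 = 8, so the 90th percentile is 0.9
const p90 = nearestRank(sample, 90);
console.log(p90);
```

The clamp matters at the extremes: a percentile of 0 would otherwise produce index -1, and it resolves to the smallest score instead.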
MODEL_THRESHOLD_PRESETS
A static map of known-good cosine similarity thresholds for popular models:
import { MODEL_THRESHOLD_PRESETS } from '@localmode/core';
console.log(MODEL_THRESHOLD_PRESETS);
// {
// 'Xenova/bge-small-en-v1.5': 0.5,
// 'Xenova/bge-base-en-v1.5': 0.5,
// 'Xenova/all-MiniLM-L6-v2': 0.68,
// 'Xenova/all-MiniLM-L12-v2': 0.7,
// 'nomic-ai/nomic-embed-text-v1.5': 0.55,
// 'Xenova/gte-small': 0.6,
// 'Xenova/gte-base': 0.6,
// 'Xenova/e5-small-v2': 0.6,
// 'Xenova/paraphrase-MiniLM-L6-v2': 0.72,
// }
Presets are approximate defaults for cosine similarity. For production use with domain-specific data, use calibrateThreshold() for a data-driven threshold.
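One way to combine the two approaches is to prefer a calibrated value when one exists and fall back to a preset otherwise. The sketch below uses a local copy of two preset values so it runs standalone; `resolveThreshold` is a hypothetical helper, not part of @localmode/core.

```typescript
// Local copy of two preset values from the map above, so the sketch is self-contained.
const PRESETS: Record<string, number> = {
  'Xenova/bge-small-en-v1.5': 0.5,
  'Xenova/all-MiniLM-L6-v2': 0.68,
};

// Hypothetical helper: calibrated value first, then preset, then no threshold.
function resolveThreshold(modelId: string, calibrated?: number): number | undefined {
  if (calibrated !== undefined) {
    return calibrated;
  }
  return PRESETS[modelId]; // undefined for models without a preset
}

const fromCalibration = resolveThreshold('Xenova/bge-small-en-v1.5', 0.52);
const fromPreset = resolveThreshold('Xenova/all-MiniLM-L6-v2');
const noThreshold = resolveThreshold('unknown/model');
console.log(fromCalibration, fromPreset, noThreshold);
```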
getDefaultThreshold()
Safe lookup that returns undefined for unknown models:
import { getDefaultThreshold } from '@localmode/core';
const threshold = getDefaultThreshold('Xenova/bge-small-en-v1.5');
// 0.5
const unknown = getDefaultThreshold('unknown/model');
// undefined
This is useful for conditional threshold application:
const threshold = getDefaultThreshold(model.modelId);
const results = await db.search(queryVector, {
k: 10,
...(threshold !== undefined && { threshold }),
});
Distance Functions
By default, calibrateThreshold() uses cosine similarity. You can use other metrics:
// Default -- cosine similarity scores in [-1, 1]
const { threshold } = await calibrateThreshold({
model,
corpus,
distanceFunction: 'cosine',
});

// Euclidean -- scores computed as 1 / (1 + distance), in (0, 1]
const { threshold } = await calibrateThreshold({
model,
corpus,
distanceFunction: 'euclidean',
});

// Dot product -- raw dot product scores (any real number)
const { threshold } = await calibrateThreshold({
model,
corpus,
distanceFunction: 'dot',
});
AbortSignal Support
Calibration supports cancellation via AbortSignal:
const controller = new AbortController();
setTimeout(() => controller.abort(), 10000); // 10s timeout
try {
const { threshold } = await calibrateThreshold({
model,
corpus,
abortSignal: controller.signal,
});
} catch (error) {
if (error.name === 'AbortError') {
console.log('Calibration cancelled');
}
}
React Hook
The useCalibrateThreshold() hook from @localmode/react wraps calibrateThreshold() with React state management:
import { useCalibrateThreshold } from '@localmode/react';
function ThresholdCalibrator({ model, corpus }) {
const {
calibration,
isCalibrating,
error,
calibrate,
cancel,
clearError,
} = useCalibrateThreshold({ model, percentile: 90 });
return (
<div>
<button onClick={() => calibrate(corpus)} disabled={isCalibrating}>
{isCalibrating ? 'Calibrating...' : 'Calibrate Threshold'}
</button>
{isCalibrating && <button onClick={cancel}>Cancel</button>}
{calibration && (
<div>
<p>Threshold: {calibration.threshold.toFixed(4)}</p>
<p>Mean: {calibration.distribution.mean.toFixed(4)}</p>
<p>Samples: {calibration.sampleSize}</p>
</div>
)}
{error && <p>Error: {error.message}</p>}
</div>
);
}
Performance
calibrateThreshold() computes O(n^2) pairwise similarities, capped by maxSamples:
| Samples | Pairs | Pairwise Time | Total (with embedding) |
|---|---|---|---|
| 50 | 1,225 | ~1ms | ~1-3s |
| 100 | 4,950 | ~2ms | ~2-5s |
| 200 (default) | 19,900 | ~5ms | ~3-10s |
| 500 | 124,750 | ~30ms | ~5-20s |
The embedding step dominates runtime. The pairwise computation is negligible for the default maxSamples of 200.
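The pair counts in the table follow from counting unordered pairs, n(n - 1) / 2. A quick check (the `pairCount` helper is illustrative):

```typescript
// Number of unordered pairs among n samples.
function pairCount(n: number): number {
  return (n * (n - 1)) / 2;
}

const counts = [50, 100, 200, 500].map(pairCount);
console.log(counts); // [1225, 4950, 19900, 124750]
```

Doubling the sample size roughly quadruples the number of pairs, which is why maxSamples caps the work.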
Integration with db.search()
The calibrated threshold integrates directly with the existing search API:
import { calibrateThreshold, createVectorDB, semanticSearch } from '@localmode/core';
// 1. Calibrate once at initialization
const { threshold } = await calibrateThreshold({ model, corpus });
// 2. Use with db.search()
const results = await db.search(queryVector, {
k: 10,
threshold, // Only results above this score are returned
});
// 3. Or with semanticSearch()
const { results: semanticResults } = await semanticSearch({
db,
model,
query: 'my search query',
threshold,
});
calibrateThreshold() is purely additive. When no threshold is passed to db.search(), all top-k results are returned as before.
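Because calibration embeds a corpus sample, it is usually worth running it only once per session. A minimal sketch of that pattern, assuming a hypothetical `once` wrapper and a stand-in function in place of the real calibrateThreshold() call:

```typescript
// Hypothetical wrapper: runs an async function at most once and shares the result.
function once<T>(fn: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached !== undefined) {
      return cached;
    }
    const pending = fn().catch((error: unknown) => {
      cached = undefined; // allow a retry after a failed calibration
      throw error;
    });
    cached = pending;
    return pending;
  };
}

// Stand-in for `() => calibrateThreshold({ model, corpus })`.
let calls = 0;
const getThreshold = once(async () => {
  calls += 1;
  return 0.52;
});

// Both calls share one underlying calibration run.
const first = getThreshold();
const second = getThreshold();
void Promise.all([first, second]).then(([a, b]) => console.log(a, b, calls));
```

Concurrent callers receive the same pending promise, so a slow calibration is not started twice.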