Embeddings
Generate embeddings for text and perform semantic search.
Embeddings convert text into numerical vectors that capture semantic meaning. Use them for similarity search, clustering, and RAG applications.
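For instance, checking the semantic similarity of two sentences comes down to embedding both and comparing the vectors. A quick sketch using the embed() and cosineSimilarity() helpers documented below:
import { embed, cosineSimilarity } from '@localmode/core';
import { transformers } from '@localmode/transformers';
const model = transformers.embedding('Xenova/all-MiniLM-L6-v2');
const { embedding: a } = await embed({ model, value: 'The cat sat on the mat.' });
const { embedding: b } = await embed({ model, value: 'A kitten rested on the rug.' });
console.log(cosineSimilarity(a, b)); // closer to 1 means more similar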
embed()
Generate an embedding for a single value:
import { embed } from '@localmode/core';
import { transformers } from '@localmode/transformers';
const model = transformers.embedding('Xenova/all-MiniLM-L6-v2');
const { embedding, usage, response } = await embed({
model,
value: 'Hello, world!',
});
console.log('Dimensions:', embedding.length); // 384
console.log('Tokens:', usage.tokens); // 4
console.log('Model:', response.modelId); // 'Xenova/all-MiniLM-L6-v2'
You can cancel an in-flight request with an AbortSignal:
const controller = new AbortController();
setTimeout(() => controller.abort(), 5000); // Cancel after 5s
const { embedding } = await embed({
model,
value: 'Hello, world!',
abortSignal: controller.signal,
});
EmbedOptions
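A rough sketch of the options shape, inferred from the examples above (the actual type may accept more fields):
interface EmbedOptions {
  model: EmbeddingModel;
  value: string;
  abortSignal?: AbortSignal;
}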
EmbedResult
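Likewise, a sketch of the result shape based on the fields used in the example (details may differ):
interface EmbedResult {
  embedding: Float32Array;
  usage: { tokens: number };
  response: { modelId: string };
}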
embedMany()
Generate embeddings for multiple values efficiently:
import { embedMany } from '@localmode/core';
const { embeddings, usage } = await embedMany({
model,
values: ['Hello', 'World', 'AI', 'Machine Learning'],
});
console.log('Count:', embeddings.length); // 4
console.log('Total tokens:', usage.tokens); // ~8
Track progress on large batches with the onProgress callback:
const { embeddings } = await embedMany({
model,
values: largeArrayOfTexts,
onProgress: (progress) => {
console.log(`Processed ${progress.completed}/${progress.total}`);
},
});
Long-running batches can be cancelled with an AbortSignal:
const controller = new AbortController();
// Cancel after 5 seconds
setTimeout(() => controller.abort(), 5000);
try {
const { embeddings } = await embedMany({
model,
values: largeArray,
abortSignal: controller.signal,
});
} catch (error) {
if (error.name === 'AbortError') {
console.log('Operation cancelled');
}
}
EmbedManyOptions
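Again, a sketch inferred from the examples above rather than the authoritative definition:
interface EmbedManyOptions {
  model: EmbeddingModel;
  values: string[];
  onProgress?: (progress: { completed: number; total: number }) => void;
  abortSignal?: AbortSignal;
}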
streamEmbedMany()
Stream embeddings as they're generated:
import { streamEmbedMany } from '@localmode/core';
const stream = streamEmbedMany({
model,
values: texts,
});
for await (const { index, embedding } of stream) {
console.log(`Embedding ${index}:`, embedding.length);
}
semanticSearch()
Search for semantically similar documents:
import { semanticSearch, createVectorDB } from '@localmode/core';
const db = await createVectorDB({ name: 'docs', dimensions: 384 });
// Add documents to the database first...
const results = await semanticSearch({
db,
model,
query: 'What is machine learning?',
k: 5,
});
results.forEach((result) => {
console.log(`Score: ${result.score.toFixed(3)}`);
console.log(`Text: ${result.metadata.text}`);
});
With Filters
const results = await semanticSearch({
db,
model,
query: 'AI applications',
k: 5,
filter: {
category: { $eq: 'technology' },
year: { $gte: 2023 },
},
});
Options
interface SemanticSearchOptions {
db: VectorDB;
model: EmbeddingModel;
query: string;
k?: number;
filter?: FilterExpression;
abortSignal?: AbortSignal;
}
Distance Functions
Compare vectors directly:
import { cosineSimilarity, euclideanDistance, dotProduct } from '@localmode/core';
const similarity = cosineSimilarity(embedding1, embedding2);
console.log('Similarity:', similarity); // -1 to 1; higher means more similar
const distance = euclideanDistance(embedding1, embedding2);
console.log('Distance:', distance);
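// Note: cosineSimilarity(a, b) is dotProduct(a, b) divided by the product of the
// vectors' magnitudes, so for unit-normalized embeddings it equals the dot product.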
const dot = dotProduct(embedding1, embedding2);
console.log('Dot product:', dot);
Middleware
Wrap embedding models with middleware for caching, logging, etc.:
import { wrapEmbeddingModel, cachingMiddleware, loggingMiddleware } from '@localmode/core';
const baseModel = transformers.embedding('Xenova/all-MiniLM-L6-v2');
const model = wrapEmbeddingModel(baseModel, [
cachingMiddleware({ maxSize: 1000 }),
loggingMiddleware({ logger: console.log }),
]);
// Now all embed calls will be cached and logged
const { embedding } = await embed({ model, value: 'Hello' });
See Middleware for more details.
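As a rough illustration of the caching behavior (assuming, as the name suggests, that cachingMiddleware answers repeated identical inputs from its cache), the second of two identical calls should skip the underlying model:
console.time('first call');
await embed({ model, value: 'repeated query' }); // computed by the model
console.timeEnd('first call');
console.time('second call');
await embed({ model, value: 'repeated query' }); // expected to come from the cache
console.timeEnd('second call');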
Implementing Custom Models
Create your own embedding model by implementing the EmbeddingModel interface:
import type { EmbeddingModel, DoEmbedOptions } from '@localmode/core';
class MyCustomEmbedder implements EmbeddingModel {
readonly modelId = 'custom:my-embedder';
readonly provider = 'custom';
readonly dimensions = 768;
readonly maxEmbeddingsPerCall = 100;
readonly supportsParallelCalls = true;
async doEmbed(options: DoEmbedOptions) {
const { values } = options;
// Your embedding logic here
const embeddings = values.map(() => new Float32Array(768));
return {
embeddings,
usage: { tokens: values.length * 10 },
};
}
}
// Use with core functions
const model = new MyCustomEmbedder();
const { embedding } = await embed({ model, value: 'Hello' });
Best Practices
Performance Tips
- Batch embeddings - Use embedMany() instead of multiple embed() calls
- Use caching - Add cachingMiddleware() for repeated queries
- Choose the right model - Smaller models (MiniLM-L6) are faster, larger ones are more accurate
- Preload models - Load models during app initialization (see the sketch below)
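A minimal preloading sketch, assuming the first embed() call is what triggers model download and initialization (the embeddingModel export and warmUpEmbeddings() helper are just illustrative names):
import { embed } from '@localmode/core';
import { transformers } from '@localmode/transformers';
// create the model once at startup and reuse it for every request
export const embeddingModel = transformers.embedding('Xenova/all-MiniLM-L6-v2');
export async function warmUpEmbeddings() {
  // throwaway call so the weights are loaded before the first real query
  await embed({ model: embeddingModel, value: 'warm-up' });
}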
Recommended Models
| Model | Dimensions | Size | Use Case |
|---|---|---|---|
| Xenova/all-MiniLM-L6-v2 | 384 | ~22MB | General purpose, fast |
| Xenova/all-MiniLM-L12-v2 | 384 | ~33MB | Better accuracy |
| Xenova/paraphrase-multilingual-MiniLM-L12-v2 | 384 | ~117MB | 50+ languages |
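If your corpus isn't English-only, you can trade download size for language coverage by swapping in the multilingual model from the table above, for example:
const model = transformers.embedding('Xenova/paraphrase-multilingual-MiniLM-L12-v2');
const { embedding } = await embed({ model, value: '¿Qué es el aprendizaje automático?' });
console.log(embedding.length); // still 384 dimensions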