Middleware

Extend functionality with caching, logging, retry, and validation.

Middleware lets you extend and modify the behavior of embedding models and vector databases.

See it in action

Try Document Redactor for a working demo of these APIs.

Embedding Model Middleware

Wrap embedding models with middleware:

import { wrapEmbeddingModel, cachingMiddleware, loggingMiddleware } from '@localmode/core';
import { transformers } from '@localmode/transformers';

const baseModel = transformers.embedding('Xenova/bge-small-en-v1.5');

const model = wrapEmbeddingModel(baseModel, [
  cachingMiddleware({ maxSize: 1000 }),
  loggingMiddleware({ logger: console.log }),
]);
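
The `maxSize` option suggests a bounded cache. As a rough illustration of what such a cache could look like, here is a minimal self-contained sketch. It assumes least-recently-used eviction, which this page does not confirm, and `BoundedCache` is a hypothetical name rather than part of `@localmode/core`.

```typescript
// Hypothetical bounded cache; not the library's implementation.
// Assumption: the least recently used entry is evicted when full.
class BoundedCache<V> {
  private readonly entries = new Map<string, V>();
  private readonly maxSize: number;

  constructor(maxSize: number) {
    this.maxSize = maxSize;
  }

  get(key: string): V | undefined {
    const value = this.entries.get(key);
    if (value === undefined) return undefined;
    // Re-insert to mark the key as most recently used
    // (Map iterates in insertion order).
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxSize) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}

const embeddingCache = new BoundedCache<number[]>(2);
embeddingCache.set('a', [1]);
embeddingCache.set('b', [2]);
embeddingCache.get('a'); // 'a' becomes the most recently used entry
embeddingCache.set('c', [3]); // cache is full, so 'b' is evicted
```

A caching middleware built on a structure like this would look up each input text before calling the model and store the embedding afterwards.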

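Available Middleware

The examples on this page use cachingMiddleware, loggingMiddleware, validationMiddleware, retryMiddleware, piiRedactionMiddleware, and encryptionMiddleware.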

Combining Middleware

Stack multiple middleware:

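import { wrapEmbeddingModel, validationMiddleware, piiRedactionMiddleware } from '@localmode/core';
import { cachingMiddleware, retryMiddleware, loggingMiddleware } from '@localmode/core';

// baseModel is the embedding model created in the previous example
const model = wrapEmbeddingModel(baseModel, [
  validationMiddleware({ maxLength: 8192 }), // validate input first
  piiRedactionMiddleware({ patterns: ['email', 'phone'] }), // redact emails and phone numbers
  cachingMiddleware({ maxSize: 1000 }), // cache before the expensive steps
  retryMiddleware({ maxRetries: 3 }), // retry failed calls up to 3 times
  loggingMiddleware({ logger: console.log }), // log last
]);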

Middleware Order

Middleware executes in the order it appears in the array. Place validation first, caching before expensive operations, and logging last.
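
To illustrate why order matters, here is a minimal self-contained sketch of a middleware pipeline. It does not use `@localmode/core`: `EmbedFn`, `Middleware`, `applyMiddleware`, and the toy model are hypothetical, the code is synchronous for brevity, and it assumes the first middleware in the array is the outermost wrapper, which may differ from the library's actual behavior.

```typescript
// Hypothetical types for illustration; not the @localmode/core API.
type EmbedFn = (values: string[]) => number[][];
type Middleware = (next: EmbedFn) => EmbedFn;

// Assumption: the first middleware in the array becomes the outermost wrapper.
function applyMiddleware(base: EmbedFn, middleware: Middleware[]): EmbedFn {
  return middleware.reduceRight<EmbedFn>((next, mw) => mw(next), base);
}

const trace: string[] = [];

// Rejects empty input before anything else runs.
const validate: Middleware = (next) => (values) => {
  trace.push('validate');
  if (values.some((v) => v.length === 0)) throw new Error('empty input');
  return next(values);
};

// Serves repeated inputs from memory so the base model is skipped.
const cache: Middleware = (next) => {
  const store = new Map<string, number[][]>();
  return (values) => {
    trace.push('cache');
    const key = JSON.stringify(values);
    const hit = store.get(key);
    if (hit) return hit;
    const result = next(values);
    store.set(key, result);
    return result;
  };
};

// Toy "model": a one-dimensional embedding equal to the text length.
const baseEmbed: EmbedFn = (values) => {
  trace.push('embed');
  return values.map((v) => [v.length]);
};

const embed = applyMiddleware(baseEmbed, [validate, cache]);
embed(['hello']); // runs validate, cache, embed
embed(['hello']); // runs validate, cache (cache hit, so embed is skipped)
```

With validation first, invalid input never reaches the cache or the model. With the cache ahead of the model, a repeated input skips the expensive call entirely.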

Vector DB Middleware

Wrap vector databases:

import { createVectorDB, wrapVectorDB } from '@localmode/core';

const baseDB = await createVectorDB({ name: 'db', dimensions: 384 });

const db = wrapVectorDB(baseDB, {
  beforeAdd: async (docs) => {
    console.log('Adding', docs.length, 'documents');
    return docs;
  },
  afterAdd: async (docs) => {
    console.log('Added', docs.length, 'documents');
  },
  beforeSearch: async (vector, options) => {
    console.log('Searching with k =', options.k);
    return { vector, options };
  },
  afterSearch: async (results) => {
    console.log('Found', results.length, 'results');
    return results;
  },
  beforeDelete: async (id) => {
    console.log('Deleting', id);
    return id;
  },
  afterDelete: async () => {
    console.log('Deleted');
  },
});

Vector DB Middleware Interface

interface VectorDBMiddleware {
  beforeAdd?: (docs: Document[]) => Promise<Document[]>;
  afterAdd?: (docs: Document[]) => Promise<void>;
  beforeSearch?: (
    vector: Float32Array,
    options: SearchOptions
  ) => Promise<{ vector: Float32Array; options: SearchOptions }>;
  afterSearch?: (results: SearchResult[]) => Promise<SearchResult[]>;
  beforeDelete?: (id: string) => Promise<string>;
  afterDelete?: () => Promise<void>;
  beforeClear?: () => Promise<void>;
  afterClear?: () => Promise<void>;
}
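
As a rough picture of how a wrapper might call these hooks, here is a minimal self-contained sketch for the add path only. `Doc`, `Hooks`, `TinyDB`, `createTinyDB`, and `wrapTinyDB` are hypothetical stand-ins, not the library's types or implementation.

```typescript
// Simplified stand-ins for the library types; field names are assumptions.
interface Doc {
  id: string;
  vector: number[];
}

interface Hooks {
  beforeAdd?: (docs: Doc[]) => Promise<Doc[]>;
  afterAdd?: (docs: Doc[]) => Promise<void>;
}

interface TinyDB {
  add(docs: Doc[]): Promise<void>;
  size(): number;
}

// Toy in-memory store keyed by document id.
function createTinyDB(): TinyDB {
  const docs = new Map<string, Doc>();
  return {
    add: async (batch) => {
      for (const d of batch) docs.set(d.id, d);
    },
    size: () => docs.size,
  };
}

// Hypothetical wrapper: runs beforeAdd, then the real add, then afterAdd.
function wrapTinyDB(db: TinyDB, hooks: Hooks): TinyDB {
  return {
    add: async (batch) => {
      // A before hook may replace the input, so its return value is used.
      const toAdd = hooks.beforeAdd ? await hooks.beforeAdd(batch) : batch;
      await db.add(toAdd);
      if (hooks.afterAdd) await hooks.afterAdd(toAdd);
    },
    size: () => db.size(),
  };
}

// Example hook: drop documents with empty vectors before they are stored.
const tinyDB = wrapTinyDB(createTinyDB(), {
  beforeAdd: async (docs) => docs.filter((d) => d.vector.length > 0),
});
```

Because `beforeAdd` returns the documents to store, a hook can filter or rewrite the batch. `afterAdd` returns nothing, so it can only observe the result.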

Custom Middleware

Create your own middleware:

import { wrapEmbeddingModel } from '@localmode/core';
import type { EmbeddingModelMiddleware } from '@localmode/core';

function myCustomMiddleware(options: { threshold: number }): EmbeddingModelMiddleware {
  return {
    transformParams: async ({ values }) => {
      // Drop inputs whose length does not exceed the threshold
      const filtered = values.filter((v) => v.length > options.threshold);
      return { values: filtered };
    },
    wrapEmbed: async ({ doEmbed, values }) => {
      const start = Date.now();

      // Call the actual embedding function
      const result = await doEmbed({ values });

      // Log how long the embedding call took
      const duration = Date.now() - start;
      console.log(`Embedded ${values.length} values in ${duration}ms`);

      return result;
    },
  };
}

// baseModel is the embedding model created earlier on this page
const model = wrapEmbeddingModel(baseModel, [myCustomMiddleware({ threshold: 10 })]);

Middleware Factory Functions

Factory functions create pre-configured middleware instances:

import {
  createCachingMiddleware,
  createLoggingMiddleware,
  createValidationMiddleware,
  createRetryMiddleware,
  createRateLimitMiddleware,
} from '@localmode/core';

| Factory | Options Type | Description |
| --- | --- | --- |
| createCachingMiddleware(options) | CachingMiddlewareOptions | Caches search results with configurable TTL and max entries |
| createLoggingMiddleware(options) | LoggingMiddlewareOptions | Logs VectorDB operations with custom logger |
| createValidationMiddleware(options) | ValidationMiddlewareOptions | Validates vectors, metadata, and search params |
| createRetryMiddleware(options) | RetryMiddlewareOptions | Retries failed operations with backoff |
| createRateLimitMiddleware(options) | RateLimitMiddlewareOptions | Rate limits operations with token bucket |

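
The rate limit middleware is described as using a token bucket. For readers unfamiliar with the technique, here is a minimal self-contained sketch of one. `TokenBucket` and its parameters are hypothetical and are not taken from `RateLimitMiddlewareOptions`; the clock is injected so the example is deterministic.

```typescript
// Hypothetical token bucket; not the library's implementation.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private readonly now: () => number;

  constructor(capacity: number, refillPerSecond: number, now: () => number = Date.now) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
    this.tokens = capacity; // start with a full bucket
    this.lastRefill = now();
  }

  // Returns true and consumes one token if the operation may proceed.
  tryAcquire(): boolean {
    const current = this.now();
    const elapsedSeconds = (current - this.lastRefill) / 1000;
    // Refill in proportion to elapsed time, never above capacity.
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefill = current;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// Fake clock (milliseconds) so the behavior does not depend on real time.
let fakeTime = 0;
const bucket = new TokenBucket(2, 1, () => fakeTime);

const first = bucket.tryAcquire(); // true: 2 tokens available
const second = bucket.tryAcquire(); // true: 1 token available
const third = bucket.tryAcquire(); // false: bucket is empty
fakeTime = 1000; // one second later, one token has been refilled
const fourth = bucket.tryAcquire(); // true
```

The capacity bounds the size of a burst, while the refill rate bounds the sustained rate of operations.
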
// Example: Create caching middleware with custom TTL
const caching = createCachingMiddleware({
  maxEntries: 1000,
  ttlMs: 60_000, // 1 minute cache
});

// Example: Create retry middleware with backoff
const retry = createRetryMiddleware({
  maxRetries: 3,
  initialDelayMs: 100,
  backoffMultiplier: 2,
});
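
To make the retry options concrete, the sketch below computes the wait before each retry. It assumes plain exponential backoff, where each delay is the previous one multiplied by `backoffMultiplier`. The library may also add jitter or cap the delay, and `backoffDelays` is a hypothetical helper, not a library export.

```typescript
// Hypothetical helper: delay in milliseconds before each retry.
// Assumption: delay = initialDelayMs * backoffMultiplier ^ attempt, with attempt starting at 0.
function backoffDelays(
  maxRetries: number,
  initialDelayMs: number,
  backoffMultiplier: number,
): number[] {
  return Array.from(
    { length: maxRetries },
    (_, attempt) => initialDelayMs * backoffMultiplier ** attempt,
  );
}

// Same values as the retry example above.
const delays = backoffDelays(3, 100, 2); // [100, 200, 400]
```

Under this assumption, the three retries add at most 700 ms of waiting before the operation finally fails.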

Middleware Composition

Combine multiple middleware into a single middleware using composition functions:

import { composeEmbeddingMiddleware, composeVectorDBMiddleware } from '@localmode/core';

Composing Embedding Middleware

import {
  composeEmbeddingMiddleware,
  piiRedactionMiddleware,
  encryptionMiddleware,
  wrapEmbeddingModel,
} from '@localmode/core';

const combined = composeEmbeddingMiddleware([
  piiRedactionMiddleware({ patterns: ['email', 'phone'] }),
  encryptionMiddleware({ key }), // key: an encryption key defined elsewhere in your app
]);

const secureModel = wrapEmbeddingModel(baseModel, [combined]);

Composing VectorDB Middleware

import {
  composeVectorDBMiddleware,
  createCachingMiddleware,
  createLoggingMiddleware,
  createValidationMiddleware,
  wrapVectorDB,
} from '@localmode/core';

const combined = composeVectorDBMiddleware([
  createValidationMiddleware({ maxDimensions: 1536 }),
  createCachingMiddleware({ maxEntries: 500 }),
  createLoggingMiddleware({ logger: console.log }),
]);

// combined is a single middleware object, passed the same way as in the wrapVectorDB example above
const wrappedDB = wrapVectorDB(db, combined);
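
One plausible way composition could work is to chain each hook so that it receives the output of the previous middleware. The sketch below is self-contained and uses hypothetical names (`Item`, `Hooks`, `chain`, `composeHooks`). It assumes hooks run in array order, which may not match how `composeVectorDBMiddleware` orders them.

```typescript
// Simplified stand-ins; not the library's types.
interface Item {
  id: string;
  text: string;
}

interface Hooks {
  beforeAdd?: (docs: Item[]) => Promise<Item[]>;
  afterSearch?: (results: Item[]) => Promise<Item[]>;
}

// Runs the defined functions in order, feeding each the previous output.
function chain<T>(fns: Array<((x: T) => Promise<T>) | undefined>): (x: T) => Promise<T> {
  return async (x) => {
    let current = x;
    for (const fn of fns) {
      if (fn) current = await fn(current);
    }
    return current;
  };
}

// Hypothetical compose: merges many hook objects into one.
function composeHooks(list: Hooks[]): Hooks {
  return {
    beforeAdd: chain(list.map((h) => h.beforeAdd)),
    afterSearch: chain(list.map((h) => h.afterSearch)),
  };
}

// Two single-purpose middleware: trim text, then drop empty documents.
const trimText: Hooks = {
  beforeAdd: async (docs) => docs.map((d) => ({ ...d, text: d.text.trim() })),
};
const dropEmpty: Hooks = {
  beforeAdd: async (docs) => docs.filter((d) => d.text.length > 0),
};

const cleanup = composeHooks([trimText, dropEmpty]);
```

Here the order is significant: trimming first lets `dropEmpty` remove whitespace-only documents, which it would keep if it ran before `trimText`.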

Custom VectorDB Middleware

Implement the VectorDBMiddleware interface to create custom middleware:

import type { VectorDBMiddleware } from '@localmode/core';

function analyticsMiddleware(): VectorDBMiddleware {
  let searchCount = 0;
  let addCount = 0;
  let searchStart = 0;

  // Hook signatures follow the VectorDBMiddleware interface shown earlier
  return {
    beforeSearch: async (vector, options) => {
      searchCount++;
      searchStart = Date.now();
      console.log(`Search #${searchCount}: k=${options.k}`);
      return { vector, options };
    },

    afterSearch: async (results) => {
      // Timing is approximate if several searches overlap
      console.log(`Found ${results.length} results in ${Date.now() - searchStart}ms`);
      return results;
    },

    beforeAdd: async (docs) => {
      addCount += docs.length;
      return docs;
    },

    afterAdd: async () => {
      console.log(`Total documents added: ${addCount}`);
    },
  };
}

Available hooks: beforeAdd, afterAdd, beforeSearch, afterSearch, beforeDelete, afterDelete, beforeClear, afterClear.

Best Practices

Middleware Tips

  1. Order matters - Validation first, caching early, logging last
  2. Keep middleware focused - One concern per middleware
  3. Handle errors - Middleware can throw; handle gracefully
  4. Consider performance - Each middleware adds overhead
  5. Use composition - Use composeVectorDBMiddleware to combine related middleware
  6. Use factories - Prefer createCachingMiddleware() over cachingMiddleware() for configurable instances
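
For the error-handling tip, one option is to isolate non-critical hooks such as logging so that a failure there does not fail the database operation. The sketch below is self-contained; `safeAfterHook` is a hypothetical helper, not part of `@localmode/core`.

```typescript
type AfterHook<A extends unknown[]> = (...args: A) => Promise<void>;

// Hypothetical helper: reports errors from a non-critical hook instead of rethrowing them.
function safeAfterHook<A extends unknown[]>(
  hook: AfterHook<A>,
  onError: (error: unknown) => void,
): AfterHook<A> {
  return async (...args) => {
    try {
      await hook(...args);
    } catch (error) {
      onError(error);
    }
  };
}

const logged: string[] = [];

// Example: a logging hook that throws when nothing was added.
const afterAddLog = safeAfterHook(
  async (count: number) => {
    if (count === 0) throw new Error('nothing added');
    logged.push(`added ${count}`);
  },
  (error) => {
    logged.push(`hook failed: ${(error as Error).message}`);
  },
);
```

Hooks that enforce correctness, such as validation, should still be allowed to throw so that invalid operations are rejected.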

Showcase Apps

| App | Description | Links |
| --- | --- | --- |
| Document Redactor | Differential privacy middleware for embedding models | Demo · Source |
