LocalMode
AI SDK

Overview

Use LocalMode models with the Vercel AI SDK — generateText, streamText, and embed with models running in the browser.

@localmode/ai-sdk

Use the Vercel AI SDK interface you already know with models running entirely in the browser. No servers, no API keys.

Features

  • Use generateText(), streamText(), and embed() from the ai package with local models
  • Swap between local and cloud models by changing one line
  • Full streaming support with local LLMs via WebLLM
  • All inference happens on-device — data never leaves the browser

Installation

pnpm install @localmode/ai-sdk @localmode/core ai
npm install @localmode/ai-sdk @localmode/core ai
yarn add @localmode/ai-sdk @localmode/core ai
bun add @localmode/ai-sdk @localmode/core ai

Plus at least one LocalMode provider:

pnpm install @localmode/webllm
pnpm install @localmode/transformers

Quick Start

Create a Provider

import { createLocalMode } from '@localmode/ai-sdk';
import { webllm } from '@localmode/webllm';
import { transformers } from '@localmode/transformers';

const localmode = createLocalMode({
  models: {
    'llama': webllm.languageModel('Llama-3.2-1B-Instruct-q4f16_1-MLC'),
    'embedder': transformers.embedding('Xenova/bge-small-en-v1.5'),
  },
});

Text Generation

import { generateText } from 'ai';

const { text } = await generateText({
  model: localmode.languageModel('llama'),
  prompt: 'Explain quantum computing in simple terms',
});

Streaming

import { streamText } from 'ai';

const result = streamText({
  model: localmode.languageModel('llama'),
  prompt: 'Write a short story about a robot',
});

let story = '';
for await (const chunk of result.textStream) {
  story += chunk; // render incrementally, e.g. by updating the DOM
}

Embeddings

import { embed } from 'ai';

const { embedding } = await embed({
  model: localmode.embeddingModel('embedder'),
  value: 'Hello world',
});
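Embedding vectors like the one above are typically compared with cosine similarity (the `ai` package also ships a `cosineSimilarity` helper). A self-contained version, for illustration:

```typescript
// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vector length mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Identical vectors score 1; orthogonal vectors score 0.
console.log(cosineSimilarity([1, 0], [1, 0])); // → 1
console.log(cosineSimilarity([1, 0], [0, 1])); // → 0
```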

Provider Pattern

The provider follows the standard AI SDK provider convention. It is callable as a function and also exposes named methods:

// Callable — returns LanguageModelV3
localmode('llama');

// Named methods
localmode.languageModel('llama');
localmode.embeddingModel('embedder');

How It Works

createLocalMode() accepts a map of model IDs to pre-configured LocalMode model instances. When you call localmode.languageModel('llama'), it wraps the LocalMode model as an AI SDK LanguageModelV3 that works with generateText() and streamText().

┌──────────────────────────────────────────┐
│ AI SDK (generateText, streamText, embed) │
└────────────────────┬─────────────────────┘
                     │
            @localmode/ai-sdk
          (adapter / bridge layer)
                     │
┌────────────────────┴─────────────────────┐
│ LocalMode models (webllm, transformers)  │
│ Running entirely in the browser          │
└──────────────────────────────────────────┘
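The registry lookup at the heart of this bridge can be sketched in a few lines. This is an illustrative sketch of the pattern only, not the actual @localmode/ai-sdk internals; the real adapter wraps each model in the AI SDK's model interfaces rather than returning it directly:

```typescript
// Sketch of a registry-based provider: callable as a function and via
// named methods, per the AI SDK provider convention. Illustrative only.
function createProviderSketch<T>(models: Record<string, T>) {
  const languageModel = (id: string): T => {
    const model = models[id];
    if (model === undefined) throw new Error(`Unknown model id: ${id}`);
    return model; // the real adapter would wrap this as a LanguageModelV3
  };
  // Object.assign makes the provider itself callable while also
  // exposing languageModel() as a named method.
  return Object.assign((id: string) => languageModel(id), { languageModel });
}
```

Unknown IDs fail fast at lookup time, which is why a typo in `localmode.languageModel('llama')` surfaces immediately rather than at inference time.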

Switching Between Local and Cloud

One of the key benefits is the ability to swap between local and cloud models without changing your application code:

import { createLocalMode } from '@localmode/ai-sdk';
import { webllm } from '@localmode/webllm';
import { openai } from '@ai-sdk/openai';
import { generateText } from 'ai';

const localmode = createLocalMode({
  models: {
    'llama': webllm.languageModel('Llama-3.2-1B-Instruct-q4f16_1-MLC'),
  },
});

// Local model — runs in the browser, no API key
const model = localmode.languageModel('llama');

// Cloud model — uses the OpenAI API
// const model = openai('gpt-4o');

const { text } = await generateText({ model, prompt: 'Hello' });

Limitations

These limitations apply to the AI SDK adapter layer, not to LocalMode itself. You can always use LocalMode's native API directly for full functionality.

  • No tool calling — Local models (especially small ones) have limited tool-calling ability. The adapter returns text-only content.
  • No structured output / JSON mode — Not supported by LocalMode's current LanguageModel interface.
  • No image generation — LocalMode does not have a generative image model, so ImageModelV3 is not implemented.
  • WebGPU required for LLMs — WebLLM requires WebGPU (Chrome 113+, Edge 113+, Safari 18+).
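Given the WebGPU requirement above, an app can feature-detect before loading a local model and fall back to a cloud model otherwise. A minimal sketch using the standard `navigator.gpu` API (the fallback wiring is an assumption on my part, not part of @localmode/ai-sdk):

```typescript
// Returns true only when a WebGPU adapter is actually available.
async function hasWebGPU(): Promise<boolean> {
  const nav = globalThis.navigator as
    | { gpu?: { requestAdapter(): Promise<unknown | null> } }
    | undefined;
  if (!nav?.gpu) return false;
  try {
    // requestAdapter() resolves to null when no suitable GPU exists.
    return (await nav.gpu.requestAdapter()) !== null;
  } catch {
    return false;
  }
}
```

In application code you would branch on the result, e.g. pick `localmode.languageModel('llama')` when it resolves to true and a cloud model such as `openai('gpt-4o')` otherwise.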

API Reference

createLocalMode(options)

Creates a LocalMode AI SDK provider.

Parameter        Type                                             Description
options.models   Record<string, LanguageModel | EmbeddingModel>   Map of model IDs to LocalMode model instances

Returns a LocalModeProvider with languageModel() and embeddingModel() methods.

LocalModeLanguageModel

Wraps a @localmode/core LanguageModel as an AI SDK LanguageModelV3. Supports doGenerate() and doStream().

LocalModeEmbeddingModel

Wraps a @localmode/core EmbeddingModel as an AI SDK EmbeddingModelV3. Converts Float32Array embeddings to number[] arrays.
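The typed-array conversion mentioned above is a one-liner with `Array.from`; shown standalone here since the AI SDK's `embed()` result exposes plain `number[]` values:

```typescript
// LocalMode embedders produce Float32Array; the AI SDK expects number[].
const raw = new Float32Array([0.25, -0.5, 1.0]);
const embedding: number[] = Array.from(raw);
```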
