LocalMode
AI SDK

Overview

Use LocalMode models with the Vercel AI SDK — generateText, streamText, and embed with models running in the browser.

@localmode/ai-sdk

Use the Vercel AI SDK interface you already know with models running entirely in the browser. No servers, no API keys.

Features

  • Use generateText(), streamText(), and embed() from the ai package with local models
  • Swap between local and cloud models by changing one line
  • Full streaming support with local LLMs via WebLLM
  • All inference happens on-device — data never leaves the browser

Installation

pnpm install @localmode/ai-sdk @localmode/core ai
npm install @localmode/ai-sdk @localmode/core ai
yarn add @localmode/ai-sdk @localmode/core ai
bun add @localmode/ai-sdk @localmode/core ai

Plus at least one LocalMode provider:

pnpm install @localmode/webllm
pnpm install @localmode/transformers

Quick Start

Create a Provider

import { createLocalMode } from '@localmode/ai-sdk';
import { webllm } from '@localmode/webllm';
import { transformers } from '@localmode/transformers';

const localmode = createLocalMode({
  models: {
    'llama': webllm.languageModel('Llama-3.2-1B-Instruct-q4f16_1-MLC'),
    'embedder': transformers.embedding('Xenova/bge-small-en-v1.5'),
  },
});

Text Generation

import { generateText } from 'ai';

const { text } = await generateText({
  model: localmode.languageModel('llama'),
  prompt: 'Explain quantum computing in simple terms',
});

Streaming

import { streamText } from 'ai';

const result = streamText({
  model: localmode.languageModel('llama'),
  prompt: 'Write a short story about a robot',
});

let story = '';
for await (const chunk of result.textStream) {
  story += chunk; // render incrementally, e.g. by updating the DOM
}

Embeddings

import { embed } from 'ai';

const { embedding } = await embed({
  model: localmode.embeddingModel('embedder'),
  value: 'Hello world',
});
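Embedding vectors like the one above are typically compared with cosine similarity (the `ai` package also ships a `cosineSimilarity` helper). A self-contained version, for illustration:

```typescript
// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vector length mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Identical vectors score 1; orthogonal vectors score 0.
console.log(cosineSimilarity([1, 0], [1, 0])); // → 1
console.log(cosineSimilarity([1, 0], [0, 1])); // → 0
```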

Provider Pattern

The provider follows the standard AI SDK provider convention. It is callable as a function and also exposes named methods:

// Callable — returns LanguageModelV3
localmode('llama');

// Named methods
localmode.languageModel('llama');
localmode.embeddingModel('embedder');

How It Works

createLocalMode() accepts a map of model IDs to pre-configured LocalMode model instances. When you call localmode.languageModel('llama'), it wraps the LocalMode model as an AI SDK LanguageModelV3 that works with generateText() and streamText().

┌──────────────────────────────────────────┐
│ AI SDK (generateText, streamText, embed) │
└────────────────────┬─────────────────────┘
                     │
            @localmode/ai-sdk
          (adapter / bridge layer)
                     │
┌────────────────────┴─────────────────────┐
│ LocalMode models (webllm, transformers)  │
│ Running entirely in the browser          │
└──────────────────────────────────────────┘
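The registry lookup at the heart of this bridge can be sketched in a few lines. This is an illustrative sketch of the pattern only, not the actual @localmode/ai-sdk internals; the real adapter wraps each model in the AI SDK's model interfaces rather than returning it directly:

```typescript
// Sketch of a registry-based provider: callable as a function and via
// named methods, per the AI SDK provider convention. Illustrative only.
function createProviderSketch<T>(models: Record<string, T>) {
  const languageModel = (id: string): T => {
    const model = models[id];
    if (model === undefined) throw new Error(`Unknown model id: ${id}`);
    return model; // the real adapter would wrap this as a LanguageModelV3
  };
  // Object.assign makes the provider itself callable while also
  // exposing languageModel() as a named method.
  return Object.assign((id: string) => languageModel(id), { languageModel });
}
```

Unknown IDs fail fast at lookup time, which is why a typo in `localmode.languageModel('llama')` surfaces immediately rather than at inference time.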

Switching Between Local and Cloud

One of the key benefits is the ability to swap between local and cloud models without changing your application code:

import { createLocalMode } from '@localmode/ai-sdk';
import { webllm } from '@localmode/webllm';
import { openai } from '@ai-sdk/openai';
import { generateText } from 'ai';

const localmode = createLocalMode({
  models: {
    'llama': webllm.languageModel('Llama-3.2-1B-Instruct-q4f16_1-MLC'),
  },
});

// Local model — runs in the browser, no API key
const model = localmode.languageModel('llama');

// Cloud model — uses the OpenAI API
// const model = openai('gpt-4o');

const { text } = await generateText({ model, prompt: 'Hello' });

Limitations

These limitations apply to the AI SDK adapter layer, not to LocalMode itself. You can always use LocalMode's native API directly for full functionality.

  • No tool calling — Local models (especially small ones) have limited tool-calling ability. The adapter returns text-only content.
  • No structured output / JSON mode — Not supported by LocalMode's current LanguageModel interface.
  • No image generation — LocalMode does not have a generative image model, so ImageModelV3 is not implemented.
  • WebGPU required for LLMs — WebLLM requires WebGPU (Chrome 113+, Edge 113+, Safari 18+).
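Given the WebGPU requirement above, an app can feature-detect before loading a local model and fall back to a cloud model otherwise. A minimal sketch using the standard `navigator.gpu` API (the fallback wiring is an assumption on my part, not part of @localmode/ai-sdk):

```typescript
// Returns true only when a WebGPU adapter is actually available.
async function hasWebGPU(): Promise<boolean> {
  const nav = globalThis.navigator as
    | { gpu?: { requestAdapter(): Promise<unknown | null> } }
    | undefined;
  if (!nav?.gpu) return false;
  try {
    // requestAdapter() resolves to null when no suitable GPU exists.
    return (await nav.gpu.requestAdapter()) !== null;
  } catch {
    return false;
  }
}
```

In application code you would branch on the result, e.g. pick `localmode.languageModel('llama')` when it resolves to true and a cloud model such as `openai('gpt-4o')` otherwise.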

API Reference

createLocalMode(options)

Creates a LocalMode AI SDK provider.

Parameter        Type                                             Description
options.models   Record<string, LanguageModel | EmbeddingModel>   Map of model IDs to LocalMode model instances

Returns a LocalModeProvider with languageModel() and embeddingModel() methods.

LocalModeLanguageModel

Wraps a @localmode/core LanguageModel as an AI SDK LanguageModelV3. Supports doGenerate() and doStream().

LocalModeEmbeddingModel

Wraps a @localmode/core EmbeddingModel as an AI SDK EmbeddingModelV3. Converts Float32Array embeddings to number[] arrays.
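The typed-array conversion mentioned above is a one-liner with `Array.from`; shown standalone here since the AI SDK's `embed()` result exposes plain `number[]` values:

```typescript
// LocalMode embedders produce Float32Array; the AI SDK expects number[].
const raw = new Float32Array([0.25, -0.5, 1.0]);
const embedding: number[] = Array.from(raw);
```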
