# @localmode/ai-sdk

> Use LocalMode models with the Vercel AI SDK — `generateText`, `streamText`, and `embed` with models running in the browser.

Use the Vercel AI SDK interface you already know with models running entirely in the browser. No servers, no API keys.
## Features

- Use `generateText()`, `streamText()`, and `embed()` from the `ai` package with local models
- Swap between local and cloud models by changing one line
- Full streaming support with local LLMs via WebLLM
- All inference happens on-device — data never leaves the browser
## Installation

```bash
pnpm install @localmode/ai-sdk @localmode/core ai
```

```bash
npm install @localmode/ai-sdk @localmode/core ai
```

```bash
yarn add @localmode/ai-sdk @localmode/core ai
```

```bash
bun add @localmode/ai-sdk @localmode/core ai
```

Plus at least one LocalMode provider:

```bash
pnpm install @localmode/webllm
```

```bash
pnpm install @localmode/transformers
```

## Quick Start

### Create a Provider
```ts
import { createLocalMode } from '@localmode/ai-sdk';
import { webllm } from '@localmode/webllm';
import { transformers } from '@localmode/transformers';

const localmode = createLocalMode({
  models: {
    'llama': webllm.languageModel('Llama-3.2-1B-Instruct-q4f16_1-MLC'),
    'embedder': transformers.embedding('Xenova/bge-small-en-v1.5'),
  },
});
```

### Text Generation
```ts
import { generateText } from 'ai';

const { text } = await generateText({
  model: localmode.languageModel('llama'),
  prompt: 'Explain quantum computing in simple terms',
});
```

### Streaming
```ts
import { streamText } from 'ai';

const result = streamText({
  model: localmode.languageModel('llama'),
  prompt: 'Write a short story about a robot',
});

for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}
```

### Embeddings
```ts
import { embed } from 'ai';

const { embedding } = await embed({
  model: localmode.embeddingModel('embedder'),
  value: 'Hello world',
});
```

## Provider Pattern
The provider follows the standard AI SDK provider convention. It is callable as a function and also exposes named methods:

```ts
// Callable — returns a LanguageModelV3
localmode('llama');

// Named methods
localmode.languageModel('llama');
localmode.embeddingModel('embedder');
```

## How It Works
`createLocalMode()` accepts a map of model IDs to pre-configured LocalMode model instances. When you call `localmode.languageModel('llama')`, it wraps the LocalMode model as an AI SDK `LanguageModelV3` that works with `generateText()` and `streamText()`.
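The lookup-and-wrap step can be sketched as follows. The types below are illustrative stand-ins, not LocalMode's actual internals or the real `LanguageModelV3` interface:

```ts
// Illustrative sketch: a provider maps model IDs to local models and
// exposes each one behind an AI SDK-style doGenerate() call.
interface LocalModel {
  generate(prompt: string): Promise<string>;
}

interface WrappedModel {
  modelId: string;
  doGenerate(prompt: string): Promise<{ text: string }>;
}

function createProvider(models: Record<string, LocalModel>) {
  return function languageModel(id: string): WrappedModel {
    const model = models[id];
    if (!model) throw new Error(`Unknown model id: ${id}`);
    return {
      modelId: id,
      // Bridge the local model's own API to the AI SDK call shape.
      doGenerate: async (prompt) => ({ text: await model.generate(prompt) }),
    };
  };
}
```

The real adapter also maps prompts, usage, and finish reasons between the two interfaces; this sketch only shows the registration and wrapping idea.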
```
┌──────────────────────────────────────────┐
│ AI SDK (generateText, streamText, embed) │
└────────────────────┬─────────────────────┘
                     │
            @localmode/ai-sdk
         (adapter / bridge layer)
                     │
┌────────────────────┴─────────────────────┐
│ LocalMode models (webllm, transformers)  │
│ Running entirely in the browser          │
└──────────────────────────────────────────┘
```

## Switching Between Local and Cloud
One of the key benefits is the ability to swap between local and cloud models without changing your application code:
```ts
import { createLocalMode } from '@localmode/ai-sdk';
import { openai } from '@ai-sdk/openai';
import { generateText } from 'ai';

// Local model — runs in the browser, no API key
const model = localmode.languageModel('llama');

// Cloud model — uses OpenAI API
// const model = openai('gpt-4o');

const { text } = await generateText({ model, prompt: 'Hello' });
```

## Limitations
These limitations apply to the AI SDK adapter layer, not to LocalMode itself. You can always use LocalMode's native API directly for full functionality.
- **No tool calling** — Local models (especially small ones) have limited tool-calling ability. The adapter returns text-only content.
- **No structured output / JSON mode** — Not supported by LocalMode's current `LanguageModel` interface.
- **No image generation** — LocalMode does not have a generative image model, so `ImageModelV3` is not implemented.
- **WebGPU required for LLMs** — WebLLM requires WebGPU (Chrome 113+, Edge 113+, Safari 18+).
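Since WebGPU availability is the main environmental constraint, it can help to feature-detect before loading a WebLLM-backed model. `navigator.gpu` is the standard detection point; the helper below is just a sketch:

```ts
// Returns true when the environment exposes the WebGPU entry point.
// The navigator-like object is a parameter so the check is testable
// outside a browser.
function supportsWebGPU(
  nav: { gpu?: unknown } | undefined = (globalThis as { navigator?: { gpu?: unknown } }).navigator,
): boolean {
  return nav != null && nav.gpu != null;
}
```

In the browser, you might gate model selection on this check and fall back to a cloud model (or an embedding-only workflow via Transformers.js, which does not need WebGPU) when it returns `false`.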
## API Reference

### createLocalMode(options)

Creates a LocalMode AI SDK provider.

| Parameter | Type | Description |
|---|---|---|
| `options.models` | `Record<string, LanguageModel \| EmbeddingModel>` | Map of model IDs to LocalMode model instances |

Returns a `LocalModeProvider` with `languageModel()` and `embeddingModel()` methods.
### LocalModeLanguageModel

Wraps a `@localmode/core` `LanguageModel` as an AI SDK `LanguageModelV3`. Supports `doGenerate()` and `doStream()`.
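A common bridging problem in such an adapter is turning a chunk-callback API into the async-iterable stream that `for await` consumes. The sketch below uses hypothetical types, not LocalMode's real interface, to show the technique:

```ts
// Hypothetical local model that reports tokens through a callback.
type ChunkCallback = (chunk: string) => void;
interface CallbackModel {
  stream(prompt: string, onChunk: ChunkCallback): Promise<void>;
}

// Bridge: buffer callback chunks and yield them from an AsyncGenerator.
// A real adapter would also forward usage and finish information.
async function* toTextStream(model: CallbackModel, prompt: string): AsyncGenerator<string> {
  const buffer: string[] = [];
  let done = false;
  let wake: (() => void) | null = null;
  const finished = model
    .stream(prompt, (chunk) => {
      buffer.push(chunk);
      wake?.(); // wake the consumer if it is waiting
    })
    .then(() => {
      done = true;
      wake?.();
    });
  while (true) {
    while (buffer.length > 0) yield buffer.shift()!;
    if (done) break;
    // Park until the producer pushes a chunk or finishes.
    await new Promise<void>((resolve) => (wake = resolve));
  }
  await finished;
}
```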
### LocalModeEmbeddingModel

Wraps a `@localmode/core` `EmbeddingModel` as an AI SDK `EmbeddingModelV3`. Converts `Float32Array` embeddings to `number[]` arrays.
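The conversion is small but worth noting: every float32 value widens exactly to a JavaScript float64 number, so nothing is lost. A sketch of the idea (the real adapter may differ in details):

```ts
// Many AI SDK consumers expect a plain number[] (e.g. for JSON
// serialization); Array.from copies each float32 element into a
// regular array, widening losslessly to float64.
function toNumberArray(embedding: Float32Array): number[] {
  return Array.from(embedding);
}
```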