Capabilities

LocalMode provides utilities to detect device capabilities and choose appropriate fallbacks.

See it in action

Try LLM Chat and PDF Search for working demos of these APIs.

Full Capability Report

Get a comprehensive report of available features:

import { detectCapabilities } from '@localmode/core';

const capabilities = await detectCapabilities();

Capabilities Object

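
As a rough sketch, these are the capability fields the examples on this page rely on; the full report carries additional feature flags, and exact field names beyond these are inferred from usage:

```typescript
// Capability fields used in the examples on this page (not exhaustive).
interface CapabilitiesSketch {
  webgpu: boolean;
  indexedDB: boolean;
  gpu: { vendor: string; renderer: string } | null;
  memory: { total: number; available: number } | null; // MB
}

const example: CapabilitiesSketch = {
  webgpu: false,
  indexedDB: true,
  gpu: null,
  memory: { total: 8192, available: 4096 },
};
```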

Feature Detection Functions

All functions are available from @localmode/core:

import {
  isWebGPUSupported,
  isWebNNSupported,
  isWASMSupported,
  isWASMSIMDSupported,
  isWASMThreadsSupported,
  isIndexedDBSupported,
  isWebWorkersSupported,
  isSharedArrayBufferSupported,
  isCrossOriginIsolated,
  isOPFSSupported,
  isBroadcastChannelSupported,
  isWebLocksSupported,
  isServiceWorkerSupported,
  isWebCryptoSupported,
} from '@localmode/core';

Complete Reference

| Function | Return Type | Description |
| --- | --- | --- |
| `isWebGPUSupported()` | `Promise<boolean>` | WebGPU for GPU-accelerated inference (async — requests adapter) |
| `isWebNNSupported()` | `boolean` | WebNN for hardware-accelerated ML |
| `isWASMSupported()` | `boolean` | WebAssembly support |
| `isWASMSIMDSupported()` | `boolean` | WASM SIMD instructions (2-4x faster vector ops) |
| `isWASMThreadsSupported()` | `boolean` | WASM multi-threading (requires SharedArrayBuffer) |
| `isIndexedDBSupported()` | `boolean` | IndexedDB for persistent storage |
| `isWebWorkersSupported()` | `boolean` | Web Workers for background processing |
| `isSharedArrayBufferSupported()` | `boolean` | SharedArrayBuffer for shared memory |
| `isCrossOriginIsolated()` | `boolean` | Cross-origin isolation (required for SharedArrayBuffer) |
| `isOPFSSupported()` | `boolean` | Origin Private File System |
| `isBroadcastChannelSupported()` | `boolean` | BroadcastChannel for cross-tab messaging |
| `isWebLocksSupported()` | `boolean` | Web Locks API for cross-tab coordination |
| `isServiceWorkerSupported()` | `boolean` | Service Workers for offline support |
| `isWebCryptoSupported()` | `boolean` | Web Crypto API for encryption |
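
Most of the synchronous checks reduce to guarded probes of browser globals. A minimal sketch of how a few could be implemented, illustrative only and not the library's actual source:

```typescript
// Illustrative sketches of a few synchronous feature probes.
// Real checks may be more thorough (e.g. probing actual behavior).
const isWASMSupported = (): boolean =>
  typeof WebAssembly === 'object' && typeof WebAssembly.instantiate === 'function';

const isIndexedDBSupported = (): boolean => 'indexedDB' in globalThis;

const isBroadcastChannelSupported = (): boolean => 'BroadcastChannel' in globalThis;

const isWebLocksSupported = (): boolean =>
  'navigator' in globalThis && 'locks' in ((globalThis as any).navigator ?? {});
```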

isWebGPUSupported() vs isWebGPUAvailable()

isWebGPUSupported() from @localmode/core is an async capability check that requests a GPU adapter.

isWebGPUAvailable() from @localmode/transformers (or @localmode/webllm) is a provider-specific async check.

Use isWebGPUSupported() for general capability detection. Use isWebGPUAvailable() when working with a specific provider package.

Quick Access Object

The features object provides synchronous access to all checks (except WebGPU which is async):

import { features } from '@localmode/core';

if (features.wasm && features.simd) {
  console.log('Fast WASM execution available');
}

Runtime Detection

import { runtime } from '@localmode/core';

if (runtime.isBrowser) { /* browser environment */ }
if (runtime.isNode) { /* Node.js environment */ }
if (runtime.isWebWorker) { /* Web Worker context */ }
if (runtime.isElectron) { /* Electron app */ }
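
These flags likewise come down to simple environment probes. A rough sketch, assuming detection uses standard globals (not the library's actual source):

```typescript
// Rough sketch of runtime environment detection.
const runtimeSketch = {
  // Browsers expose window and document on the global object.
  isBrowser: 'window' in globalThis && 'document' in globalThis,
  // Node.js exposes process.versions.node.
  isNode:
    'process' in globalThis &&
    typeof (globalThis as any).process?.versions?.node === 'string',
  // Workers have importScripts but no window.
  isWebWorker: 'importScripts' in globalThis && !('window' in globalThis),
};
```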

Device Detection

Detect hardware and environment information:

import {
  detectBrowser,
  detectOS,
  detectDeviceType,
  detectGPU,
  getDeviceInfo,
  getMemoryInfo,
  getHardwareConcurrency,
  getStorageEstimate,
} from '@localmode/core';

| Function | Return Type | Description |
| --- | --- | --- |
| `detectBrowser()` | `{ name: string; version: string; engine: string }` | Browser name, version, and engine |
| `detectOS()` | `{ name: string; version: string }` | Operating system name and version |
| `detectDeviceType()` | `'desktop' \| 'mobile' \| 'tablet' \| 'unknown'` | Device form factor |
| `detectGPU()` | `{ vendor: string; renderer: string } \| null` | GPU vendor and renderer via WebGL |
| `getDeviceInfo()` | `DeviceInfo` | Combined device information |
| `getMemoryInfo()` | `MemoryInfo` | Device memory information |
| `getHardwareConcurrency()` | `number` | Number of logical CPU cores |
| `getStorageEstimate()` | `Promise<{ quota; usage; persisted }>` | Storage quota and usage |

const browser = detectBrowser();
console.log(`${browser.name} ${browser.version} (${browser.engine})`);

const gpu = detectGPU();
if (gpu) {
  console.log(`GPU: ${gpu.vendor} — ${gpu.renderer}`);
}

const cores = getHardwareConcurrency();
console.log(`CPU cores: ${cores}`);

const storage = await getStorageEstimate();
console.log(`Storage: ${storage?.usage}/${storage?.quota} bytes used`);

Capability Reports

Generate a comprehensive browser capability summary:

import { createCapabilityReport, formatCapabilityReport } from '@localmode/core';

const report = await createCapabilityReport();
const formatted = formatCapabilityReport(report);
console.log(formatted);

The report includes browser info, feature support, hardware details, and storage status — useful for diagnostics and support.

Model Support Check

Check if a specific model is supported:

import { checkModelSupport } from '@localmode/core';

const support = await checkModelSupport('Llama-3.2-1B-Instruct-q4f16_1-MLC');

if (support.supported) {
  console.log('Model can run on this device');
} else {
  console.log('Issues:', support.issues);
  // ['Insufficient GPU memory', 'WebGPU not available']
}
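
The shape of the result suggests a simple accumulate-issues pattern. A sketch of that pattern with assumed field names (the real check also inspects model metadata such as required backend and memory footprint):

```typescript
// Accumulate-issues pattern behind a support check.
// Field names here are assumptions, not the library's real signature.
interface CapsSketch {
  webgpu: boolean;
  gpuMemoryMB: number;
}

function checkSupportSketch(caps: CapsSketch, requiredMemoryMB: number) {
  const issues: string[] = [];
  if (!caps.webgpu) issues.push('WebGPU not available');
  if (caps.gpuMemoryMB < requiredMemoryMB) issues.push('Insufficient GPU memory');
  // Supported only when no issues were collected.
  return { supported: issues.length === 0, issues };
}

checkSupportSketch({ webgpu: false, gpuMemoryMB: 512 }, 1024);
// { supported: false, issues: ['WebGPU not available', 'Insufficient GPU memory'] }
```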

Get fallback recommendations:

import { getRecommendedFallbacks } from '@localmode/core';

const fallbacks = await getRecommendedFallbacks();

console.log(fallbacks);
// {
//   embedding: 'Xenova/bge-small-en-v1.5',      // Smaller model for limited devices
//   llm: 'SmolLM2-1.7B-Instruct-q4f16_1-MLC', // Compact LLM
//   storage: 'memory',                         // If IndexedDB unavailable
//   compute: 'wasm',                           // If WebGPU unavailable
// }

Capability-Based Model Selection

Choose models based on device capabilities:

import { detectCapabilities } from '@localmode/core';
import { transformers } from '@localmode/transformers';
import { webllm } from '@localmode/webllm';

const capabilities = await detectCapabilities();

// Choose embedding model
const embeddingModel = capabilities.webgpu
  ? transformers.embedding('Xenova/all-MiniLM-L12-v2') // Larger, better
  : transformers.embedding('Xenova/bge-small-en-v1.5'); // Smaller, faster

// Choose LLM
let llm;
if (capabilities.webgpu && (capabilities.memory?.available ?? 0) > 2048) {
  llm = webllm.languageModel('Llama-3.2-3B-Instruct-q4f16_1-MLC');
} else if (capabilities.webgpu) {
  llm = webllm.languageModel('Llama-3.2-1B-Instruct-q4f16_1-MLC');
} else {
  console.warn('WebGPU not available, LLM features disabled');
  llm = null;
}

Browser Compatibility

| Feature | Chrome | Edge | Firefox | Safari |
| --- | --- | --- | --- | --- |
| WebGPU | 113+ | 113+ | Nightly | 18+ |
| WASM | 80+ | 80+ | 75+ | 14+ |
| IndexedDB | ✅ | ✅ | ✅ | ✅* |
| Web Workers | ✅ | ✅ | ✅ | ⚠️ |
| Web Locks | ✅ | ✅ | ✅ | 15.4+ |
| SharedArrayBuffer | ✅** | ✅** | ✅** | ✅** |

Notes

  • * Safari private browsing blocks IndexedDB
  • ** Requires cross-origin isolation headers

Handling Limited Devices

Gracefully handle limited capabilities:

import { detectCapabilities, isWebGPUSupported } from '@localmode/core';

async function initializeAI() {
  const capabilities = await detectCapabilities();
  const features = {
    embeddings: true,
    vectorSearch: true,
    llm: false,
    persistence: true,
  };

  // Check WebGPU for LLM
  if (!(await isWebGPUSupported())) {
    console.warn('WebGPU not available. LLM features disabled.');
    features.llm = false;
  } else if (capabilities.memory && capabilities.memory.available < 1024) {
    console.warn('Low GPU memory. LLM may be slow.');
  } else {
    features.llm = true;
  }

  // Check IndexedDB for persistence
  if (!capabilities.indexedDB) {
    console.warn('IndexedDB not available. Data will not persist.');
    features.persistence = false;
  }

  return features;
}

// Usage
const features = await initializeAI();

if (features.llm) {
  // Show LLM features in UI
} else {
  // Hide or disable LLM features
}

Device Information

Get GPU and memory information:

import { detectCapabilities } from '@localmode/core';

const { gpu, memory } = await detectCapabilities();

if (gpu) {
  console.log('GPU Vendor:', gpu.vendor);
  console.log('GPU Renderer:', gpu.renderer);
}

if (memory) {
  console.log('Total Memory:', memory.total, 'MB');
  console.log('Available Memory:', memory.available, 'MB');
}

Adaptive Batch Size

Compute an optimal batch size for streamEmbedMany() or ingest() based on device hardware:

import { computeOptimalBatchSize } from '@localmode/core';

const { batchSize, reasoning, deviceProfile } = computeOptimalBatchSize({
  taskType: 'embedding',
  modelDimensions: 384,
});

console.log(batchSize);     // e.g., 32 on a 4-core/8GB device
console.log(reasoning);     // Human-readable explanation
console.log(deviceProfile); // { cores, memoryGB, hasGPU, source }

Formula

The batch size is computed as:

Math.max(min, Math.min(max, Math.floor(base * (cores / 4) * (memGB / 8) * gpuMult)))

The formula normalizes against a reference device (4 cores, 8 GB RAM). Devices with more resources scale up linearly; devices with fewer scale down. A 1.5x GPU bonus is applied when a GPU is detected.
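
Plugging in the embedding defaults (base 32, min 4, max 256) makes the scaling concrete; the helper name here is illustrative, not the package's API:

```typescript
// The clamped scaling formula with the embedding defaults.
function batchSizeFor(
  device: { cores: number; memoryGB: number; hasGPU: boolean },
  base = 32,
  min = 4,
  max = 256,
): number {
  const gpuMult = device.hasGPU ? 1.5 : 1; // 1.5x bonus when a GPU is detected
  const raw = base * (device.cores / 4) * (device.memoryGB / 8) * gpuMult;
  return Math.max(min, Math.min(max, Math.floor(raw)));
}

batchSizeFor({ cores: 4, memoryGB: 8, hasGPU: false });  // 32: the reference device
batchSizeFor({ cores: 16, memoryGB: 32, hasGPU: true }); // 256: 768 clamped to max
```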

Task-Type Defaults

| Task Type | Base | Min | Max |
| --- | --- | --- | --- |
| embedding | 32 | 4 | 256 |
| ingestion | 64 | 8 | 512 |

Options

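
The option shape, as far as it appears in the examples on this page (the package may accept additional fields):

```typescript
// Option fields used on this page; additional fields may exist.
interface ComputeOptimalBatchSizeOptionsSketch {
  taskType: 'embedding' | 'ingestion';
  modelDimensions?: number;
  deviceCapabilities?: { cores: number; memoryGB: number; hasGPU: boolean };
}

const opts: ComputeOptimalBatchSizeOptionsSketch = {
  taskType: 'embedding',
  modelDimensions: 384,
};
```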

Device Override for Testing

Pass deviceCapabilities for deterministic results in tests or SSR:

const result = computeOptimalBatchSize({
  taskType: 'embedding',
  modelDimensions: 384,
  deviceCapabilities: { cores: 16, memoryGB: 32, hasGPU: true },
});
// batchSize: 256 (clamped to max)

Integration with streamEmbedMany

Use the adaptiveBatching flag to opt in:

import { streamEmbedMany } from '@localmode/core';

for await (const { embedding, index } of streamEmbedMany({
  model,
  values: largeArray,
  adaptiveBatching: true, // Computes batch size from device capabilities
})) {
  await db.add({ id: `doc-${index}`, vector: embedding });
}

An explicit batchSize always takes precedence over adaptive computation.

Integration with ingest

import { ingest } from '@localmode/core';

await ingest(db, documents, {
  adaptiveBatching: true,
  chunking: { strategy: 'recursive', size: 500 },
});

React Hook

import { useAdaptiveBatchSize } from '@localmode/react';

function BatchConfig() {
  const { batchSize, reasoning, deviceProfile } = useAdaptiveBatchSize({
    taskType: 'embedding',
    modelDimensions: 384,
  });

  return <p>Optimal batch size: {batchSize} ({deviceProfile.cores} cores)</p>;
}

Browser Compatibility

navigator.deviceMemory is only available in Chrome. On Firefox and Safari, the memory factor defaults to 1.0 (assumes 8 GB), producing the base batch size. navigator.hardwareConcurrency may be capped by browser privacy features, which produces conservative (but valid) batch sizes.

Model Registry

The model registry is a curated catalog of popular models across all providers (transformers, webllm, wllama, chrome-ai). Use recommendModels() to get ranked model suggestions for any task based on what the device can actually run.

Get Recommendations

import { detectCapabilities, recommendModels } from '@localmode/core';

const caps = await detectCapabilities();
const recs = recommendModels(caps, { task: 'embedding' });

for (const rec of recs) {
  console.log(`${rec.entry.name} — score: ${rec.score}`);
  console.log(`  Provider: ${rec.entry.provider}`);
  console.log(`  Size: ${rec.entry.sizeMB} MB`);
  console.log(`  Reasons: ${rec.reasons.join(', ')}`);
}

recommendModels() is a synchronous pure function. Call detectCapabilities() first (async) and pass the result.

RecommendationOptions

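
A sketch of the option fields seen in the examples on this page (the real type may have more):

```typescript
// Option fields used on this page; additional fields may exist.
interface RecommendationOptionsSketch {
  task: string;         // a TaskCategory, e.g. 'embedding' or 'generation'
  providers?: string[]; // e.g. ['webllm']
  maxSizeMB?: number;
  limit?: number;
}

const recOpts: RecommendationOptionsSketch = { task: 'generation', limit: 3 };
```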

ModelRecommendation Result

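
A sketch of the result fields seen in the examples on this page (not the full type; `ModelEntry` has additional optional fields such as `dimensions`, `recommendedDevice`, `speedTier`, and `qualityTier`):

```typescript
// Result fields used on this page; not the full type.
interface ModelEntrySketch {
  modelId: string;
  provider: string;
  task: string;
  name: string;
  sizeMB: number;
}

interface ModelRecommendationSketch {
  entry: ModelEntrySketch;
  score: number;
  reasons: string[];
}

const rec: ModelRecommendationSketch = {
  entry: { modelId: 'x/y', provider: 'custom', task: 'embedding', name: 'Y', sizeMB: 50 },
  score: 0.9,
  reasons: ['fits device memory'],
};
```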

Filter by Provider and Size

const recs = recommendModels(caps, {
  task: 'generation',
  providers: ['webllm'],
  maxSizeMB: 1000,
  limit: 3,
});

Register Custom Models

Extend the registry at runtime with registerModel():

import { registerModel, recommendModels } from '@localmode/core';

registerModel({
  modelId: 'custom/my-embedder',
  provider: 'custom',
  task: 'embedding',
  name: 'My Custom Embedder',
  sizeMB: 50,
  dimensions: 384,
  recommendedDevice: 'wasm',
  speedTier: 'fast',
  qualityTier: 'medium',
});

// Custom model now appears in recommendations
const recs = recommendModels(caps, { task: 'embedding' });

Browse the Registry

import { getModelRegistry, DEFAULT_MODEL_REGISTRY } from '@localmode/core';

// Default curated catalog (read-only)
console.log(`${DEFAULT_MODEL_REGISTRY.length} curated models`);

// Full registry (defaults + custom)
const all = getModelRegistry();
const embeddingModels = all.filter(e => e.task === 'embedding');

Registry is a Starting Point

The curated catalog covers popular models from all providers. Use registerModel() to add custom or newly released models without waiting for a core update. Provider packages can also self-register their full catalogs at app startup.

Scoring Algorithm

Recommendations are scored using a weighted composite:

| Factor | Weight | Description |
| --- | --- | --- |
| Device Fit | 50% | Storage headroom, memory headroom, device match (WebGPU/WASM/CPU) |
| Quality Tier | 30% | high > medium > low based on published benchmarks |
| Speed Tier | 20% | fast > medium > slow, with mobile bonus for fast models |

Models that exceed available storage or device memory are excluded entirely.
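
The weights in the table compose into a straightforward weighted sum. A sketch under the assumption that tiers map linearly to numbers (the real mapping may differ):

```typescript
// Weighted composite with the 50/30/20 weights from the table.
// The tier-to-number mapping below is an assumption.
const TIER_VALUE = { high: 1.0, medium: 0.6, low: 0.3, fast: 1.0, slow: 0.3 } as const;

function compositeScore(
  deviceFit: number, // 0..1, from storage/memory headroom and device match
  quality: 'high' | 'medium' | 'low',
  speed: 'fast' | 'medium' | 'slow',
): number {
  return 0.5 * deviceFit + 0.3 * TIER_VALUE[quality] + 0.2 * TIER_VALUE[speed];
}
```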

React Hook

import { useModelRecommendations } from '@localmode/react';

function ModelPicker() {
  const { recommendations, isLoading, error, refresh } = useModelRecommendations({
    task: 'embedding',
    limit: 3,
  });

  if (isLoading) return <p>Detecting device...</p>;
  if (error) return <p>Error: {error.message}</p>;

  return (
    <ul>
      {recommendations.map((rec) => (
        <li key={rec.entry.modelId}>
          {rec.entry.name} — score: {rec.score}
        </li>
      ))}
    </ul>
  );
}

TaskCategory

All supported task categories:

embedding | classification | zero-shot | ner | reranking | generation | translation | summarization | fill-mask | question-answering | speech-to-text | text-to-speech | image-classification | image-captioning | object-detection | segmentation | ocr | document-qa | image-features | image-to-image | multimodal-embedding

Best Practices

Capability Tips

  1. Check early - Detect capabilities at app startup
  2. Provide fallbacks - Always have a fallback for each feature
  3. Inform users - Show warnings for limited functionality
  4. Test everywhere - Test on various devices and browsers
  5. Graceful degradation - Core features should work everywhere
  6. Use recommendModels() - Let the registry pick the best model for the device instead of hard-coding model IDs

Next Steps

Showcase Apps

| App | Description | Links |
| --- | --- | --- |
| LLM Chat | WebGPU detection for optimal model backend selection | Demo · Source |
| PDF Search | Feature detection for WebGPU-accelerated search | Demo · Source |
