
Model Cache

Chunked model storage with background downloads, resumption, and LRU eviction for direct model file loading.

createModelLoader() provides a chunked model cache for downloading large model files directly from any URL — with resumable downloads, LRU eviction, and cross-tab coordination.

See it in action

Try PDF Search for a working demo of model caching with progress tracking.

When to use this vs provider caching: @localmode/transformers and @localmode/webllm manage their own model downloads and caching internally. You do NOT need createModelLoader for models loaded through those providers. Use it when you are loading model files directly from URLs — such as custom ONNX models, self-hosted models, GGUF files, or models for future providers like WebNN.

When to Use

| Scenario | Use createModelLoader? |
| --- | --- |
| Loading models via @localmode/transformers | No — Transformers.js has its own Cache API caching |
| Loading models via @localmode/webllm | No — WebLLM has its own download and cache system |
| Loading raw ONNX files from a custom URL | Yes |
| Self-hosted models on your own CDN (enterprise) | Yes |
| Loading GGUF files for a wllama/llama.cpp provider | Yes |
| Future @localmode/webnn provider (direct ONNX loading) | Yes |
| Any large binary file (>50MB) you need to cache in the browser | Yes |

Overview

Large model files (50MB-4GB) downloaded directly via fetch() have several problems in browsers:

  • No resume — Network interruption means restarting from scratch
  • IndexedDB size limits — Some browsers limit per-value size (~50MB Firefox, ~500MB Safari)
  • No eviction — Storage fills up silently until quota errors
  • Tab conflicts — Multiple tabs can download the same file simultaneously

The model cache solves all of these:

  • Chunked storage — 16MB chunks in IndexedDB (works within all browser limits)
  • Resumable downloads — HTTP Range headers resume from the last complete chunk
  • LRU eviction — Automatically evicts least-recently-used models when storage is low
  • Cross-tab coordination — Web Locks ensure only one tab downloads a model; BroadcastChannel notifies others
  • Offline-aware — Pauses when offline, resumes when back online
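
The chunked-resume behavior above boils down to simple byte arithmetic. The sketch below is an illustration, not the library's internals — chunk bookkeeping in IndexedDB is elided, and `chunkCount`, `resumeOffset`, and `rangeHeader` are hypothetical helper names:

```typescript
// Minimal sketch of chunked resume arithmetic (illustrative only).
const CHUNK_SIZE = 16 * 1024 * 1024; // matches the default 16MB chunkSize

// How many chunks a file of totalBytes occupies
function chunkCount(totalBytes: number, chunkSize: number = CHUNK_SIZE): number {
  return Math.ceil(totalBytes / chunkSize);
}

// Resume from the first missing chunk; every chunk before it is complete.
function resumeOffset(completeChunks: Set<number>, chunkSize: number = CHUNK_SIZE): number {
  let next = 0;
  while (completeChunks.has(next)) next++;
  return next * chunkSize;
}

// HTTP Range header asking the server for the remaining bytes
function rangeHeader(offset: number): string {
  return `bytes=${offset}-`;
}
```

For example, a 100MB file occupies 7 chunks; with chunks 0–2 already stored, the download resumes at byte 50331648 via `Range: bytes=50331648-`.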

Usage

import { createModelLoader } from '@localmode/core';

const loader = await createModelLoader({
  maxCacheSize: '2GB',
  onProgress: (modelId, progress) => {
    console.log(`${modelId}: ${Math.round(progress.progress * 100)}%`);
  },
});

// Download a model file from any URL
await loader.prefetchOne('https://your-cdn.com/models/custom-model.onnx');

// Check if cached
const cached = await loader.isModelCached('custom-model');

// Reassemble cached model as a Blob
const blob = await loader.getBlob('custom-model');
if (blob) {
  // Use the blob: load into ONNX Runtime, pass to WebNN, etc.
  const arrayBuffer = await blob.arrayBuffer();
}

// Evict when no longer needed
await loader.evict('custom-model');

// Clean up
await loader.destroy();

Configuration

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| cacheName | string | 'localmode-model-cache' | IndexedDB database name |
| maxCacheSize | string \| number | '2GB' | Maximum total cache size (supports '500MB', '2GB', etc.) |
| chunkSize | number | 16777216 (16MB) | Size per IndexedDB chunk |
| evictionStrategy | 'lru' | 'lru' | Eviction strategy |
| maxRetries | number | 3 | Retry attempts per failed chunk |
| retryDelayMs | number | 1000 | Base retry delay (exponential backoff) |
| onProgress | function | — | Progress callback |
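
A size string such as '2GB' resolves to a byte count internally. A hypothetical parser illustrating the idea — `parseSize` is NOT part of the public API, and binary units (1 KB = 1024 bytes) are an assumption here:

```typescript
// Hypothetical helper — parseSize is not exported by @localmode/core.
// Assumes binary units (1 KB = 1024 bytes), which may differ from the
// library's actual interpretation.
function parseSize(size: string | number): number {
  if (typeof size === "number") return size; // already bytes
  const match = /^(\d+(?:\.\d+)?)\s*(B|KB|MB|GB)$/i.exec(size.trim());
  if (!match) throw new Error(`Unparseable size: ${size}`);
  const units: Record<string, number> = { B: 1, KB: 1024, MB: 1024 ** 2, GB: 1024 ** 3 };
  return Math.round(parseFloat(match[1]) * units[match[2].toUpperCase()]);
}
```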

React Hook

import { useModelLoader } from '@localmode/react';

function ModelManager() {
  const { downloads, isDownloading, prefetch, cancel, evict, cacheStatus } = useModelLoader({
    maxCacheSize: '2GB',
  });

  return (
    <div>
      <button onClick={() => prefetch([{
        url: 'https://your-cdn.com/models/model.onnx',
        modelId: 'my-model'
      }])}>
        Download Model
      </button>
      {isDownloading && <p>Downloading...</p>}
    </div>
  );
}

For the full React hook API, see the useModelLoader section in the React utilities docs.

Events

The model loader emits events on globalEventBus:

| Event | Data | When |
| --- | --- | --- |
| modelDownloadStart | { modelId, totalBytes } | Download begins |
| modelDownloadProgress | { modelId, downloadedBytes, totalBytes, progress } | After each chunk |
| modelDownloadComplete | { modelId, totalBytes, durationMs } | Download finished |
| modelDownloadError | { modelId, error } | Download failed |

These events are automatically picked up by the DevTools widget if enabled.
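
As an example of consuming these payloads, the modelDownloadProgress shape from the table can drive a progress display. The payload fields come from the table above; the `formatProgress` helper itself is illustrative, not part of the library:

```typescript
// Payload shape taken from the events table; the formatter is illustrative.
interface DownloadProgressEvent {
  modelId: string;
  downloadedBytes: number;
  totalBytes: number;
  progress: number; // 0..1
}

function formatProgress(e: DownloadProgressEvent): string {
  const mb = (n: number): string => (n / (1024 * 1024)).toFixed(1);
  return `${e.modelId}: ${Math.round(e.progress * 100)}% ` +
    `(${mb(e.downloadedBytes)} / ${mb(e.totalBytes)} MB)`;
}
```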

Browser Compatibility

| Browser | Chunked Cache | Resumable | Cross-Tab |
| --- | --- | --- | --- |
| Chrome 80+ | Yes | Yes | Yes |
| Edge 80+ | Yes | Yes | Yes |
| Firefox 75+ | Yes | Yes | Yes |
| Safari 14+ | Yes (except Private Browsing) | Yes | Partial |

Safari Private Browsing blocks IndexedDB — the model loader degrades gracefully, skipping caching without errors.

Related

  • Storage — IndexedDB and memory storage for VectorDB
  • Sync — Cross-tab coordination primitives used by the model cache
  • Network — Network status detection used for offline pause/resume

Showcase Apps

| App | Description | Links |
| --- | --- | --- |
| PDF Search | Model preloading with progress UI and cache management | Demo · Source |
