Text Tasks

MediaPipe text tasks in the browser — language detection across 110 languages, semantic text embeddings, and custom-model text classification.

@localmode/mediapipe wraps @mediapipe/tasks-text to provide three text tasks: language detection, text embeddings, and text classification. All run fully on-device via WebAssembly.

| Task | Factory | Core function | Default model in catalog |
| --- | --- | --- | --- |
| Language detection | mediapipe.languageDetector() | detectLanguage() | language_detector (315 KB) |
| Text embeddings | mediapipe.textEmbedder() | embed() / embedMany() | text_embedder (6.1 MB) |
| Text classification | mediapipe.textClassifier(modelPath) | classify() | none (custom model required) |

Language Detection

The language detector identifies the language of a text string across 110 languages, returning candidate languages with ISO 639-1 codes and confidence scores, sorted by confidence.

import { detectLanguage } from '@localmode/core';
import { mediapipe } from '@localmode/mediapipe';

const { languages, usage } = await detectLanguage({
  model: mediapipe.languageDetector(),
  text: 'Bonjour le monde, comment allez-vous?',
  maxResults: 3,
  minConfidence: 0.1,
});

for (const lang of languages) {
  console.log(`${lang.languageCode}: ${lang.confidence.toFixed(3)}`);
}
// e.g. fr: 0.987

console.log(`Detected in ${usage.durationMs.toFixed(0)}ms`);

Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| model | LanguageDetectionModel | | The model from mediapipe.languageDetector() |
| text | string | | Text to detect the language of |
| maxResults | number | 5 | Maximum language candidates to return |
| minConfidence | number | 0 | Minimum confidence threshold (0–1) |
| abortSignal | AbortSignal | | Cancellation signal |
| maxRetries | number | 2 | Retry attempts on transient failure |
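
The abortSignal option can be combined with a timer to bound how long a detection may run. The sketch below is one way to do that, not part of the library: withTimeout is a hypothetical helper, and it assumes the wrapped call rejects once its signal is aborted.

```typescript
// Hypothetical helper (not exported by @localmode/core): runs any task that
// accepts an AbortSignal and aborts that signal after `ms` milliseconds.
async function withTimeout<T>(
  run: (signal: AbortSignal) => Promise<T>,
  ms: number,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    // The task is expected to observe the signal and reject when it aborts.
    return await run(controller.signal);
  } finally {
    // Always clear the timer so it cannot fire after the task has settled.
    clearTimeout(timer);
  }
}

// Assumed usage with the options documented above:
//
// const result = await withTimeout(
//   (abortSignal) => detectLanguage({ model, text, abortSignal }),
//   2000,
// );
```

A manual AbortController is used here so the same controller can also be wired to a Cancel button; where only a deadline is needed, the standard AbortSignal.timeout(ms) is a shorter alternative in runtimes that support it.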

Result

DetectLanguageResult contains a languages array of DetectedLanguage:

interface DetectedLanguage {
  /** ISO 639-1 language code (e.g., 'en', 'fr', 'zh') */
  languageCode: string;
  /** Detection confidence (0-1) */
  confidence: number;
}
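
Because the languages array is sorted by confidence, the first entry is the best guess, but that guess may still be weak for very short or mixed-language input. The following is a minimal sketch of a guard around the top candidate. pickLanguage, its 0.5 default threshold, and the 'und' fallback (the BCP 47 tag for an undetermined language) are illustrative assumptions, not part of the library.

```typescript
// Mirrors the DetectedLanguage shape documented above.
interface DetectedLanguage {
  languageCode: string;
  confidence: number;
}

// Hypothetical helper: returns the top candidate's code when it is confident
// enough, otherwise a fallback. Relies on the array being sorted by confidence.
function pickLanguage(
  languages: DetectedLanguage[],
  minConfidence = 0.5, // illustrative threshold, tune for your data
  fallback = 'und',    // BCP 47 "undetermined"
): string {
  const top = languages[0];
  return top && top.confidence >= minConfidence ? top.languageCode : fallback;
}
```

Passing the languages array from a detectLanguage() result to pickLanguage yields a single code that downstream logic can switch on without checking for an empty array.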

To map a code to a human-readable name, @localmode/core exports SUPPORTED_LANGUAGES — an ISO 639-1 code-to-name map:

import { detectLanguage, SUPPORTED_LANGUAGES } from '@localmode/core';
import { mediapipe } from '@localmode/mediapipe';

const { languages } = await detectLanguage({
  model: mediapipe.languageDetector(),
  text: 'こんにちは',
});

const top = languages[0];
console.log(SUPPORTED_LANGUAGES[top.languageCode] ?? top.languageCode);
// e.g. 'Japanese'

React Hook

@localmode/react provides useDetectLanguage:

'use client';

import { useDetectLanguage } from '@localmode/react';
import { mediapipe } from '@localmode/mediapipe';

const model = mediapipe.languageDetector();

export function LanguageDetector() {
  const { data, error, isLoading, execute, cancel } = useDetectLanguage({ model });

  return (
    <div>
      <button onClick={() => execute('Hola, ¿cómo estás?')}>Detect</button>
      {isLoading && <button onClick={cancel}>Cancel</button>}
      {error && <p>{error.message}</p>}
      {data && <p>Top: {data.languages[0]?.languageCode}</p>}
    </div>
  );
}

The hook takes { model } and returns { data, error, isLoading, execute, cancel, reset }.

Text Embeddings

The text embedder produces semantic embedding vectors using the Universal Sentence Encoder — useful for semantic search, clustering, and similarity. It implements the standard EmbeddingModel interface, so it works with embed() and embedMany():

import { embed, embedMany } from '@localmode/core';
import { mediapipe } from '@localmode/mediapipe';

const model = mediapipe.textEmbedder();

// Single embedding
const { embedding } = await embed({
  model,
  value: 'Local-first AI in the browser',
});
console.log(embedding.length); // 100 — Universal Sentence Encoder output dimension

// Batch
const { embeddings } = await embedMany({
  model,
  values: ['First document', 'Second document', 'Third document'],
});

Because it is a standard EmbeddingModel, the text embedder also plugs straight into createVectorDB(), the semantic cache, RAG pipelines, and every other embedding-based core feature.
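
For quick experiments without a vector database, embeddings can be compared directly. The sketch below computes cosine similarity and ranks documents against a query; cosineSimilarity and rankBySimilarity are hypothetical helpers written for this page, and they assume embeddings are plain arrays or typed arrays of equal length.

```typescript
// Cosine similarity between two equal-length vectors.
// Accepts number[] as well as typed arrays such as Float32Array.
function cosineSimilarity(a: ArrayLike<number>, b: ArrayLike<number>): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // A zero vector has no direction; treat its similarity as 0.
  return denom === 0 ? 0 : dot / denom;
}

// Returns document indices with scores, most similar first.
function rankBySimilarity(
  query: ArrayLike<number>,
  docs: ArrayLike<number>[],
): Array<{ index: number; score: number }> {
  return docs
    .map((doc, index) => ({ index, score: cosineSimilarity(query, doc) }))
    .sort((x, y) => y.score - x.score);
}
```

With the results above, rankBySimilarity(embedding, embeddings) would order the three documents by closeness to the single embedded sentence.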

Picking an embedding provider

MediaPipe's text embedder is a compact, zero-config option. For a wider selection of embedding models (multilingual, higher-dimensional, retrieval-tuned), see @localmode/transformers.

Text Classification

mediapipe.textClassifier() implements the standard ClassificationModel interface and works with the core classify() function — but MediaPipe ships no default text classifier model.

A custom model is required

mediapipe.textClassifier() requires an explicit .tflite model URL. Calling it with no path throws a ValidationError. MediaPipe text classification is intended for custom-trained models built with MediaPipe Model Maker — train a classifier on your own labels (e.g. spam vs. not-spam, intent categories), then point the provider at the resulting file.

import { classify } from '@localmode/core';
import { mediapipe } from '@localmode/mediapipe';

// Pass the URL of your Model Maker .tflite file
const model = mediapipe.textClassifier(
  'https://your-cdn.com/models/my-text-classifier.tflite'
);

const { label, score } = await classify({
  model,
  text: 'This product completely changed my workflow!',
});

console.log(`${label}: ${score.toFixed(3)}`);

Calling the factory without a path fails:

// ❌ Throws ValidationError: no default text classifier exists
const model = mediapipe.textClassifier();

If you need ready-made, general-purpose text classification (sentiment, topic, zero-shot), use @localmode/transformers instead — it ships pre-trained ONNX classifiers.
