
Fill-Mask

BERT-style masked token prediction for autocomplete and text suggestions.

Predict masked tokens in text using BERT-style models. Useful for smart autocomplete, word suggestions, and understanding language patterns.

For full API reference (fillMask(), fillMaskMany(), options, result types, and custom providers), see the Core Fill-Mask guide.

See it in action

Try Smart Autocomplete for a working demo.

| Model | Size | Mask Token | Use Case |
| --- | --- | --- | --- |
| onnx-community/ModernBERT-base-ONNX | ~150MB | `[MASK]` | General-purpose, English, modern architecture |
| Xenova/bert-base-cased | ~67MB | `[MASK]` | Case-sensitive predictions |
| Xenova/bert-base-multilingual-cased | ~180MB | `[MASK]` | Multilingual |
| Xenova/roberta-base | ~125MB | `<mask>` | RoBERTa-based (note: uses `<mask>`, not `[MASK]`) |

Smart Autocomplete Example

Based on the Smart Autocomplete showcase app:

import { transformers } from '@localmode/transformers';
import { fillMask } from '@localmode/core';

const model = transformers.fillMask('onnx-community/ModernBERT-base-ONNX');

async function getSuggestions(partialSentence: string) {
  // Create a controller so in-flight requests can be cancelled.
  const controller = new AbortController();
  const textWithMask = `${partialSentence} [MASK]`;

  const { predictions } = await fillMask({
    model,
    text: textWithMask,
    topK: 5,
    abortSignal: controller.signal,
  });

  return predictions.map((p) => ({
    word: p.token,
    confidence: p.score,
    fullText: p.sequence,
  }));
}
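Raw predictions often include punctuation tokens or subword fragments that make poor autocomplete suggestions. A small post-processing step can clean them up before display. This is a sketch independent of the LocalMode API: the `Suggestion` shape and the confidence threshold below are illustrative assumptions, not library settings.

```typescript
interface Suggestion {
  word: string;
  confidence: number;
}

// Keep only word-like, reasonably confident suggestions for the UI.
// The 0.01 threshold is an illustrative default, not a library setting.
function cleanSuggestions(raw: Suggestion[], minConfidence = 0.01): Suggestion[] {
  return raw
    .filter((s) => s.confidence >= minConfidence)
    // Drop pure punctuation and leftover subword markers like "##ing".
    .filter((s) => /^[\p{L}\p{N}']+$/u.test(s.word.trim()))
    .sort((a, b) => b.confidence - a.confidence);
}
```

Run the result of `getSuggestions()` through a filter like this before rendering it in a dropdown.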

Best Practices

Fill-Mask Tips

  1. Check mask token — BERT models expect [MASK], but RoBERTa uses <mask>. Always match the token to the model.
  2. One mask at a time — most models predict a single masked token per call.
  3. Adjust topK — use 3-10 for autocomplete, 1 when you only need the single best prediction.
  4. ModernBERT for suggestions — ModernBERT-base-ONNX provides strong general-purpose predictions.
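The mask-token mismatch in tip 1 is easy to guard against with a small helper. This is a sketch, not part of the LocalMode API — the model-name heuristic below is an assumption that holds for the models in the table above:

```typescript
// Pick the mask token based on the model family.
// Heuristic: RoBERTa-style models use "<mask>", BERT-style models use "[MASK]".
function maskTokenFor(modelId: string): string {
  return /roberta/i.test(modelId) ? "<mask>" : "[MASK]";
}

// Build the input text with the matching mask token appended.
function withTrailingMask(partial: string, modelId: string): string {
  return `${partial} ${maskTokenFor(modelId)}`;
}
```

For example, `withTrailingMask('The weather is', 'Xenova/roberta-base')` yields `The weather is <mask>`, while the same call with a BERT model yields `The weather is [MASK]`.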

Showcase Apps

| App | Description | Links |
| --- | --- | --- |
| Smart Autocomplete | Predict masked words for intelligent text completion | Demo · Source |
