
Fill-Mask

BERT-style masked token prediction for autocomplete and text suggestions.

Predict masked tokens in text using BERT-style models. Useful for smart autocomplete, word suggestions, and understanding language patterns.

For full API reference (fillMask(), fillMaskMany(), options, result types, and custom providers), see the Core Fill-Mask guide.

See it in action

Try Smart Autocomplete for a working demo.

| Model | Size | Mask Token | Use Case |
| --- | --- | --- | --- |
| onnx-community/ModernBERT-base-ONNX | ~150MB | `[MASK]` | General-purpose, English, modern architecture |
| Xenova/bert-base-cased | ~67MB | `[MASK]` | Case-sensitive predictions |
| Xenova/bert-base-multilingual-cased | ~180MB | `[MASK]` | Multilingual |
| Xenova/roberta-base | ~125MB | `<mask>` | RoBERTa-based (note: uses `<mask>`, not `[MASK]`) |

Smart Autocomplete Example

Based on the Smart Autocomplete showcase app:

import { transformers } from '@localmode/transformers';
import { fillMask } from '@localmode/core';

const model = transformers.fillMask('onnx-community/ModernBERT-base-ONNX');

async function getSuggestions(partialSentence: string) {
  // Create a controller so in-flight requests can be cancelled.
  const controller = new AbortController();
  const textWithMask = `${partialSentence} [MASK]`;

  const { predictions } = await fillMask({
    model,
    text: textWithMask,
    topK: 5,
    abortSignal: controller.signal,
  });

  return predictions.map((p) => ({
    word: p.token,
    confidence: p.score,
    fullText: p.sequence,
  }));
}
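Raw predictions often include punctuation tokens or subword fragments that make poor autocomplete suggestions. A small post-processing step can clean them up before display. This is a sketch independent of the LocalMode API: the `Suggestion` shape and the confidence threshold below are illustrative assumptions, not library settings.

```typescript
interface Suggestion {
  word: string;
  confidence: number;
}

// Keep only word-like, reasonably confident suggestions for the UI.
// The 0.01 threshold is an illustrative default, not a library setting.
function cleanSuggestions(raw: Suggestion[], minConfidence = 0.01): Suggestion[] {
  return raw
    .filter((s) => s.confidence >= minConfidence)
    // Drop pure punctuation and leftover subword markers like "##ing".
    .filter((s) => /^[\p{L}\p{N}']+$/u.test(s.word.trim()))
    .sort((a, b) => b.confidence - a.confidence);
}
```

Run the result of `getSuggestions()` through a filter like this before rendering it in a dropdown.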

Best Practices

Fill-Mask Tips

  1. Check mask token — BERT models expect [MASK], but RoBERTa uses <mask>. Always match the token to the model.
  2. One mask at a time — most models predict a single masked token per call.
  3. Adjust topK — use 3-10 for autocomplete, 1 when you only need the single best prediction.
  4. ModernBERT for suggestions — ModernBERT-base-ONNX provides strong general-purpose predictions.
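The mask-token mismatch in tip 1 is easy to guard against with a small helper. This is a sketch, not part of the LocalMode API — the model-name heuristic below is an assumption that holds for the models in the table above:

```typescript
// Pick the mask token based on the model family.
// Heuristic: RoBERTa-style models use "<mask>", BERT-style models use "[MASK]".
function maskTokenFor(modelId: string): string {
  return /roberta/i.test(modelId) ? "<mask>" : "[MASK]";
}

// Build the input text with the matching mask token appended.
function withTrailingMask(partial: string, modelId: string): string {
  return `${partial} ${maskTokenFor(modelId)}`;
}
```

For example, `withTrailingMask('The weather is', 'Xenova/roberta-base')` yields `The weather is <mask>`, while the same call with a BERT model yields `The weather is [MASK]`.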

Showcase Apps

| App | Description | Links |
| --- | --- | --- |
| Smart Autocomplete | Predict masked words for intelligent text completion | Demo · Source |
