Browser AI Tasks & Capabilities

Learn how to run AI tasks entirely in the browser. Guides for embeddings, classification, translation, summarization, and more.

May 15, 2026

Document QA in the Browser

Ask questions about document images - forms, receipts, charts, and reports - using Florence-2 or Donut in the browser.

May 13, 2026

Extractive Question Answering in the Browser

Answer questions by locating the exact answer span in a given context passage - fast, and because every answer is copied verbatim from the passage, the model cannot invent text.

April 23, 2026

Fill-Mask (Masked Language Modeling) in the Browser

Predict missing words in text using ModernBERT - for autocomplete, data augmentation, and text understanding.

May 15, 2026

Image Captioning in the Browser

Generate natural-language descriptions of images using Florence-2 - entirely in the browser.

May 11, 2026

Image Classification in the Browser

Classify images into categories using ViT - identify objects, scenes, and content types in the browser.

April 24, 2026

Image Segmentation in the Browser

Classify every pixel in an image into semantic categories - roads, buildings, people, sky - using SegFormer.

April 22, 2026

Image-to-Image (Super Resolution) in the Browser

Upscale and enhance images using transformer-based super resolution - 2x and 4x upscaling in the browser.

April 3, 2026

Machine Translation in the Browser

Translate text between languages using dedicated translation models or Chrome Built-in AI - offline and private.

April 18, 2026

Multimodal Embeddings (CLIP/SigLIP) in the Browser

Embed text and images into a shared vector space - search photos with words, find similar images, cross-modal retrieval.

May 17, 2026

Named Entity Recognition (NER) in the Browser

Extract named entities - people, organizations, locations, and more - from text automatically in the browser.

May 4, 2026

Object Detection in the Browser

Detect and locate objects in images with bounding boxes using D-FINE - a model of only ~4.5 MB covering 80 object categories.

April 16, 2026

Optical Character Recognition (OCR) in the Browser

Extract text from images using TrOCR, GLM-OCR, and LightOnOCR-2 - process receipts, documents, tables, and formulas in the browser.

May 4, 2026

Search Reranking in the Browser

Re-score and reorder search results with a cross-encoder model that reads each query-document pair together, improving retrieval precision.

May 2, 2026

Speech-to-Text in the Browser

Transcribe spoken audio to text in real time using Moonshine models - entirely offline and private.

May 13, 2026

Text Classification in the Browser

Classify text into categories - sentiment analysis, topic detection, intent recognition, and content moderation.

April 25, 2026

Text Embeddings in the Browser

Convert text into semantic vector representations for similarity search, clustering, and RAG pipelines.

April 28, 2026

Text Generation (LLM Chat) in the Browser

Generate text, answer questions, and have conversations using local LLMs running entirely in the browser.

April 15, 2026

Text Summarization in the Browser

Condense long documents into concise summaries using DistilBART or Chrome Built-in AI.

April 20, 2026

Text-to-Speech in the Browser

Generate natural-sounding speech from text using Kokoro - phonemizer-backed synthesis with 29 English voices, speed control, and streaming playback in the browser.