Vision

React hooks for image captioning, object detection, classification (including zero-shot), segmentation, feature extraction, and image-to-image transformation.

Vision Hooks

See it in action

Try Object Detector and Background Remover for working demos of these hooks.

useCaptionImage

Generate a text caption for an image.

import { useCaptionImage } from '@localmode/react';
import { transformers } from '@localmode/transformers';

const model = transformers.imageCaptioner('Xenova/vit-gpt2-image-captioning');

function Demo() {
  const { data, isLoading, execute } = useCaptionImage({ model });
  // execute(imageDataUrl) => data.caption = "A cat sitting on a couch"
}

useDetectObjects

Detect objects with bounding boxes.

import { useDetectObjects } from '@localmode/react';

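// `model` is an object detection model instance, created outside the component like the captioning model above.
const { data, execute } = useDetectObjects({ model });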
await execute(imageDataUrl);
// data.objects = [{ label: 'person', score: 0.95, box: { x, y, width, height } }]
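
Detection results usually need post-processing before display. The following is a minimal sketch, assuming the result shape shown in the comment above. The `DetectedObject` interface and the `filterDetections` helper are hypothetical names introduced for illustration, not part of `@localmode/react`.

```typescript
// Shape copied from the result comment above; an assumption, not the library's exported type.
interface DetectedObject {
  label: string;
  score: number;
  box: { x: number; y: number; width: number; height: number };
}

// Hypothetical helper: keep detections scoring at least minScore, highest score first.
function filterDetections(objects: DetectedObject[], minScore = 0.5): DetectedObject[] {
  return objects
    .filter((o) => o.score >= minScore)
    .sort((a, b) => b.score - a.score);
}

// Made-up sample data standing in for data.objects.
const sampleObjects: DetectedObject[] = [
  { label: 'chair', score: 0.4, box: { x: 0, y: 0, width: 10, height: 10 } },
  { label: 'dog', score: 0.7, box: { x: 5, y: 5, width: 20, height: 20 } },
  { label: 'person', score: 0.95, box: { x: 30, y: 10, width: 40, height: 80 } },
];

const confident = filterDetections(sampleObjects, 0.6);
// confident holds the 'person' entry first, then 'dog'; 'chair' is dropped.
```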

useClassifyImage

Classify an image into categories.

import { useClassifyImage } from '@localmode/react';

// `model` is an image classification model instance, created outside the component.
const { data, execute } = useClassifyImage({ model });
await execute(imageDataUrl);
// data.label = 'cat', data.score = 0.97

useSegmentImage

Segment an image into regions with masks.

import { useSegmentImage } from '@localmode/react';

// `model` is an image segmentation model instance, created outside the component.
const { data, execute } = useSegmentImage({ model });
await execute(imageDataUrl);
// data.masks = [{ label: 'background', mask: Uint8Array, score: 0.98 }]
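
One way a segmentation mask might be used is to make the masked region transparent, as a background remover would. The sketch below is illustrative only and rests on assumptions the page does not confirm: that `mask` holds one byte per pixel in row-major order, that it matches the image dimensions, and that a nonzero byte means the pixel belongs to the region. `clearMaskedPixels` is a hypothetical helper name.

```typescript
// Hypothetical helper: returns a copy of an RGBA pixel buffer (as in ImageData.data)
// with alpha set to 0 wherever the mask byte is nonzero.
// Assumes one mask byte per pixel, row-major, same dimensions as the image.
function clearMaskedPixels(rgba: Uint8ClampedArray, mask: Uint8Array): Uint8ClampedArray {
  if (rgba.length !== mask.length * 4) {
    throw new Error('Pixel buffer must contain exactly 4 bytes per mask entry');
  }
  const out = new Uint8ClampedArray(rgba); // copy so the input stays untouched
  for (let i = 0; i < mask.length; i++) {
    if (mask[i] !== 0) {
      out[i * 4 + 3] = 0; // alpha channel of pixel i
    }
  }
  return out;
}

// Two opaque pixels; the mask marks only the first one.
const pixels = new Uint8ClampedArray([255, 0, 0, 255, 0, 255, 0, 255]);
const backgroundMask = new Uint8Array([255, 0]);
const cutout = clearMaskedPixels(pixels, backgroundMask);
```

If the model reports masks in a different layout (for example, at a lower resolution than the source image), the mask would need resampling first.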

useClassifyImageZeroShot

Zero-shot image classification with custom labels (no fine-tuning needed).

import { useClassifyImageZeroShot } from '@localmode/react';

// `model` is a zero-shot image classification model instance, created outside the component.
const { data, execute } = useClassifyImageZeroShot({ model });
await execute({ image: imageDataUrl, labels: ['cat', 'dog', 'bird'] });
// data.label = 'cat', data.score = 0.92

useExtractImageFeatures

Extract feature vectors from images for similarity comparison.

import { useExtractImageFeatures } from '@localmode/react';

// `model` is an image feature extraction model instance, created outside the component.
const { data, execute } = useExtractImageFeatures({ model });
await execute(imageDataUrl);
// data.features = Float32Array(768)
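
A common way to compare two feature vectors is cosine similarity. The page does not say which metric the library or its showcase apps use, so treat this as a minimal sketch. It assumes both vectors are `Float32Array` values of equal length, as in the comment above; `cosineSimilarity` is a hypothetical helper name.

```typescript
// Hypothetical helper: cosine similarity between two equal-length feature vectors.
// Returns a value in [-1, 1]; returns 0 if either vector has zero magnitude.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) {
    throw new Error('Feature vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  return denominator === 0 ? 0 : dot / denominator;
}

// Made-up low-dimensional vectors standing in for data.features from two images.
const same = cosineSimilarity(new Float32Array([1, 0]), new Float32Array([1, 0]));
const orthogonal = cosineSimilarity(new Float32Array([1, 0]), new Float32Array([0, 1]));
const scaled = cosineSimilarity(new Float32Array([1, 2, 3]), new Float32Array([2, 4, 6]));
```

Identical or proportionally scaled vectors score close to 1, while orthogonal vectors score 0. A duplicate finder could flag pairs above a chosen threshold.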

useImageToImage

Transform an image into a new image, for example with super-resolution or style transfer.

import { useImageToImage } from '@localmode/react';

// `model` is an image-to-image model instance, created outside the component.
const { data, execute } = useImageToImage({ model });
await execute(imageDataUrl);
// data.image = 'data:image/png;base64,...' (upscaled/transformed image)

All vision hooks take images as data URLs (for example, from FileReader.readAsDataURL); useClassifyImageZeroShot takes the data URL in the image field, alongside labels. For model recommendations, see the Transformers guide.
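
As a sketch of how a data URL might be produced from a user-selected file, the helper below wraps the standard browser `FileReader.readAsDataURL` API in a Promise. `fileToDataUrl` is a hypothetical name, not part of `@localmode/react`, and the code assumes a browser environment where `FileReader` is available.

```typescript
// Hypothetical helper: read a File or Blob into a data URL string.
function fileToDataUrl(file: Blob): Promise<string> {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    // After readAsDataURL succeeds, reader.result is a string such as "data:image/png;base64,...".
    reader.onload = () => resolve(reader.result as string);
    reader.onerror = () => reject(reader.error ?? new Error('Failed to read file'));
    reader.readAsDataURL(file);
  });
}

// Usage sketch inside a file input's change handler, where `execute`
// comes from a hook such as useCaptionImage:
//   const file = event.target.files?.[0];
//   if (file) await execute(await fileToDataUrl(file));
```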

Showcase Apps

App | Description | Links
Object Detector | Detect objects with useDetectObjects | Demo · Source
Background Remover | Segment images with useSegmentImage | Demo · Source
Photo Enhancer | Enhance images with useImageToImage | Demo · Source
Image Captioner | Caption images with useOperationList | Demo · Source
Duplicate Finder | Compare image features with useSequentialBatch | Demo · Source
