Image Classification

Classify images into predefined categories using Vision Transformer (ViT) models. The model returns the top predicted labels with confidence scores.
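As a rough illustration of what "top predicted labels with confidence scores" means, the sketch below turns raw class logits into probabilities with a softmax and keeps the highest-scoring entries. The `LabelScore` shape, the `softmax` and `topK` helpers, and the sample labels and logits are all hypothetical and exist only for this example; they are not the library's actual types or the output of a real model.

```typescript
// Hypothetical result shape for illustration; not the library's actual types.
interface LabelScore {
  label: string;
  score: number; // confidence in [0, 1]
}

// Convert raw logits to probabilities with a numerically stable softmax.
function softmax(logits: number[]): number[] {
  const max = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Return the k highest-scoring labels, sorted by descending confidence.
function topK(labels: string[], logits: number[], k: number): LabelScore[] {
  if (labels.length !== logits.length) {
    throw new Error("labels and logits must have the same length");
  }
  const probs = softmax(logits);
  return labels
    .map((label, i) => ({ label, score: probs[i] }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

// Made-up logits for three classes (an ImageNet classifier would emit 1000).
const sampleLabels = ["tabby cat", "golden retriever", "sports car"];
const sampleLogits = [2.0, 0.5, -1.0];

const predictions = topK(sampleLabels, sampleLogits, 2);
console.log(predictions);
```

Because the scores come from a softmax over all classes, they sum to 1 across the full label set, so a low top score usually signals that the image does not match any category well.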

For the full API reference (classifyImage(), options, result types, and custom providers), see the Core Vision guide.

See it in action

Try Smart Gallery for a working demo.

| Model | Size | Categories | Use Case |
| --- | --- | --- | --- |
| Xenova/vit-base-patch16-224 | ~86MB | 1000 ImageNet classes | General image classification |

ImageNet Classes

ViT models trained on ImageNet classify images into 1000 fixed categories, including animals, vehicles, food, and everyday objects. To classify into custom categories, use Zero-Shot Image Classification with CLIP.

Showcase Apps

| App | Description | Links |
| --- | --- | --- |
| Smart Gallery | Auto-classify gallery photos by content type | Demo · Source |
