OCR
Extract text from images using optical character recognition (OCR) models. Works with any provider that implements the OCRModel interface.
See it in action
Try OCR Scanner for a working demo of these APIs.
extractText()
Extract text from a single image:
import { extractText } from '@localmode/core';
import { transformers } from '@localmode/transformers';

const model = transformers.ocr('Xenova/trocr-small-printed');

const { text, usage } = await extractText({
  model,
  image: imageBlob,
});

console.log(text); // 'Invoice #12345\nDate: 2024-01-15'
console.log(usage.durationMs); // 150

Pass detectRegions: true to also get per-region text, confidence, and bounding boxes:

const { text, regions } = await extractText({
  model,
  image: imageBlob,
  detectRegions: true,
});

regions?.forEach((region) => {
  console.log(`"${region.text}" (confidence: ${region.confidence.toFixed(2)})`);
  if (region.bbox) {
    console.log(`  at (${region.bbox.x}, ${region.bbox.y})`);
  }
});

Pass an abortSignal to cancel a long-running extraction, here after 5 seconds:

const controller = new AbortController();
setTimeout(() => controller.abort(), 5000);

const { text } = await extractText({
  model,
  image: imageBlob,
  abortSignal: controller.signal,
});

ExtractTextOptions
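Options used in the examples on this page: model (the OCRModel to run), image (the input image, such as a Blob), detectRegions (a boolean that requests per-region output), and abortSignal (an AbortSignal for cancellation).
ExtractTextResult
Fields used in the examples on this page: text (the extracted string), usage (includes durationMs), and regions (an optional array of TextRegion).
TextRegion
Fields used in the examples on this page: text, confidence (a number), and bbox (optional, with x and y coordinates).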
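Region output usually needs some post-processing before it is shown to a user. The following is a minimal sketch, not part of the library: it drops low-confidence regions and joins the rest in top-to-bottom, left-to-right order. The TextRegion shape is a local stand-in that assumes only the fields used in the examples on this page, and the helper name confidentTextInReadingOrder and the 0.8 threshold are hypothetical.

```typescript
// Local stand-in for the library's TextRegion type (assumed shape: only the
// fields that appear in the examples on this page).
interface TextRegion {
  text: string;
  confidence: number;
  bbox?: { x: number; y: number };
}

// Hypothetical helper: keep confident regions and join them in reading order.
function confidentTextInReadingOrder(
  regions: TextRegion[] | undefined,
  minConfidence = 0.8, // arbitrary threshold chosen for illustration
): string {
  // Regions without a bounding box sort last and keep their relative order.
  const LAST = Number.MAX_SAFE_INTEGER;
  return (regions ?? [])
    .filter((r) => r.confidence >= minConfidence)
    .sort((a, b) => {
      const dy = (a.bbox?.y ?? LAST) - (b.bbox?.y ?? LAST);
      return dy !== 0 ? dy : (a.bbox?.x ?? LAST) - (b.bbox?.x ?? LAST);
    })
    .map((r) => r.text)
    .join('\n');
}

// Made-up sample data standing in for the `regions` field of a result.
const sampleRegions: TextRegion[] = [
  { text: 'Date: 2024-01-15', confidence: 0.93, bbox: { x: 10, y: 40 } },
  { text: 'smudge', confidence: 0.31, bbox: { x: 5, y: 5 } },
  { text: 'Invoice #12345', confidence: 0.97, bbox: { x: 10, y: 12 } },
];

const readable = confidentTextInReadingOrder(sampleRegions);
console.log(readable);
```

Because regions may be undefined, the helper accepts undefined and returns an empty string in that case, so the result of extractText() can be passed through without an extra check.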
extractTextMany()
Extract text from multiple images:
import { extractTextMany } from '@localmode/core';

const { texts } = await extractTextMany({
  model,
  images: [image1, image2, image3],
});

texts.forEach((t) => console.log(t));

Custom Provider
Implement the OCRModel interface:
import type { OCRModel, DoOCROptions, DoOCRResult } from '@localmode/core';

class MyOCR implements OCRModel {
  readonly modelId = 'custom:my-ocr';
  readonly provider = 'custom';

  async doOCR(options: DoOCROptions): Promise<DoOCRResult> {
    const { images } = options;

    // Your OCR logic here
    const results = images.map(() => ({
      text: 'extracted text',
      regions: [],
    }));

    return {
      results,
      usage: { durationMs: 0 },
    };
  }
}

For recommended models, provider-specific options, and practical recipes, see the Transformers OCR guide.
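As a follow-up to the custom provider skeleton above, here is a hedged sketch of a provider that delegates to a per-image function and reports a measured durationMs instead of a hard-coded 0. To keep it self-contained, the interfaces below are local stand-ins that assume only the fields shown on this page; in a real project you would import OCRModel, DoOCROptions, and DoOCRResult from '@localmode/core', and those types may require more than is shown here. FunctionOCR and Recognize are hypothetical names.

```typescript
// Local stand-ins for the '@localmode/core' types (assumed shapes, limited to
// the fields that appear in the custom provider example above).
interface DoOCROptions {
  images: unknown[];
}
interface OCRResultItem {
  text: string;
  regions: unknown[];
}
interface DoOCRResult {
  results: OCRResultItem[];
  usage: { durationMs: number };
}
interface OCRModel {
  readonly modelId: string;
  readonly provider: string;
  doOCR(options: DoOCROptions): Promise<DoOCRResult>;
}

// Hypothetical per-image recognizer; swap in a call to your own OCR engine.
type Recognize = (image: unknown) => Promise<string>;

class FunctionOCR implements OCRModel {
  readonly modelId = 'custom:function-ocr';
  readonly provider = 'custom';
  private readonly recognize: Recognize;

  constructor(recognize: Recognize) {
    this.recognize = recognize;
  }

  async doOCR(options: DoOCROptions): Promise<DoOCRResult> {
    const start = Date.now();
    const results: OCRResultItem[] = [];
    // Process images sequentially so results stay in input order.
    for (const image of options.images) {
      results.push({ text: await this.recognize(image), regions: [] });
    }
    return { results, usage: { durationMs: Date.now() - start } };
  }
}

// Usage with a fake recognizer, e.g. as a stand-in model in unit tests.
const fakeOCR = new FunctionOCR(async (image) => `text for ${String(image)}`);
const fakeRun = fakeOCR.doOCR({ images: ['a.png', 'b.png'] });
fakeRun.then((out) => console.log(out.results.map((r) => r.text)));
```

Returning one result per input image, in the same order, mirrors the images.map(...) pattern in the skeleton. A fake recognizer like this one can be handy for testing code that consumes OCR output without loading a real model.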