Document QA
Answer questions about document images and structured tables using document QA models. These models understand visual document layouts and can extract information from invoices, forms, receipts, and tables.
See it in action
Try Invoice QA for a working demo of these APIs.
askDocument()
Answer a question about a document image:
import { askDocument } from '@localmode/core';
import { transformers } from '@localmode/transformers';

const model = transformers.documentQA('onnx-community/Florence-2-base-ft');

const { answer, score } = await askDocument({
  model,
  document: invoiceImage,
  question: 'What is the total amount?',
});

console.log(answer); // '$1,234.56'
console.log(score); // 0.95

To cancel a request that takes too long, pass an abortSignal:

const controller = new AbortController();
setTimeout(() => controller.abort(), 10000); // abort after 10 seconds

const { answer } = await askDocument({
  model,
  document: invoiceImage,
  question: 'What is the total amount?',
  abortSignal: controller.signal,
});

AskDocumentOptions
Prop
Type
AskDocumentResult
Prop
Type
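The score returned by askDocument() can be used to discard low-confidence answers. The sketch below is one way to do that. It assumes that score is a confidence value where higher is better (the 0.95 in the example above suggests a 0 to 1 range, but that range is an assumption). The names ScoredAnswer and acceptAnswer are hypothetical and not part of @localmode/core.

```typescript
// Hypothetical helper, not part of @localmode/core.
// Mirrors only the two result fields used in the askDocument() example.
interface ScoredAnswer {
  answer: string;
  score: number;
}

// Returns the answer when the score meets the threshold, otherwise null.
// Assumption: a higher score means higher confidence.
function acceptAnswer(result: ScoredAnswer, minScore = 0.5): string | null {
  return result.score >= minScore ? result.answer : null;
}

// Usage with literal values shaped like the askDocument() result.
const confident = acceptAnswer({ answer: '$1,234.56', score: 0.95 }, 0.8);
const uncertain = acceptAnswer({ answer: '$1,234.56', score: 0.42 }, 0.8);

console.log(confident); // '$1,234.56'
console.log(uncertain); // null
```

A suitable threshold depends on the model and the documents, so the 0.8 used here is only a placeholder.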
askTable()
Answer a question about structured table data:
import { askTable } from '@localmode/core';

const { answer, cells, aggregator } = await askTable({
  model: tableQAModel,
  table: {
    headers: ['Product', 'Price', 'Quantity'],
    rows: [
      ['Widget A', '$10', '100'],
      ['Widget B', '$25', '50'],
      ['Widget C', '$15', '75'],
    ],
  },
  question: 'What is the most expensive product?',
});

console.log(answer); // 'Widget B'
console.log(cells); // ['Widget B']
console.log(aggregator); // 'NONE'

AskTableOptions
Prop
Type
AskTableResult
Prop
Type
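The askTable() example passes the table as string headers plus rows of string cells. Application data is often an array of objects instead, so a small conversion step can help. The sketch below is one possible approach. TableData and toTable are hypothetical names, not part of @localmode/core, and the sketch assumes every cell should be stringified as in the example.

```typescript
// Shape taken from the askTable() example: string headers and string rows.
interface TableData {
  headers: string[];
  rows: string[][];
}

// Hypothetical helper (not part of @localmode/core): converts an array of
// records into the headers/rows shape, stringifying every cell.
function toTable(records: Array<Record<string, string | number>>): TableData {
  const first = records[0];
  if (!first) {
    return { headers: [], rows: [] };
  }
  // Use the keys of the first record as the column order.
  const headers = Object.keys(first);
  const rows = records.map((record) =>
    headers.map((header) => String(record[header] ?? '')),
  );
  return { headers, rows };
}

const table = toTable([
  { Product: 'Widget A', Price: '$10', Quantity: 100 },
  { Product: 'Widget B', Price: '$25', Quantity: 50 },
]);

console.log(table.headers); // ['Product', 'Price', 'Quantity']
console.log(table.rows[1]); // ['Widget B', '$25', '50']
```

Keys missing from later records become empty strings, which keeps every row the same length as the header list.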
Custom Provider
Implement the DocumentQAModel or TableQAModel interface:
import type { DocumentQAModel, DoAskDocumentOptions, DoAskDocumentResult } from '@localmode/core';

class MyDocumentQA implements DocumentQAModel {
  readonly modelId = 'custom:my-doc-qa';
  readonly provider = 'custom';

  async doAskDocument(options: DoAskDocumentOptions): Promise<DoAskDocumentResult> {
    const { document, question } = options;
    // Your document QA logic here
    return {
      answer: 'extracted answer',
      score: 0.9,
      usage: { durationMs: 0 },
    };
  }
}

import type { TableQAModel, DoAskTableOptions, DoAskTableResult } from '@localmode/core';

class MyTableQA implements TableQAModel {
  readonly modelId = 'custom:my-table-qa';
  readonly provider = 'custom';

  async doAskTable(options: DoAskTableOptions): Promise<DoAskTableResult> {
    const { table, question } = options;
    // Your table QA logic here
    return {
      answer: 'computed answer',
      cells: ['cell1'],
      aggregator: 'NONE',
      score: 0.85,
      usage: { durationMs: 0 },
    };
  }
}

For recommended models, provider-specific options, and practical recipes, see the Transformers Document QA guide.
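To make the custom provider skeleton more concrete, here is a toy table QA provider that replaces the "Your table QA logic here" placeholder with simple rule-based logic. It is a sketch only. The types TableData and TableAnswer are simplified local stand-ins for the types exported by @localmode/core (the real definitions may contain more fields), and maxByColumn, parseNumber, and MaxPriceTableQA are hypothetical names.

```typescript
// Simplified local stand-ins for the @localmode/core types used in the
// custom provider examples. The real definitions may have more fields.
interface TableData {
  headers: string[];
  rows: string[][];
}
interface TableAnswer {
  answer: string;
  cells: string[];
  aggregator: string;
  score: number;
  usage: { durationMs: number };
}

// Parse a cell such as '$25' or '1,200' into a number (NaN if not numeric).
function parseNumber(cell: string): number {
  const cleaned = cell.replace(/[^0-9.-]/g, '');
  return cleaned === '' ? NaN : Number(cleaned);
}

// Toy logic: return the first-column value of the row holding the largest
// number in `column`. A real model would interpret the question itself.
function maxByColumn(table: TableData, column: string): TableAnswer {
  const start = Date.now();
  const col = table.headers.indexOf(column);
  let best: string[] | null = null;
  let bestValue = -Infinity;
  for (const row of table.rows) {
    const value = col >= 0 ? parseNumber(row[col] ?? '') : NaN;
    if (!Number.isNaN(value) && value > bestValue) {
      best = row;
      bestValue = value;
    }
  }
  const answer = best?.[0] ?? '';
  return {
    answer,
    cells: best ? [answer] : [],
    aggregator: 'NONE',
    score: best ? 1 : 0, // rule-based, so the score is all-or-nothing
    usage: { durationMs: Date.now() - start },
  };
}

class MaxPriceTableQA {
  readonly modelId = 'custom:max-price';
  readonly provider = 'custom';

  async doAskTable(options: { table: TableData; question: string }): Promise<TableAnswer> {
    // The question is ignored in this toy; the column is fixed to 'Price'.
    return maxByColumn(options.table, 'Price');
  }
}

const demo = maxByColumn(
  {
    headers: ['Product', 'Price', 'Quantity'],
    rows: [
      ['Widget A', '$10', '100'],
      ['Widget B', '$25', '50'],
      ['Widget C', '$15', '75'],
    ],
  },
  'Price',
);
console.log(demo.answer); // 'Widget B'
```

Keeping the logic in a plain synchronous function makes it easy to unit test separately from the async doAskTable() wrapper.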
Next Steps
Transformers Document QA
Recommended document QA models and usage.
OCR
Extract text from images.
Question Answering
Extractive QA from text context.