Document QA

Answer questions about document images and structured tables using document QA models. These models understand visual document layouts and can extract information from invoices, forms, receipts, and tables.

See it in action

Try Invoice QA for a working demo of these APIs.

askDocument()

Answer a question about a document image:

import { askDocument } from '@localmode/core';
import { transformers } from '@localmode/transformers';

const model = transformers.documentQA('onnx-community/Florence-2-base-ft');

const { answer, score } = await askDocument({
  model,
  document: invoiceImage,
  question: 'What is the total amount?',
});

console.log(answer); // '$1,234.56'
console.log(score);  // 0.95
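The score can be used to filter out low-confidence answers before they reach the user. The sketch below assumes that score is a confidence value where higher is better, as the 0.95 in the example suggests. The names `ScoredAnswer` and `acceptAnswer` and the 0.5 default threshold are illustrative and are not part of @localmode/core.

```typescript
// Hypothetical shape: only the two fields destructured in the example above.
interface ScoredAnswer {
  answer: string;
  score: number;
}

// Returns the answer when its score meets the threshold, otherwise null.
// The 0.5 default is an arbitrary starting point; tune it per model.
function acceptAnswer(result: ScoredAnswer, minScore: number = 0.5): string | null {
  return result.score >= minScore ? result.answer : null;
}

const accepted = acceptAnswer({ answer: '$1,234.56', score: 0.95 });
const rejected = acceptAnswer({ answer: '$9.99', score: 0.2 });
console.log(accepted); // '$1,234.56'
console.log(rejected); // null
```

Returning null rather than an empty string lets the caller distinguish "no trustworthy answer" from an answer that happens to be empty.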

Pass an abortSignal to cancel a request, for example after a 10-second timeout:

const controller = new AbortController();

setTimeout(() => controller.abort(), 10000);

const { answer } = await askDocument({
  model,
  document: invoiceImage,
  question: 'What is the total amount?',
  abortSignal: controller.signal,
});
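The timeout pattern above leaves its timer running even when the request finishes early. A minimal sketch of a reusable wrapper that clears the timer follows. `createTimeoutSignal` and `runWithTimeout` are hypothetical helpers built only on the standard AbortController API; they are not exports of @localmode/core.

```typescript
// Hypothetical helper: pairs an AbortSignal with a way to cancel its timer.
interface TimeoutSignal {
  signal: AbortSignal;
  clear: () => void;
}

function createTimeoutSignal(ms: number): TimeoutSignal {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  return { signal: controller.signal, clear: () => clearTimeout(timer) };
}

// Runs any signal-aware task and always clears the timer afterwards,
// whether the task resolves, rejects, or is aborted.
async function runWithTimeout<T>(
  task: (signal: AbortSignal) => Promise<T>,
  ms: number,
): Promise<T> {
  const { signal, clear } = createTimeoutSignal(ms);
  try {
    return await task(signal);
  } finally {
    clear();
  }
}
```

With this in place, a call could look like runWithTimeout((signal) => askDocument({ model, document: invoiceImage, question, abortSignal: signal }), 10000).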

AskDocumentOptions

Prop	Type

AskDocumentResult

Prop	Type

askTable()

Answer a question about structured table data:

import { askTable } from '@localmode/core';

const { answer, cells, aggregator } = await askTable({
  model: tableQAModel,
  table: {
    headers: ['Product', 'Price', 'Quantity'],
    rows: [
      ['Widget A', '$10', '100'],
      ['Widget B', '$25', '50'],
      ['Widget C', '$15', '75'],
    ],
  },
  question: 'What is the most expensive product?',
});

console.log(answer);     // 'Widget B'
console.log(cells);      // ['Widget B']
console.log(aggregator); // 'NONE'
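The aggregator field names an operation to apply to the selected cells. The example only shows 'NONE'; the sketch below assumes the other labels follow the TAPAS convention (SUM, AVERAGE, COUNT), which may not match what a given model returns. `applyAggregator` and `toNumber` are hypothetical helpers, and whether answer already contains the aggregated value depends on the provider.

```typescript
// Assumed label set, borrowed from TAPAS-style table QA models.
type Aggregator = 'NONE' | 'SUM' | 'AVERAGE' | 'COUNT';

// Parses cells such as '$1,234.56' by dropping everything except digits, '.', and '-'.
function toNumber(cell: string): number {
  return Number(cell.replace(/[^0-9.-]/g, ''));
}

// Combines the selected cells according to the aggregator label.
function applyAggregator(cells: string[], aggregator: Aggregator): string {
  switch (aggregator) {
    case 'COUNT':
      return String(cells.length);
    case 'SUM':
      return String(cells.reduce((total, cell) => total + toNumber(cell), 0));
    case 'AVERAGE': {
      if (cells.length === 0) return '';
      const total = cells.reduce((sum, cell) => sum + toNumber(cell), 0);
      return String(total / cells.length);
    }
    case 'NONE':
    default:
      // No aggregation: the selected cells are the answer.
      return cells.join(', ');
  }
}

console.log(applyAggregator(['Widget B'], 'NONE'));         // 'Widget B'
console.log(applyAggregator(['$10', '$25', '$15'], 'SUM')); // '50'
console.log(applyAggregator(['100', '50', '75'], 'COUNT')); // '3'
```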

AskTableOptions

Prop	Type

AskTableResult

Prop	Type

Custom Provider

Implement the DocumentQAModel or TableQAModel interface:

import type { DocumentQAModel, DoAskDocumentOptions, DoAskDocumentResult } from '@localmode/core';

class MyDocumentQA implements DocumentQAModel {
  readonly modelId = 'custom:my-doc-qa';
  readonly provider = 'custom';

  async doAskDocument(options: DoAskDocumentOptions): Promise<DoAskDocumentResult> {
    const { document, question } = options;

    // Your document QA logic here
    return {
      answer: 'extracted answer',
      score: 0.9,
      usage: { durationMs: 0 },
    };
  }
}

A custom table QA provider follows the same pattern:

import type { TableQAModel, DoAskTableOptions, DoAskTableResult } from '@localmode/core';

class MyTableQA implements TableQAModel {
  readonly modelId = 'custom:my-table-qa';
  readonly provider = 'custom';

  async doAskTable(options: DoAskTableOptions): Promise<DoAskTableResult> {
    const { table, question } = options;

    // Your table QA logic here
    return {
      answer: 'computed answer',
      cells: ['cell1'],
      aggregator: 'NONE',
      score: 0.85,
      usage: { durationMs: 0 },
    };
  }
}
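One practical use of a custom provider is a test double that returns fixed answers without loading a model. The sketch below declares local stand-ins for the @localmode/core types so that it runs on its own. These stand-ins contain only the fields shown in the examples above, and the real interfaces may define more. `CannedDocumentQA` is a hypothetical name.

```typescript
// Local stand-ins for the @localmode/core types, limited to the fields used
// in the examples above. The real definitions may include more.
interface DoAskDocumentOptions {
  document: unknown;
  question: string;
}

interface DoAskDocumentResult {
  answer: string;
  score: number;
  usage: { durationMs: number };
}

interface DocumentQAModel {
  readonly modelId: string;
  readonly provider: string;
  doAskDocument(options: DoAskDocumentOptions): Promise<DoAskDocumentResult>;
}

// Hypothetical test double: answers come from a fixed question-to-answer map.
class CannedDocumentQA implements DocumentQAModel {
  readonly modelId = 'custom:canned-doc-qa';
  readonly provider = 'custom';
  private readonly answers: Map<string, string>;

  constructor(answers: Map<string, string>) {
    this.answers = answers;
  }

  async doAskDocument(options: DoAskDocumentOptions): Promise<DoAskDocumentResult> {
    const start = Date.now();
    const answer = this.answers.get(options.question);
    return {
      // Unknown questions yield an empty answer with a score of 0.
      answer: answer ?? '',
      score: answer === undefined ? 0 : 1,
      usage: { durationMs: Date.now() - start },
    };
  }
}
```

In a real project the three interfaces would be replaced by the type imports from @localmode/core shown earlier, and an instance could be passed as the model option of askDocument().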

For recommended models, provider-specific options, and practical recipes, see the Transformers Document QA guide.

Next Steps

Transformers Document QA

Recommended document QA models and usage.

OCR

Extract text from images.

Question Answering

Extractive QA from text context.

Showcase Apps

App	Description	Links
Invoice QA	Ask questions about document images	Demo · Source
