Agents

Build AI agents that use tools step-by-step to solve complex tasks — entirely in the browser. The agent framework uses the ReAct pattern (Reason-Act-Observe) with generateObject() for reliable, model-agnostic tool calling.

See it in action

Try Research Agent and LLM Chat for working demos of these APIs.

Provider-agnostic

The agent framework works with any LanguageModel — WebLLM, wllama, or a custom provider. It uses generateObject() with Zod schemas for tool selection, not native function calling.

Quick Start

import { createAgent, runAgent, jsonSchema } from '@localmode/core';
import { webllm } from '@localmode/webllm';
import { z } from 'zod';

// Define tools with Zod schemas
const searchTool = {
  name: 'search',
  description: 'Search a knowledge base for relevant information',
  parameters: jsonSchema(z.object({
    query: z.string().describe('The search query'),
    maxResults: z.number().default(5),
  })),
  execute: async ({ query, maxResults }) => {
    // Your search implementation
    return `Found ${maxResults} results for: ${query}`;
  },
};

// One-shot execution
const result = await runAgent({
  model: webllm.languageModel('Qwen3-1.7B-q4f16_1-MLC'),
  tools: [searchTool],
  prompt: 'What are the benefits of quantum computing?',
  maxSteps: 10,
});

console.log(result.result);       // Final answer
console.log(result.steps.length); // Number of steps taken
console.log(result.finishReason); // 'finish' | 'max_steps' | etc.

How It Works

The agent uses the ReAct (Reason-Act-Observe) loop:

  1. Build prompt — System instructions + tool descriptions + conversation history
  2. Generate action — Call generateObject() with a schema for tool_call or finish
  3. If tool_call — Validate arguments, execute the tool, capture observation
  4. If finish — Return the final answer
  5. Repeat — Add the step to history and go back to step 1
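The loop above can be sketched in a few lines of TypeScript. This is illustrative only, not the library's actual source: `generateAction` stands in for the `generateObject()` call and the real prompt construction.

```typescript
// Simplified sketch of the ReAct loop. The model is stubbed as a function
// that returns either a tool call or a final answer.
type Action =
  | { type: 'tool_call'; toolName: string; args: Record<string, unknown> }
  | { type: 'finish'; result: string };

type Tool = {
  name: string;
  execute: (args: Record<string, unknown>) => Promise<string>;
};

async function reactLoop(
  generateAction: (history: string[]) => Promise<Action>, // stands in for generateObject()
  tools: Tool[],
  prompt: string,
  maxSteps = 10,
): Promise<{ result: string; steps: number; finishReason: string }> {
  const history: string[] = [`Task: ${prompt}`];
  for (let step = 0; step < maxSteps; step++) {
    // Build prompt from instructions + history, then generate an action
    const action = await generateAction(history);
    if (action.type === 'finish') {
      return { result: action.result, steps: step + 1, finishReason: 'finish' };
    }
    // Execute the tool and record the observation in the history
    const tool = tools.find((t) => t.name === action.toolName);
    const observation = tool
      ? await tool.execute(action.args)
      : `Error: unknown tool "${action.toolName}"`;
    history.push(`Called ${action.toolName}: ${observation}`);
  }
  return { result: '', steps: maxSteps, finishReason: 'max_steps' };
}
```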

Defining Tools

Tools are defined with a name, description, parameter schema, and execute function:

import { jsonSchema } from '@localmode/core';
import { z } from 'zod';

const calculatorTool = {
  name: 'calculate',
  description: 'Evaluate a mathematical expression',
  parameters: jsonSchema(z.object({
    expression: z.string().describe('Math expression like "2 + 2"'),
  })),
  execute: async ({ expression }, { abortSignal, stepIndex }) => {
    // The context provides an AbortSignal and the current step index
    abortSignal.throwIfAborted();
    // Note: eval() is used here only for demonstration — use a proper
    // expression parser in production code
    return String(eval(expression));
  },
};

ToolExecutionContext

Every tool receives a context with:

| Field       | Type        | Description                      |
| ----------- | ----------- | -------------------------------- |
| abortSignal | AbortSignal | For cancellation                 |
| stepIndex   | number      | Current step number (zero-based) |
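Because tools are plain objects, they can be unit-tested outside the agent loop by calling `execute()` directly with a mock context. A sketch (the `logTool` below is a hypothetical example, not part of the library):

```typescript
// A tool whose execute() uses both context fields, invoked directly
// with a hand-built context object.
type ToolExecutionContext = { abortSignal: AbortSignal; stepIndex: number };

const logTool = {
  name: 'log',
  description: 'Record a message, tagged with the current step',
  execute: async (
    { message }: { message: string },
    { abortSignal, stepIndex }: ToolExecutionContext,
  ): Promise<string> => {
    abortSignal.throwIfAborted(); // bail out early if the run was cancelled
    return `[step ${stepIndex}] ${message}`;
  },
};

// Direct invocation with a mock context:
const ctx: ToolExecutionContext = {
  abortSignal: new AbortController().signal,
  stepIndex: 2,
};
// await logTool.execute({ message: 'hi' }, ctx) → '[step 2] hi'
```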

createAgent()

Create a reusable agent that can be run multiple times:

const agent = createAgent({
  model: webllm.languageModel('Qwen3-1.7B-q4f16_1-MLC'),
  tools: [searchTool, noteTool, calculateTool],
  systemPrompt: 'You are a research assistant. Always search before answering.',
  maxSteps: 10,
  temperature: 0,
});

// Run multiple times — each run is independent
const result1 = await agent.run({ prompt: 'Research quantum computing' });
const result2 = await agent.run({ prompt: 'Research machine learning' });

AgentConfig

| Option        | Type             | Default  | Description                       |
| ------------- | ---------------- | -------- | --------------------------------- |
| model         | LanguageModel    | required | The language model for reasoning  |
| tools         | ToolDefinition[] | required | Available tools                   |
| systemPrompt  | string           | —        | System prompt prepended to agent prompt |
| maxSteps      | number           | 10       | Maximum ReAct loop iterations     |
| maxDurationMs | number           | —        | Maximum total duration (ms)       |
| maxRetries    | number           | 3        | Retries per generateObject() call |
| temperature   | number           | 0        | Sampling temperature              |
| memory        | AgentMemory      | —        | Optional conversation memory      |
| onStep        | (step) => void   | —        | Callback after each step          |

AgentRunOptions

| Option      | Type           | Description                         |
| ----------- | -------------- | ----------------------------------- |
| prompt      | string         | The user's task/question            |
| abortSignal | AbortSignal    | For cancellation                    |
| onStep      | (step) => void | Per-run callback (overrides config) |
| context     | string         | Additional context for the prompt   |

runAgent()

One-shot convenience function — creates and runs an agent in one call:

const result = await runAgent({
  model,
  tools: [searchTool],
  prompt: 'Find info about X',
  maxSteps: 5,
  onStep: (step) => console.log(`Step ${step.index}: ${step.type}`),
});

Safety Guards

The agent enforces multiple safety mechanisms to prevent runaway execution:

Max Steps

const result = await runAgent({ model, tools, prompt, maxSteps: 5 });
// result.finishReason === 'max_steps' if limit reached

Timeout

const result = await runAgent({ model, tools, prompt, maxDurationMs: 30000 });
// result.finishReason === 'timeout' if duration exceeded

Loop Detection

If the model produces the same tool call (same name + identical args) on consecutive steps:

  1. First duplicate: A hint is injected telling the model to try a different approach
  2. Second consecutive duplicate: Agent terminates with finishReason: 'loop_detected'
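The duplicate check can be sketched as follows. This is assumed logic based on the description above, not the library's actual source:

```typescript
// Consecutive-duplicate detector: the first repeat of a tool call yields a
// hint, a second consecutive repeat terminates the run.
type ToolCall = { toolName: string; args: Record<string, unknown> };

function makeLoopDetector() {
  let last: string | null = null;
  let repeats = 0;
  return (call: ToolCall): 'ok' | 'hint' | 'terminate' => {
    // Same name + identical args → same key
    const key = `${call.toolName}:${JSON.stringify(call.args)}`;
    if (key === last) {
      repeats += 1;
    } else {
      last = key;
      repeats = 0;
    }
    if (repeats === 0) return 'ok';
    if (repeats === 1) return 'hint'; // inject "try a different approach"
    return 'terminate';               // finishReason: 'loop_detected'
  };
}
```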

AbortSignal

const controller = new AbortController();

// Cancel after 10 seconds
setTimeout(() => controller.abort(), 10000);

try {
  const result = await runAgent({
    model, tools, prompt,
    abortSignal: controller.signal,
  });
} catch (error) {
  // AbortError thrown on cancellation
}

AgentResult

Every agent run returns a structured result:

| Field           | Type              | Description                                |
| --------------- | ----------------- | ------------------------------------------ |
| result          | string            | Final answer (empty if terminated by guard) |
| steps           | AgentStep[]       | All steps executed                         |
| finishReason    | AgentFinishReason | Why the agent stopped                      |
| totalDurationMs | number            | Total wall-clock time                      |
| totalUsage      | GenerationUsage   | Accumulated token usage                    |

AgentFinishReason

| Value           | Description                   |
| --------------- | ----------------------------- |
| 'finish'        | Model provided a final answer |
| 'max_steps'     | Reached maxSteps limit        |
| 'timeout'       | Exceeded maxDurationMs        |
| 'loop_detected' | Repeated identical tool calls |
| 'aborted'       | Cancelled via AbortSignal     |
| 'error'         | Unrecoverable error           |
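One way to surface these values in a UI is an exhaustive switch, which keeps the mapping in sync if the union ever grows. The helper name and messages below are illustrative:

```typescript
// Map each finish reason to a user-facing message. With TypeScript's
// exhaustiveness checking, a new union member becomes a compile error here.
type AgentFinishReason =
  | 'finish' | 'max_steps' | 'timeout' | 'loop_detected' | 'aborted' | 'error';

function describeFinishReason(reason: AgentFinishReason): string {
  switch (reason) {
    case 'finish':        return 'Completed with a final answer.';
    case 'max_steps':     return 'Stopped: step limit reached.';
    case 'timeout':       return 'Stopped: time limit exceeded.';
    case 'loop_detected': return 'Stopped: the model kept repeating the same tool call.';
    case 'aborted':       return 'Cancelled by the caller.';
    case 'error':         return 'Failed with an unrecoverable error.';
  }
}
```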

AgentStep

| Field       | Type                     | Description                  |
| ----------- | ------------------------ | ---------------------------- |
| index       | number                   | Zero-based step number       |
| type        | 'tool_call' \| 'finish'  | Step type                    |
| toolName    | string?                  | Tool called (tool_call only) |
| toolArgs    | Record?                  | Tool arguments               |
| observation | string?                  | Tool result or error         |
| result      | string?                  | Final answer (finish only)   |
| durationMs  | number                   | Step duration (ms)           |
| usage       | GenerationUsage?         | Token usage                  |

Step Callbacks

Monitor agent progress in real-time:

const result = await runAgent({
  model, tools, prompt,
  onStep: (step) => {
    if (step.type === 'tool_call') {
      console.log(`Called ${step.toolName} with`, step.toolArgs);
      console.log(`Result: ${step.observation}`);
    } else {
      console.log(`Final answer: ${step.result}`);
    }
    console.log(`Duration: ${step.durationMs}ms`);
  },
});

Agent Memory

Optional VectorDB-backed conversation memory enables agents to recall past interactions:

import { createAgentMemory, createAgent } from '@localmode/core';
import { transformers } from '@localmode/transformers';

const memory = await createAgentMemory({
  embeddingModel: transformers.embedding('Xenova/bge-small-en-v1.5'),
  maxEntries: 500,
});

const agent = createAgent({ model, tools, memory });

// First run — memory is empty
await agent.run({ prompt: 'What is quantum computing?' });

// Second run — memory contains relevant context from first run
await agent.run({ prompt: 'How does it relate to cryptography?' });

// Cleanup
await memory.close();

How Memory Works

  1. Before the first step: Relevant memories are retrieved using the prompt as query
  2. Injected as context: Retrieved memories appear in the agent prompt
  3. After completion: The user's prompt and final result are stored for future retrieval

Memory is optional — agents work without it. It is useful for multi-turn conversations where context from earlier interactions improves later answers.
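The retrieve → inject → store cycle can be sketched with a toy in-memory store. The real implementation uses VectorDB with an embedding model; the keyword scoring below is a stand-in for vector similarity:

```typescript
// Toy memory: keyword overlap instead of embedding similarity.
type MemoryEntry = { prompt: string; result: string };

function createToyMemory() {
  const entries: MemoryEntry[] = [];
  return {
    // Before the first step: retrieve entries relevant to the query
    retrieve(query: string, k = 3): MemoryEntry[] {
      const words = query.toLowerCase().split(/\s+/);
      return entries
        .map((e) => ({
          entry: e,
          score: words.filter((w) => e.prompt.toLowerCase().includes(w)).length,
        }))
        .filter((s) => s.score > 0)
        .sort((a, b) => b.score - a.score)
        .slice(0, k)
        .map((s) => s.entry);
    },
    // After completion: store the prompt and final result
    store(prompt: string, result: string) {
      entries.push({ prompt, result });
    },
  };
}

const memory = createToyMemory();
memory.store('What is quantum computing?', 'Computing with qubits…');
const recalled = memory.retrieve('quantum cryptography');
// recalled[0].prompt === 'What is quantum computing?'
```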

AgentMemoryConfig

| Option         | Type           | Default        | Description                 |
| -------------- | -------------- | -------------- | --------------------------- |
| embeddingModel | EmbeddingModel | required       | Model for embedding entries |
| name           | string         | 'agent-memory' | VectorDB collection name    |
| dimensions     | number         | 384            | Embedding dimensions        |
| maxEntries     | number         | 1000           | Max entries before eviction |

useAgent() React Hook

The useAgent() hook from @localmode/react provides real-time step streaming:

import { useAgent } from '@localmode/react';

function ResearchAgent() {
  const { steps, result, isRunning, error, run, cancel, reset } = useAgent({
    model: webllm.languageModel('Qwen3-1.7B-q4f16_1-MLC'),
    tools: [searchTool, noteTool],
    maxSteps: 10,
  });

  return (
    <div>
      <button onClick={() => run('Research quantum computing')}>
        Start Research
      </button>

      {isRunning && <button onClick={cancel}>Stop</button>}

      {/* Steps update in real-time */}
      {steps.map(step => (
        <div key={step.index}>
          Step {step.index + 1}: {step.type === 'tool_call'
            ? `Called ${step.toolName}`
            : `Finished: ${step.result}`
          }
        </div>
      ))}

      {result && (
        <div>
          <h3>Answer</h3>
          <p>{result.result}</p>
          <p>Completed in {result.totalDurationMs}ms</p>
        </div>
      )}

      {error && <p>Error: {error.message}</p>}

      <button onClick={reset}>Reset</button>
    </div>
  );
}

UseAgentReturn

| Field     | Type                        | Description                  |
| --------- | --------------------------- | ---------------------------- |
| steps     | AgentStep[]                 | Steps updated in real-time   |
| result    | AgentResult \| null         | Final result                 |
| isRunning | boolean                     | Whether agent is executing   |
| error     | Error \| null               | Error if failed              |
| run       | (prompt, context?) => Promise | Start the agent            |
| cancel    | () => void                  | Abort current run            |
| reset     | () => void                  | Clear all state              |

Error Handling

Tool Errors Become Observations

If a tool throws, the error message becomes the observation for that step. The model can then decide to retry with different arguments or use a different tool:

const unreliableTool = {
  name: 'fetch',
  description: 'Fetch data from URL',
  parameters: jsonSchema(z.object({ url: z.string() })),
  execute: async ({ url }) => {
    throw new Error('Network timeout');
    // This becomes observation: "Error: Network timeout"
    // The model can try a different URL or a different tool
  },
};
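The conversion from a thrown error to an observation string is assumed to work roughly like this (names are illustrative, not the library's actual source):

```typescript
// Wrap a tool's execute() so a throw becomes an observation string that is
// fed back to the model instead of aborting the run.
async function runToolSafely(execute: () => Promise<string>): Promise<string> {
  try {
    return await execute();
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return `Error: ${message}`; // becomes the step's observation
  }
}
```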

AgentError

Thrown when the agent encounters an unrecoverable error (e.g., model cannot produce valid JSON after all retries):

import { AgentError } from '@localmode/core';

try {
  await runAgent({ model, tools, prompt });
} catch (error) {
  if (error instanceof AgentError) {
    console.log('Steps completed:', error.steps.length);
    console.log('Hint:', error.hint);
  }
}

Model Recommendations

| Model        | Size  | Tool Calling Quality | Use Case                 |
| ------------ | ----- | -------------------- | ------------------------ |
| Qwen3 8B     | 4.4GB | Excellent (0.933 F1) | Complex multi-tool tasks |
| Qwen3 1.7B   | 1.1GB | Good                 | Simple tool tasks, demos |
| Phi 3.5 Mini | 2.1GB | Good                 | General purpose          |
| Llama 3.2 1B | 712MB | Basic                | Single-tool tasks        |

Model recommendation

For reliable tool calling, use Qwen3 1.7B or larger. Smaller models may struggle with JSON output format and multi-step reasoning.

Tool Registry

For advanced use cases, create a tool registry directly:

import { createToolRegistry } from '@localmode/core';

const registry = createToolRegistry([searchTool, noteTool]);

registry.has('search');    // true
registry.names();          // ['search', 'note']
registry.descriptions();   // [{ name, description, parameters }]

const validated = registry.validate('search', { query: 'test' });
const result = await registry.execute('search', { query: 'test' }, context);
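The registry surface can be approximated in a few lines. This is an illustrative re-implementation for intuition only: the real createToolRegistry also validates arguments against each tool's Zod schema, which is omitted here.

```typescript
// Minimal name → tool lookup with has/names/execute, no schema validation.
type RegistryTool = {
  name: string;
  description: string;
  execute: (args: Record<string, unknown>, context: unknown) => Promise<string>;
};

function makeRegistry(tools: RegistryTool[]) {
  const byName = new Map(tools.map((t) => [t.name, t] as const));
  return {
    has: (name: string) => byName.has(name),
    names: () => [...byName.keys()],
    execute: async (
      name: string,
      args: Record<string, unknown>,
      context: unknown,
    ): Promise<string> => {
      const tool = byName.get(name);
      if (!tool) throw new Error(`Unknown tool: ${name}`);
      return tool.execute(args, context);
    },
  };
}
```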

Showcase Apps

| App            | Description                                    | Links         |
| -------------- | ---------------------------------------------- | ------------- |
| Research Agent | Multi-step ReAct agent with tool use           | Demo · Source |
| LLM Chat       | Agent mode with tool calling in chat interface | Demo · Source |
