Agents
Build tool-using AI agents that run entirely in the browser with ReAct reasoning.
Build AI agents that use tools step-by-step to solve complex tasks — entirely in the browser. The agent framework uses the ReAct pattern (Reason-Act-Observe) with generateObject() for reliable, model-agnostic tool calling.
See it in action
Try Research Agent and LLM Chat for working demos of these APIs.
Provider-agnostic
The agent framework works with any LanguageModel — WebLLM, wllama, or a custom provider.
It uses generateObject() with Zod schemas for tool selection, not native function calling.
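In practice this means each step asks the model for a single structured action: either call a tool or finish. A plausible shape for that action object, shown here as a plain TypeScript discriminated union (an illustrative sketch, not the library's internal schema):

```typescript
// Hypothetical per-step action shape. The real internal schema may differ;
// this sketch only illustrates the tool_call-or-finish choice.
type AgentAction =
  | { type: 'tool_call'; toolName: string; args: Record<string, unknown> }
  | { type: 'finish'; result: string };

// Narrowing on the discriminant tells us which fields are present.
function describe(action: AgentAction): string {
  return action.type === 'tool_call'
    ? `call ${action.toolName}(${JSON.stringify(action.args)})`
    : `finish: ${action.result}`;
}
```

Because the action is a plain object validated against a schema, any model that can emit JSON can drive the loop, which is what makes the framework provider-agnostic.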
Quick Start
import { createAgent, runAgent, jsonSchema } from '@localmode/core';
import { webllm } from '@localmode/webllm';
import { z } from 'zod';
// Define tools with Zod schemas
const searchTool = {
name: 'search',
description: 'Search a knowledge base for relevant information',
parameters: jsonSchema(z.object({
query: z.string().describe('The search query'),
maxResults: z.number().default(5),
})),
execute: async ({ query, maxResults }) => {
// Your search implementation
return `Found ${maxResults} results for: ${query}`;
},
};
// One-shot execution
const result = await runAgent({
model: webllm.languageModel('Qwen3-1.7B-q4f16_1-MLC'),
tools: [searchTool],
prompt: 'What are the benefits of quantum computing?',
maxSteps: 10,
});
console.log(result.result); // Final answer
console.log(result.steps.length); // Number of steps taken
console.log(result.finishReason); // 'finish' | 'max_steps' | etc.

How It Works
The agent uses the ReAct (Reason-Act-Observe) loop:
Build prompt — System instructions + tool descriptions + conversation history
Generate action — Call generateObject() with a schema for tool_call or finish
If tool_call — Validate arguments, execute the tool, capture observation
If finish — Return the final answer
Repeat — Add the step to history and go back to step 1
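The steps above can be condensed into a short loop. This is a simplified sketch: `chooseAction` stands in for the structured `generateObject()` call, and the history is a plain string array rather than the real prompt builder.

```typescript
type Action =
  | { type: 'tool_call'; toolName: string; args: Record<string, unknown> }
  | { type: 'finish'; result: string };

type Tool = {
  name: string;
  execute: (args: Record<string, unknown>) => Promise<string>;
};

// Simplified ReAct loop: reason, act, observe, repeat until finish or maxSteps.
async function reactLoop(
  chooseAction: (history: string[]) => Promise<Action>,
  tools: Tool[],
  maxSteps: number,
): Promise<{ result: string; finishReason: 'finish' | 'max_steps' }> {
  const history: string[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const action = await chooseAction(history);           // Reason
    if (action.type === 'finish') {
      return { result: action.result, finishReason: 'finish' };
    }
    const tool = tools.find((t) => t.name === action.toolName);
    const observation = tool
      ? await tool.execute(action.args)                   // Act
      : `Unknown tool: ${action.toolName}`;
    history.push(`${action.toolName} -> ${observation}`); // Observe
  }
  return { result: '', finishReason: 'max_steps' };
}
```

Keeping the loop this small is what lets the safety guards below (step limits, timeouts, loop detection) wrap it cleanly.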
Defining Tools
Tools are defined with a name, description, parameter schema, and execute function:
import { jsonSchema } from '@localmode/core';
import { z } from 'zod';
const calculatorTool = {
name: 'calculate',
description: 'Evaluate a mathematical expression',
parameters: jsonSchema(z.object({
expression: z.string().describe('Math expression like "2 + 2"'),
})),
execute: async ({ expression }, { abortSignal, stepIndex }) => {
// The context provides AbortSignal and current step index
abortSignal.throwIfAborted();
// Demo only: eval() is unsafe for untrusted input
return String(eval(expression));
},
};

ToolExecutionContext
Every tool receives a context with:
| Field | Type | Description |
|---|---|---|
| abortSignal | AbortSignal | For cancellation |
| stepIndex | number | Current step number (zero-based) |
createAgent()
Create a reusable agent that can be run multiple times:
const agent = createAgent({
model: webllm.languageModel('Qwen3-1.7B-q4f16_1-MLC'),
tools: [searchTool, noteTool, calculateTool],
systemPrompt: 'You are a research assistant. Always search before answering.',
maxSteps: 10,
temperature: 0,
});
// Run multiple times — each run is independent
const result1 = await agent.run({ prompt: 'Research quantum computing' });
const result2 = await agent.run({ prompt: 'Research machine learning' });

AgentConfig
| Option | Type | Default | Description |
|---|---|---|---|
| model | LanguageModel | required | The language model for reasoning |
| tools | ToolDefinition[] | required | Available tools |
| systemPrompt | string | — | System prompt prepended to the agent prompt |
| maxSteps | number | 10 | Maximum ReAct loop iterations |
| maxDurationMs | number | — | Maximum total duration (ms) |
| maxRetries | number | 3 | Retries per generateObject() call |
| temperature | number | 0 | Sampling temperature |
| memory | AgentMemory | — | Optional conversation memory |
| onStep | (step) => void | — | Callback after each step |
AgentRunOptions
| Option | Type | Description |
|---|---|---|
| prompt | string | The user's task/question |
| abortSignal | AbortSignal | For cancellation |
| onStep | (step) => void | Per-run callback (overrides config) |
| context | string | Additional context for the prompt |
runAgent()
One-shot convenience function — creates and runs an agent in one call:
const result = await runAgent({
model,
tools: [searchTool],
prompt: 'Find info about X',
maxSteps: 5,
onStep: (step) => console.log(`Step ${step.index}: ${step.type}`),
});

Safety Guards
The agent enforces multiple safety mechanisms to prevent runaway execution:
Max Steps
const result = await runAgent({ model, tools, prompt, maxSteps: 5 });
// result.finishReason === 'max_steps' if limit reached

Timeout
const result = await runAgent({ model, tools, prompt, maxDurationMs: 30000 });
// result.finishReason === 'timeout' if duration exceeded

Loop Detection
If the model produces the same tool call (same name + identical args) on consecutive steps:
- First duplicate: a hint is injected telling the model to try a different approach
- Second consecutive duplicate: the agent terminates with finishReason: 'loop_detected'
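One way to picture this check (a sketch of the idea, not the library's code) is to compare each tool call against its predecessors by name plus serialized arguments:

```typescript
type ToolCall = { toolName: string; args: Record<string, unknown> };

// Counts how many times in a row the latest call has been repeated.
// Note: JSON.stringify is key-order-sensitive, so a robust version
// would canonicalize argument keys before comparing.
function consecutiveRepeats(calls: ToolCall[]): number {
  if (calls.length === 0) return 0;
  const key = (c: ToolCall) => `${c.toolName}:${JSON.stringify(c.args)}`;
  const last = key(calls[calls.length - 1]);
  let repeats = 0;
  for (let i = calls.length - 2; i >= 0; i--) {
    if (key(calls[i]) !== last) break;
    repeats++;
  }
  return repeats;
}

// repeats === 1 -> inject a hint; repeats >= 2 -> stop with 'loop_detected'
```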
AbortSignal
const controller = new AbortController();
// Cancel after 10 seconds
setTimeout(() => controller.abort(), 10000);
try {
const result = await runAgent({
model, tools, prompt,
abortSignal: controller.signal,
});
} catch (error) {
// AbortError thrown on cancellation
}

AgentResult
Every agent run returns a structured result:
| Field | Type | Description |
|---|---|---|
| result | string | Final answer (empty if terminated by a guard) |
| steps | AgentStep[] | All steps executed |
| finishReason | AgentFinishReason | Why the agent stopped |
| totalDurationMs | number | Total wall-clock time |
| totalUsage | GenerationUsage | Accumulated token usage |
AgentFinishReason
| Value | Description |
|---|---|
| 'finish' | Model provided a final answer |
| 'max_steps' | Reached the maxSteps limit |
| 'timeout' | Exceeded maxDurationMs |
| 'loop_detected' | Repeated identical tool calls |
| 'aborted' | Cancelled via AbortSignal |
| 'error' | Unrecoverable error |
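When surfacing results to users, it can help to branch on the finish reason. A minimal sketch; the message strings are our own, not part of the library:

```typescript
type AgentFinishReason =
  | 'finish' | 'max_steps' | 'timeout' | 'loop_detected' | 'aborted' | 'error';

// Map each finish reason to a user-facing message (wording is illustrative).
function explainFinish(reason: AgentFinishReason, answer: string): string {
  switch (reason) {
    case 'finish':        return answer;
    case 'max_steps':     return 'Stopped: step limit reached before an answer.';
    case 'timeout':       return 'Stopped: time budget exhausted.';
    case 'loop_detected': return 'Stopped: the agent repeated itself.';
    case 'aborted':       return 'Cancelled by the caller.';
    case 'error':         return 'Failed with an unrecoverable error.';
  }
}
```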
AgentStep
| Field | Type | Description |
|---|---|---|
| index | number | Zero-based step number |
| type | 'tool_call' \| 'finish' | Step type |
| toolName | string? | Tool called (tool_call only) |
| toolArgs | Record? | Tool arguments |
| observation | string? | Tool result or error |
| result | string? | Final answer (finish only) |
| durationMs | number | Step duration (ms) |
| usage | GenerationUsage? | Token usage |
Step Callbacks
Monitor agent progress in real-time:
const result = await runAgent({
model, tools, prompt,
onStep: (step) => {
if (step.type === 'tool_call') {
console.log(`Called ${step.toolName} with`, step.toolArgs);
console.log(`Result: ${step.observation}`);
} else {
console.log(`Final answer: ${step.result}`);
}
console.log(`Duration: ${step.durationMs}ms`);
},
});

Agent Memory
Optional VectorDB-backed conversation memory enables agents to recall past interactions:
import { createAgentMemory, createAgent } from '@localmode/core';
import { transformers } from '@localmode/transformers';
const memory = await createAgentMemory({
embeddingModel: transformers.embedding('Xenova/bge-small-en-v1.5'),
maxEntries: 500,
});
const agent = createAgent({ model, tools, memory });
// First run — memory is empty
await agent.run({ prompt: 'What is quantum computing?' });
// Second run — memory contains relevant context from first run
await agent.run({ prompt: 'How does it relate to cryptography?' });
// Cleanup
await memory.close();

How Memory Works
- Before the first step: Relevant memories are retrieved using the prompt as query
- Injected as context: Retrieved memories appear in the agent prompt
- After completion: The user's prompt and final result are stored for future retrieval
Memory is optional — agents work without it. It is useful for multi-turn conversations where context from earlier interactions improves later answers.
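The retrieval step can be pictured as a nearest-neighbour lookup over stored embeddings. A toy sketch with hand-made vectors; in the real framework, embeddings come from the configured EmbeddingModel and storage is the VectorDB:

```typescript
type MemoryEntry = { text: string; vector: number[] };

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Return the k stored entries most similar to the query vector.
function retrieve(entries: MemoryEntry[], query: number[], k: number): string[] {
  return [...entries]
    .sort((x, y) => cosine(y.vector, query) - cosine(x.vector, query))
    .slice(0, k)
    .map((e) => e.text);
}
```

The retrieved texts are what get injected as context before the agent's first step.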
AgentMemoryConfig
| Option | Type | Default | Description |
|---|---|---|---|
| embeddingModel | EmbeddingModel | required | Model for embedding entries |
| name | string | 'agent-memory' | VectorDB collection name |
| dimensions | number | 384 | Embedding dimensions |
| maxEntries | number | 1000 | Max entries before eviction |
useAgent() React Hook
The useAgent() hook from @localmode/react provides real-time step streaming:
import { useAgent } from '@localmode/react';
function ResearchAgent() {
const { steps, result, isRunning, error, run, cancel, reset } = useAgent({
model: webllm.languageModel('Qwen3-1.7B-q4f16_1-MLC'),
tools: [searchTool, noteTool],
maxSteps: 10,
});
return (
<div>
<button onClick={() => run('Research quantum computing')}>
Start Research
</button>
{isRunning && <button onClick={cancel}>Stop</button>}
{/* Steps update in real-time */}
{steps.map(step => (
<div key={step.index}>
Step {step.index + 1}: {step.type === 'tool_call'
? `Called ${step.toolName}`
: `Finished: ${step.result}`
}
</div>
))}
{result && (
<div>
<h3>Answer</h3>
<p>{result.result}</p>
<p>Completed in {result.totalDurationMs}ms</p>
</div>
)}
{error && <p>Error: {error.message}</p>}
<button onClick={reset}>Reset</button>
</div>
);
}

UseAgentReturn
| Field | Type | Description |
|---|---|---|
| steps | AgentStep[] | Steps updated in real-time |
| result | AgentResult \| null | Final result |
| isRunning | boolean | Whether the agent is executing |
| error | Error \| null | Error if the run failed |
| run | (prompt, context?) => Promise | Start the agent |
| cancel | () => void | Abort the current run |
| reset | () => void | Clear all state |
Error Handling
Tool Errors Become Observations
If a tool throws, the error message becomes the observation for that step. The model can then decide to retry with different arguments or use a different tool:
const unreliableTool = {
name: 'fetch',
description: 'Fetch data from URL',
parameters: jsonSchema(z.object({ url: z.string() })),
execute: async ({ url }) => {
// The thrown message becomes the observation: "Error: Network timeout".
// The model can then try a different URL or a different tool.
throw new Error('Network timeout');
},
};

AgentError
Thrown when the agent encounters an unrecoverable error (e.g., model cannot produce valid JSON after all retries):
import { AgentError } from '@localmode/core';
try {
await runAgent({ model, tools, prompt });
} catch (error) {
if (error instanceof AgentError) {
console.log('Steps completed:', error.steps.length);
console.log('Hint:', error.hint);
}
}

Recommended Models
| Model | Size | Tool Calling Quality | Use Case |
|---|---|---|---|
| Qwen3 8B | 4.4GB | Excellent (0.933 F1) | Complex multi-tool tasks |
| Qwen3 1.7B | 1.1GB | Good | Simple tool tasks, demos |
| Phi 3.5 Mini | 2.1GB | Good | General purpose |
| Llama 3.2 1B | 712MB | Basic | Single-tool tasks |
Model recommendation
For reliable tool calling, use Qwen3 1.7B or larger. Smaller models may struggle with JSON output format and multi-step reasoning.
Tool Registry
For advanced use cases, create a tool registry directly:
import { createToolRegistry } from '@localmode/core';
const registry = createToolRegistry([searchTool, noteTool]);
registry.has('search'); // true
registry.names(); // ['search', 'note']
registry.descriptions(); // [{ name, description, parameters }]
const validated = registry.validate('search', { query: 'test' });
const result = await registry.execute('search', { query: 'test' }, context);