AI Harness is a serious starter for TypeScript apps that need tool-using LLMs. It gives you provider abstraction, streaming chat, tool calling, MCP integration, sandboxed file output, observability, continuation for long agent runs, and a clean migration path toward a native Tauri app.
The point is to remove the repetitive LLM wiring every app ends up rebuilding: providers, streaming transport, tool schemas, prompt assembly, telemetry, preview surfaces, and guardrails around what the model is allowed to touch.
getModel() returns a configured AI SDK v6 model for Anthropic, OpenAI, Google, Ollama, or LM Studio. Swap local and cloud without rewriting your app.
Dev mode ships with shell, file read, file write, web search, and ingestion. App tools start empty so you can expose only the project-specific actions you actually want.
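A minimal sketch of what a role-based registry like this can look like; the placeholder values stand in for real AI SDK `tool()` objects, and the names here are illustrative rather than the harness's exact implementation:

```typescript
// Hypothetical sketch of the role-based registry in src/ai/tools/index.ts.
// Placeholder strings stand in for real tool() objects from the AI SDK.
type ToolMap = Record<string, unknown>;

const devTools: ToolMap = {
  shell: "privileged shell tool",
  fileRead: "privileged read, secrets blocked",
  fileWrite: "privileged write, sandboxed",
  webSearch: "Tavily / Brave search",
};

// App tools start empty: expose only the project-specific actions you want.
const appTools: ToolMap = {};

export function getToolsForRole(role: "dev" | "app"): ToolMap {
  return role === "dev" ? { ...devTools, ...appTools } : { ...appTools };
}
```

The split keeps privileged capabilities out of the app role by default, so adding a project-specific tool never accidentally grants shell access.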
MCPHost connects to multiple MCP servers, merges their tools, and hands one combined surface to the orchestration layer.
Agent file output is automatically rewritten into development/. The model can build artifacts freely without editing the harness source tree.
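The rewrite rule can be sketched as a pure path transform; this is an illustrative version, not the harness's actual `protected-paths.ts` logic:

```typescript
import path from "node:path";

// Hypothetical sketch: re-root every model-requested write under
// development/ so the harness source tree stays untouched.
const SANDBOX_ROOT = "development";

export function rewriteToSandbox(requested: string): string {
  // Normalize, then strip leading separators and ".." segments so the
  // model cannot escape the sandbox with relative-path tricks.
  const normalized = path
    .normalize(requested)
    .replace(/^(\.\.(\/|\\|$))+/, "")
    .replace(/^[/\\]+/, "");
  // Paths already inside the sandbox pass through unchanged.
  if (
    normalized === SANDBOX_ROOT ||
    normalized.startsWith(SANDBOX_ROOT + path.sep)
  ) {
    return normalized;
  }
  return path.join(SANDBOX_ROOT, normalized);
}
```

Because the rewrite is unconditional, a tool call that asks for `../src/ai/provider.ts` still lands inside `development/` instead of overwriting harness code.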
A Next.js App Router endpoint already handles model selection, prompt construction, tool selection, telemetry, and multi-step agent loops.
Each turn gets a 50-step tool-call budget. When the budget is exhausted cleanly, the UI shows a Continue button so work can resume across turns.
Langfuse traces token counts, latency, and tool chains. If Langfuse is absent, the same telemetry interface falls back to structured console logging.
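The fallback pattern can be sketched with a minimal tracer interface; the real harness wires telemetry through the AI SDK's `experimental_telemetry` option, so the names below are assumptions for illustration:

```typescript
// Hypothetical sketch: one tracer interface, two backends.
interface Tracer {
  event(name: string, data: Record<string, unknown>): void;
}

function makeTracer(
  langfuse?: { trace(name: string, data: object): void }
): Tracer {
  if (langfuse) {
    return { event: (name, data) => langfuse.trace(name, data) };
  }
  // No Langfuse keys: structured console logging behind the same
  // interface, so calling code never branches on which backend exists.
  return {
    event: (name, data) =>
      console.log(JSON.stringify({ ts: Date.now(), name, ...data })),
  };
}
```

Callers record token counts and tool-chain events identically either way, which is what makes observability additive rather than architectural.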
buildSystemPrompt() accepts application state so the model works from current records and page context instead of stale chat history alone.
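A sketch of the idea, with illustrative field names (the real builder lives in `src/ai/system-prompt.ts` and its exact state shape may differ):

```typescript
// Hypothetical sketch: serialize live app state into the system prompt
// each turn, instead of hoping chat history still reflects reality.
interface AppState {
  role: string;
  currentPage?: string;
  records?: Record<string, unknown>[];
}

function buildSystemPrompt(state: AppState): string {
  const lines = [`You are the ${state.role} assistant for this app.`];
  if (state.currentPage) {
    lines.push(`The user is currently on page: ${state.currentPage}.`);
  }
  if (state.records?.length) {
    lines.push("Current records:", JSON.stringify(state.records, null, 2));
  }
  return lines.join("\n");
}
```

Rebuilding the prompt per turn means edits the user makes outside the chat are visible to the model immediately.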
The harness is opinionated about the boring parts that matter in practice: long-running tool loops, visible tool activity, immediate preview, and keeping the chat UI stable while the model creates files.
The UI uses a persistent sidebar layout. The conversation stays on the left while generated output appears in a separate preview pane.
When the model writes HTML into development/, the preview pane auto-opens in a sandboxed iframe with reload and close controls.
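The auto-open decision reduces to a small predicate; this is a sketch under the assumption that only HTML artifacts in the sandbox should trigger the preview pane:

```typescript
// Hypothetical sketch: after a sandboxed write completes, decide whether
// the preview pane should auto-open for the written file.
function shouldAutoOpenPreview(writtenPath: string): boolean {
  return writtenPath.startsWith("development/") && /\.html?$/i.test(writtenPath);
}
```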
Each tool invocation renders inline with its name, state, and expandable arguments/result, so you can see what happened without digging through logs.
Continuation support means big tasks can span turns without losing state or forcing you to re-prompt from scratch.
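The continuation check can be sketched as: did the turn end because the budget ran out rather than because the model finished or errored? The names and finish-reason values below are illustrative, not the harness's exact API:

```typescript
// Hypothetical sketch of the Continue-button condition.
const MAX_STEPS = 50;

interface TurnResult {
  steps: number;
  finishReason: "stop" | "tool-calls" | "length" | "error";
}

function shouldOfferContinue(turn: TurnResult): boolean {
  // Budget exhausted cleanly: the model was cut off mid-task, so the UI
  // offers to resume in a fresh turn instead of forcing a re-prompt.
  return turn.steps >= MAX_STEPS && turn.finishReason !== "error";
}
```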
The baseline setup is intentionally small: one provider key and the app runs. Everything else turns on progressively through env vars or local services.
If you have one API key, you have a working app. Search, ingestion, telemetry, and local models are additive features rather than architectural rewrites.
Run `npm run dev`.

| Capability | What enables it |
|---|---|
| Anthropic | ANTHROPIC_API_KEY |
| OpenAI | OPENAI_API_KEY |
| Google Gemini | GOOGLE_API_KEY |
| Ollama | running locally at localhost:11434 |
| LM Studio | running locally at localhost:1234 |
| Web search | TAVILY_API_KEY or BRAVE_SEARCH_API_KEY |
| Web ingestion | `pip install crawl4ai` |
| Observability | LANGFUSE_SECRET_KEY + LANGFUSE_PUBLIC_KEY |
The main idea is simple: keep the reusable LLM layer concentrated in src/ai/,
and keep the UI/web framework thin. That makes it easier to evolve the product without rewriting the foundation.
```
src/ai/                    # reusable harness core
  provider.ts              # getModel()
  types.ts                 # config + shared types
  system-prompt.ts         # prompt builder + live data
  telemetry.ts             # Langfuse / console fallback
  mcp.ts                   # multi-server MCP host
  tools/
    index.ts               # role-based registry
    protected-paths.ts     # sandbox rewrite rules
    shell.ts               # privileged
    file-read.ts           # privileged, secrets blocked
    file-write.ts          # privileged, sandboxed
    web-search.ts          # Tavily / Brave
    web-ingest.ts          # Crawl4AI
src/app/                   # thin Next.js adapter
  api/chat/route.ts        # streamText orchestration
  api/preview/route.ts     # preview server
  api/providers/route.ts   # available models
  page.tsx                 # sidebar + preview layout
development/               # all generated file output
```
This is not hand-wavy "agent safety" copy. The harness already ships with concrete controls around file writes, secret access, execution boundaries, and known provider quirks.
These are controls the starter already enforces today in the Next.js version:
- All agent file output is rewritten into `development/`.
- `.env` and `.env.local` are blocked from tool reads.
- Shell commands run through `execFile()` with args arrays, not stringly shell composition.
- Messages round-trip through `convertToModelMessages()` so thought signatures survive round trips.

The codebase is intentionally staged for a stronger boundary later.
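The secret-read block can be sketched as a basename check; the real rules live in `protected-paths.ts`, so treat the patterns below as illustrative:

```typescript
// Hypothetical sketch of the file-read guard: block .env and any
// .env.* variant wherever it sits in the tree.
function isReadBlocked(relativePath: string): boolean {
  const base = relativePath.split("/").pop() ?? relativePath;
  return base === ".env" || base.startsWith(".env.");
}
```

Checking the basename rather than the full path means a model cannot dodge the rule by reading `subdir/.env.local` instead of `.env.local`.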
The starter stays useful because the core abstractions are compact. These examples cover most of what you extend first.
```typescript
// switch providers with one call
import { getModel } from '@/ai/provider';

const claude = getModel('anthropic');
const gpt = getModel('openai');
const gemini = getModel('google');
const ollama = getModel('ollama');

// override the default model when needed
const sonnet = getModel({
  provider: 'anthropic',
  model: 'claude-sonnet-4-6-20250514',
});
```
```typescript
// streaming route with tool selection + telemetry
import { streamText, convertToModelMessages } from 'ai';
import { getModel } from '@/ai/provider';
import { getToolsForRole } from '@/ai/tools';
import { buildSystemPrompt } from '@/ai/system-prompt';

const result = streamText({
  model: getModel('anthropic'),
  system: buildSystemPrompt({ role: 'dev' }),
  messages: await convertToModelMessages(messages),
  tools: getToolsForRole('dev'),
  maxSteps: 50,
  experimental_telemetry: telemetry,
});

return result.toUIMessageStreamResponse();
```
```typescript
// merge tools from multiple MCP servers
import { MCPHost } from '@/ai/mcp';

const host = new MCPHost();
await host.connect('http://localhost:3001/sse', 'server-a');
await host.connect('http://localhost:3002/sse', 'server-b');

const mcpTools = host.getTools();
const result = streamText({
  model: getModel('openai'),
  tools: { ...appTools, ...mcpTools },
});

await host.close();
```
```typescript
// add your own tool with a typed schema
import { tool } from 'ai';
import { z } from 'zod';

export const weatherTool = tool({
  description: 'Get current weather for a city',
  inputSchema: z.object({
    city: z.string().describe('City name'),
  }),
  execute: async ({ city }) => {
    // encode the city so names with spaces or unicode survive the query string
    const res = await fetch(
      `https://api.weather.example?q=${encodeURIComponent(city)}`
    );
    return res.json();
  },
});

// then register it in src/ai/tools/index.ts
```
The repository is already staged around a Tauri migration rather than pretending the web version is the end state. That matters if you want tighter trust boundaries, native secrets handling, and a desktop shell later.
Tool-call rendering, Gemini thought-signature safety, MCP host management, and trust-boundary documentation are already in place.
Rust workspace, secret commands, SQLite migrations, Tauri-aware utilities, and static-export production builds are wired.
Sandboxed writes, sidebar + preview UI, continuation, stop controls, secret blocking, and safer web ingestion are complete.
Shell, file, and MCP execution move into Rust so the privileged backend and the UI process are cleanly separated.
If you're building agentic product features in TypeScript, this gets you past the repetitive setup work and into the part that actually differentiates your app.