Introducing AgentUI: Typed React Components for LLM Agents

May 22, 2026 · 10 min read

Tags: ai, agents, llm, react, open-source, typescript, launch

Today I'm shipping the 1.0 of AgentUI — an open-source TypeScript library that gives LLM agents a typed event protocol for driving React UIs. Ten packages on npm, a documented stability surface, and three runnable starter apps.

If you've ever tried to build an agentic app and watched the model emit <table> tags into your chat window, this is for you.

The problem

Most AI chat UIs today are text bubbles in a scrollview. That works for a chatbot. It doesn't work for the apps people actually want to build on top of LLMs — support agents that surface a customer's order history, ops copilots that render a dashboard mid-conversation, internal tools that confirm a multi-step workflow before executing it. Real agentic apps need richer output: tables, cards, steppers, charts, file pickers, approval gates. Streams of plain text don't get you there.

The obvious move — "let the model write JSX" — falls apart on contact. Output is inconsistent: the same prompt produces a <div class="card"> one turn and a <section> the next. There's no validation: nothing stops the model from emitting whatever markup it feels like, including markup that breaks your layout or your design system. There's no interactivity loop back to the agent. And the security story is brutal — you're either rendering arbitrary HTML from a language model (don't), or you're trying to write a sandboxed-JSX-parser that you'll regret forever.

The agent shouldn't be writing your UI. It should be telling your UI what to show.

That's a different architecture. The model never touches the DOM. It picks from a whitelist of components you control, fills in their props with structured data validated against a schema, and your frontend renders them. The model gets expressiveness; you keep the design system, the validation boundary, and the audit trail.

The approach

AgentUI is built around two ideas: a typed event protocol that the agent emits over an SSE stream, and a whitelisted component registry that your frontend uses to render those events. There are six event families: ui.* for composing the interface (append, replace, remove, toast, navigate, reset), tool.* for surfacing long-running tool calls with streaming arguments, reasoning.* for extended-thinking traces, optimistic.* for round-trip-pending state, workflow.* for steppers, and action.* for the user's clicks coming back the other direction.
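As a rough illustration, the six prefixes listed above can be modeled as a closed TypeScript union. This is a sketch under assumptions: the `WireEvent` envelope, its `payload` field, and the `eventFamily` helper are hypothetical names chosen for this example, not AgentUI's exported types.

```typescript
// Hypothetical sketch: the prefixes come from the post, the types do not.
const EVENT_FAMILIES = ["ui", "tool", "reasoning", "optimistic", "workflow", "action"] as const;
type EventFamily = (typeof EVENT_FAMILIES)[number];

// Assumed envelope: a dotted type such as "ui.append" plus an opaque payload.
interface WireEvent {
  type: string;
  payload: unknown;
}

// Type guard: true only for one of the six known prefixes.
function isEventFamily(value: string): value is EventFamily {
  return EVENT_FAMILIES.some((family) => family === value);
}

// Returns the family of an event, or null when the prefix is unknown.
function eventFamily(event: WireEvent): EventFamily | null {
  const prefix = event.type.split(".")[0] ?? "";
  return isEventFamily(prefix) ? prefix : null;
}
```

Keeping the family list as a single `as const` array means the runtime check and the compile-time union cannot drift apart.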

Here's the flow for one round trip:

Agent (LLM)
   │  emit_ui_event (tool call)
   ▼
Server (NestJS / Node / Next.js)
   │  SSE stream (UIEvent)
   ▼
React frontend
   │  registry lookup
   ▼
Your components
   │  user click → ActionEvent
   ▼
back to the agent
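The "SSE stream (UIEvent)" hop in the diagram can be sketched with standard Server-Sent Events framing, where each message is a `data:` line followed by a blank line. The event shape below (`node.id`, `node.type`, `node.props`) and both helper names are assumptions for illustration; the real wire schema is whatever AgentUI's published JSON Schemas define.

```typescript
// Hypothetical event shape; only "ui.append" and "data-table" are names from the post.
interface UiAppendEvent {
  type: "ui.append";
  node: { id: string; type: string; props: Record<string, unknown> };
}

// Serialize one event as an SSE frame: a `data:` line plus a blank line.
// JSON.stringify never emits raw newlines, so one data line is enough.
function encodeSseFrame(event: UiAppendEvent): string {
  return `data: ${JSON.stringify(event)}\n\n`;
}

// Parse frames out of a text buffer. Assumes the buffer holds only complete,
// single-line `data:` frames; a real client must also buffer partial chunks.
function decodeSseFrames(buffer: string): unknown[] {
  return buffer
    .split("\n\n")
    .filter((frame) => frame.startsWith("data:"))
    .map((frame) => JSON.parse(frame.slice("data:".length).trim()));
}
```

The decoder returns `unknown[]` on purpose: nothing parsed off the wire should be trusted until it has passed validation.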

The agent doesn't know what data-table looks like. It knows the type "data-table" is in the registry, knows the props shape from the JSON Schema the server advertises during the session handshake, and emits an event that says "append a node of type data-table with these props." Your frontend looks up data-table in the registry, validates the props against the schema, and renders the component you wrote. The model never sees your CSS, never sees your component tree, never writes a tag.

Every event the agent emits is validated server-side before it reaches the browser. The wire is the contract.

This is what makes the system safe to deploy. The registry is a closed set — if the model hallucinates a node type that doesn't exist, the server rejects the event. The props are validated against Zod schemas (or JSON Schemas, for non-TS consumers) before rendering. And because the agent only interacts with the UI through structured events, every change is observable: you can replay a session, time-travel through state in DevTools, or persist the event log for audit.
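The closed-set check described above can be sketched without any library. Hand-written predicates stand in for the Zod or JSON Schema validation here, and `nodeValidators`, `validateNode`, and the prop shapes (`title`, `rows`) are hypothetical.

```typescript
// A validator returns true when props match the expected shape.
// Stand-in for a Zod or JSON Schema check.
type PropsValidator = (props: unknown) => boolean;

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Hypothetical closed registry: node type -> props validator.
const nodeValidators = new Map<string, PropsValidator>([
  ["info-card", (props: unknown) => isRecord(props) && typeof props.title === "string"],
  ["data-table", (props: unknown) => isRecord(props) && Array.isArray(props.rows)],
]);

type ValidationResult = { ok: true } | { ok: false; reason: string };

// Rejects unknown node types first, then props that fail the type's validator.
function validateNode(nodeType: string, props: unknown): ValidationResult {
  const validator = nodeValidators.get(nodeType);
  if (validator === undefined) {
    return { ok: false, reason: `unknown node type: ${nodeType}` };
  }
  if (!validator(props)) {
    return { ok: false, reason: `invalid props for ${nodeType}` };
  }
  return { ok: true };
}
```

With this shape, a hallucinated type such as "iframe" fails at the lookup and never reaches a renderer, which is the property the paragraph above relies on.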

Composability is the other reason this works. You don't have one giant <AIChat> component; you have per-feature slices: useToolCalls() for the tool-call stream, useWorkflow(id) for the active stepper, useOptimistic(entityKey) for pending state, and useReasoning() for the extended-thinking trace. Each slice is bounded in memory, each is independently testable, and each is something you'd write anyway if you were building these features from scratch.
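To make "bounded in memory" and "independently testable" concrete, here is one way a slice's update logic could look as a pure function. The `ToolCallRecord` shape, the `maxEntries` cap, and `appendToolCall` are assumptions; the post does not describe how AgentUI actually bounds its slices.

```typescript
// Hypothetical record for one tool call in the slice.
interface ToolCallRecord {
  id: string;
  toolName: string;
  done: boolean;
}

// Appends a call and drops the oldest entries once the slice exceeds maxEntries.
// Pure: the input array is never mutated, so it can be unit-tested without React.
function appendToolCall(
  slice: readonly ToolCallRecord[],
  call: ToolCallRecord,
  maxEntries: number,
): ToolCallRecord[] {
  const next = [...slice, call];
  return next.length > maxEntries ? next.slice(next.length - maxEntries) : next;
}
```

Because the function takes and returns plain arrays, the same logic could sit behind a hook, a reducer, or a server-side replay tool.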

Show, don't tell

A minimal app looks like this. You declare a registry of components the agent can compose, then drop <AgentRoot> somewhere near the top of your tree.

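import {
  AgentRoot,
  AgentRenderer,
  createRegistry,
} from "@kibadist/agentui-react";
// Your own components; the "./components" path is illustrative.
import { ApprovalGate, DataTable, InfoCard } from "./components";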
 
const registry = createRegistry({
  "data-table": DataTable,
  "info-card": InfoCard,
  "approval-gate": ApprovalGate,
});
 
export function App() {
  return (
    <AgentRoot endpoint="/api/agent">
      <AgentRenderer registry={registry} />
    </AgentRoot>
  );
}

That's the whole client side. The agent now has three components to choose from, and any event it emits is checked against this registry before rendering. Add a component, add a row to the registry — no agent code changes required.

On the server, if you're on Next.js App Router:

import { createAgentStream } from "@kibadist/agentui-node";
import { fromAnthropic } from "@kibadist/agentui-llm";
 
export async function POST(req: Request) {
  const stream = createAgentStream({
    llm: fromAnthropic({ model: "claude-sonnet-4-6" }),
  });
  return stream.toResponse(req);
}

There's a NestJS variant (@kibadist/agentui-nest), an Express/Fastify/Hono variant (@kibadist/agentui-node plus your framework's response object), and provider-native adapters for Anthropic, OpenAI, and Gemini via @kibadist/agentui-llm.

Here's what it looks like running end to end:

[Demo: an agent composing a table, a card, and a stepper into a chat interface. The agent emits ui.append events for a data table, an info card, and a workflow stepper; user clicks flow back as action events.]

The full live showcase is at github.com/kibadist/agentui — the three starter apps (chat-starter, support-bot, internal-tools) each run standalone with a mock SSE backend so you can clone, pnpm install, and see real events flowing in under five minutes.

What you get in 1.0

The 1.0 line is the first release with a documented stability promise. Everything in STABILITY.md is covered by semver: breaking changes require a major bump, and we ship a migration guide alongside.

What's stable:

The full wire protocol. UIEvent, ToolEvent, ReasoningEvent, OptimisticEvent, WorkflowEvent, ActionEvent, SessionMetaEvent. Their JSON Schemas live in @kibadist/agentui-validate/schema/*.json for non-TS consumers (Python, Go, anyone else).
The React hooks. useAgentStream, useAgentSelector, useAgentNodes, useToolCalls, useReasoning, useOptimistic, useWorkflow, useCapabilities, useAgentSession, useAgentAction. Option types, return shapes, and referential-stability guarantees are part of the contract.
Top-level components. <AgentRoot>, <AgentRenderer>, <WorkflowStepper>, <ToolCallStream>, the providers. Drop them in; they don't change.
Server primitives. createAgentStream, createAgentReadable, Conversation, MemoryConversationStorage, the LLM adapter helpers. Framework-agnostic on the Node side — Express, Fastify, Hono, raw node:http, Next.js Route Handlers all work with the same primitives.
Extension points. Five interfaces are designed for third-party implementations: Registry, ComponentSpec, SessionStorageAdapter, ConversationStorage, and (planned) StreamTransport. If you build on these, your code keeps working across minor and patch releases.

What's explicitly experimental: the DevTools panel UI, the AgentRootRegistry internals for multi-root hosts, and the metrics shape. Pinned to a minor, but no SLA yet — see STABILITY.md for the full list.

Migration from 0.x is a single search-and-replace. The only breaking change is the initialAgentState constant becoming a createInitialAgentState() factory — the migration guide is fifty lines and most of them are examples.
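The post does not show the state shape, so the following is only a sketch of what the search-and-replace amounts to, plus one common motivation for preferring a factory over a shared constant: each caller gets its own object. `AgentStateSketch` and its `nodes` field are invented for illustration, and the local `createInitialAgentState` is a stand-in for the library export of the same name.

```typescript
// Invented state shape, for illustration only.
interface AgentStateSketch {
  nodes: string[];
}

// Local stand-in for the 1.0 factory: every call returns a fresh object.
function createInitialAgentState(): AgentStateSketch {
  return { nodes: [] };
}

// 0.x style (before):  const state = initialAgentState;
// 1.0 style (after):   const state = createInitialAgentState();
const stateA = createInitialAgentState();
const stateB = createInitialAgentState();
stateA.nodes.push("info-card"); // mutating one instance leaves the other untouched
```

In a real migration the replacement is the import and the call site; the stand-in function exists here only so the snippet runs on its own.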

How it compares

There's a small but growing space of "let an LLM drive UI" approaches. AgentUI is opinionated about what differentiates it:

vs. building UI in-prompt (e.g. Vercel AI SDK's experimental streamUI with React server components). The in-prompt approach is elegant but couples your UI to the prompt: every component the agent can render is a function on the server, and the server renders it during the model loop. AgentUI separates them. The agent emits a typed event; your frontend renders it. That makes the registry composable across multiple frontends (a web app and a mobile app can share a backend, render different components for the same event), keeps your bundle out of the server's hot path, and gives you a wire protocol that's debuggable and replayable without standing up the model.

vs. raw function-calling for UI. Function-calling works, and a lot of teams have a half-built version of AgentUI sitting in their codebase already. What you don't get from raw function-calling: schema validation at the wire boundary, a session model with capabilities handshake, optimistic update semantics, a tool-call streaming protocol, replayable event logs, a DevTools panel that works for free. AgentUI is the productized version of what you'd build the third time around.

vs. raw HTML/JSX generation. This is the design we're explicitly rejecting. Letting the model emit markup means you're either sandboxing arbitrary HTML (security boundary you don't want), accepting whatever the model produces (consistency you don't get), or rewriting the model's output through some normalizer (complexity that grows forever). The whitelist-and-schema approach trades some expressiveness for a strict safety boundary, and in practice the expressiveness loss is zero — the model picks from your components, which are the only ones you've designed for anyway.

Safer than raw markup. More composable than in-prompt components. More debuggable than function calls.

That's the pitch in one line.

Why now

Two years ago this library would have been a curiosity. Today the agent stack has consolidated enough that "render structured agent output" is a load-bearing problem for a lot of teams.

LangGraph, Mastra, the OpenAI Agents SDK, Claude's tool use, the broader Anthropic Agents work — they all assume that an agent will run tools, produce intermediate output, and need a way to surface that to a user. The back end of that story is well-trodden now. The front end is mostly DIY: every team ships their own <ToolCallCard> component, their own optimistic-update reducer, their own session-replay logic. That's a lot of duplicated infrastructure, and most of it ends up half-done because it's not the team's actual product.

AgentUI is the front-end half of the agent stack — the side that turns tool calls into a tool-call stream, workflow events into a stepper, reasoning deltas into a collapsible trace. It composes with whatever agent framework you're already using; the LLM adapters are peer dependencies, so you can use Anthropic, OpenAI, Gemini, or write your own. It doesn't make framework choices for you.

The other reason this is shipping now is that the protocol has been load-bearing for an internal app of mine for about eighteen months. Tool calls, reasoning traces, optimistic updates, multi-step workflows, devtools — every feature in 1.0 came out of a real bug, a real performance regression, or a real user request. It's not a sketch.

Get started

pnpm add @kibadist/agentui-react @kibadist/agentui-node @kibadist/agentui-llm

The four packages most apps need are react (frontend), node (framework-agnostic server primitives), llm (provider-native adapters — Anthropic, OpenAI, Gemini), and validate (Zod schemas + JSON Schemas — installed as a dep of the others, also usable directly for non-TS servers). The full matrix is in the packages reference.

For a working app to read or fork:

git clone https://github.com/kibadist/agentui
cd agentui && pnpm install && pnpm build
pnpm --filter @kibadist/agentui-example-chat-starter dev
# open http://localhost:3010

Three starters: chat-starter (minimal Next.js + <AgentRoot>), support-bot (multi-turn agent with tool calls and a reasoning trace), internal-tools (agent embedded as a side panel in a mock CRUD app). All three run standalone with a mock SSE backend — no LLM key required to see events flowing.

Docs: kibadist.github.io/agentui — Getting Started, Concepts, Wire Protocol, fifteen guide pages, and a reference per package.

Repo: github.com/kibadist/agentui — issues and PRs welcome. Protocol-level proposals go through the rfcs/ framework.

If you build something with it, let me know — I want to see what people make.

For the engineering retrospective on getting AgentUI from a 0.3 prototype to a v1.0 OSS package — the subagent pipeline, the spec-then-plan-then-execute workflow, the things I'd do differently — see Shipping AgentUI v1: The Library Upgrade.
