mech.app

Agent-Native: How Builder.io's Framework Unifies UI and Agent State in One Database

Builder.io's Agent-Native framework keeps UI and agent actions synchronized in real time using shared SQL state and CRDT merging.

Source: github.com

Most agent frameworks treat the UI as a separate layer. You build a chat interface, the agent runs in the background, and you pipe updates through WebSockets or polling. Builder.io’s Agent-Native framework takes a different path: the agent and the UI share the same database and the same state. When the agent edits a document, the user sees it in real time. When the user selects text and hits Cmd+I, the agent knows exactly what to modify.

This is not a chat wrapper around an application. It is a framework where agents and humans operate as peer editors on the same data structures, using CRDT merging to resolve conflicts and SQL to persist everything.

The Core Architecture

Agent-Native apps run on three primitives:

  • Shared state: One database, one source of truth. Both UI and agent read and write to the same SQL tables.
  • CRDT merging: Simultaneous edits from humans and agents merge automatically instead of one overwriting the other (no last-write-wins).
  • Per-user workspace: Each user gets their own skills, memory, instructions, sub-agents, and MCP servers, all stored in SQL.
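A minimal sketch of the shared-state primitive, assuming an in-memory map in place of the SQL tables. `SharedStore`, `Change`, and the `write`/`subscribe` methods are hypothetical names chosen for illustration, not Agent-Native APIs.

```typescript
// Hypothetical in-memory stand-in for the shared SQL tables.
type Writer = "ui" | "agent";

interface Change {
  key: string;
  value: string;
  writer: Writer;
}

class SharedStore {
  private rows = new Map<string, string>();
  private listeners: Array<(change: Change) => void> = [];

  // Both the UI and the agent write through this one method.
  write(writer: Writer, key: string, value: string): void {
    this.rows.set(key, value);
    const change: Change = { key, value, writer };
    for (const listener of this.listeners) listener(change);
  }

  read(key: string): string | undefined {
    return this.rows.get(key);
  }

  // Any view (or the agent) observes every write, whoever made it.
  subscribe(listener: (change: Change) => void): void {
    this.listeners.push(listener);
  }
}

const store = new SharedStore();
const seenByUi: Change[] = [];
store.subscribe((change) => seenByUi.push(change));

store.write("agent", "doc:1:title", "Q3 report"); // agent edit
store.write("ui", "doc:1:title", "Q3 report (v2)"); // human edit
```

Because there is a single store, the UI never has to reconcile a separate copy of the agent’s state: it simply re-renders from the same rows the agent wrote.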

The framework is backend-agnostic. It supports any SQL database that Drizzle ORM supports (Postgres, MySQL, SQLite, Turso, Neon, PlanetScale). It supports any hosting target that Nitro supports (Vercel, Cloudflare Workers, AWS Lambda, Node.js, Deno, Bun). You are not locked into a specific runtime or cloud provider.

How CRDT Merging Works

When a human and an agent edit the same document at the same time, Agent-Native uses CRDT (Conflict-free Replicated Data Type) logic to merge changes. Each edit is timestamped and attributed to an actor (human or agent). The framework tracks:

  • Cursors: Where each participant is typing.
  • Selection rings: What text or elements are selected.
  • Presence: Who is on which slide, page, or section.

The agent appears as a first-class peer editor. If you are editing a paragraph and the agent is editing the next one, both changes persist. If you both edit the same sentence, the CRDT algorithm merges the edits based on operation order and causal dependencies, not just timestamps.

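Real-time co-editing of this kind is familiar from collaborative editors like Figma and Google Docs (Google Docs achieves it with operational transformation rather than CRDTs). Agent-Native applies the idea to agent-human collaboration instead of human-human collaboration.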
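To make the merge behavior concrete, here is a minimal sketch of a sequence CRDT in the style of RGA (replicated growable array). It is a generic textbook technique, not Agent-Native’s actual algorithm, which the source does not describe. `Replica`, `Elem`, and all method names are hypothetical. Each character gets a unique ID (counter plus actor) and remembers the character it was inserted after; deletions are tombstones; merging is a union of elements, so replicas converge regardless of merge order.

```typescript
type Actor = "human" | "agent";

interface Elem {
  id: string;
  counter: number;
  actor: Actor;
  parent: string | null; // ID of the element this one was inserted after
  char: string;
  deleted: boolean; // tombstone: deleted elements stay in the structure
}

class Replica {
  private elems = new Map<string, Elem>();
  private counter = 0;

  constructor(readonly actor: Actor) {}

  insertAfter(parent: string | null, char: string): string {
    this.counter += 1;
    const id = `${this.counter}@${this.actor}`;
    this.elems.set(id, { id, counter: this.counter, actor: this.actor, parent, char, deleted: false });
    return id;
  }

  // Types a run of characters and returns the ID of the last one.
  typeAfter(parent: string | null, text: string): string | null {
    let last = parent;
    for (const char of text.split("")) last = this.insertAfter(last, char);
    return last;
  }

  remove(id: string): void {
    const elem = this.elems.get(id);
    if (elem) elem.deleted = true;
  }

  // State-based merge: union of elements, deletion wins. Commutative and idempotent.
  merge(other: Replica): void {
    other.elems.forEach((theirs, id) => {
      const mine = this.elems.get(id);
      if (mine) mine.deleted = mine.deleted || theirs.deleted;
      else this.elems.set(id, { ...theirs });
      this.counter = Math.max(this.counter, theirs.counter);
    });
  }

  text(): string {
    const children = new Map<string | null, Elem[]>();
    this.elems.forEach((e) => {
      const list = children.get(e.parent) ?? [];
      list.push(e);
      children.set(e.parent, list);
    });
    const out: string[] = [];
    const visit = (parent: string | null): void => {
      // Siblings are ordered by (counter, actor) descending, identically on every replica.
      const kids = (children.get(parent) ?? []).sort(
        (a, b) => b.counter - a.counter || (a.actor < b.actor ? 1 : -1),
      );
      for (const kid of kids) {
        if (!kid.deleted) out.push(kid.char);
        visit(kid.id);
      }
    };
    visit(null);
    return out.join("");
  }
}

const human = new Replica("human");
const agent = new Replica("agent");

const end = human.typeAfter(null, "Hello");
agent.merge(human);

// Concurrent edits at the same position, made before either side syncs.
human.insertAfter(end, "!");
agent.typeAfter(end, " world");

human.merge(agent);
agent.merge(human);
// Both replicas now render "Hello! world".
```

The tie-break on actor name is arbitrary; what matters is that every replica applies the same deterministic rule, so no edit is dropped and no coordination is needed at merge time.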

Per-User Workspace Structure

Each user workspace is SQL-backed and includes:

| Component | Purpose | Storage |
| --- | --- | --- |
| Skills | Reusable agent capabilities (e.g., “send email,” “query database”) | SQL table with skill definitions and parameters |
| Memory | Context the agent retains across sessions | SQL table with key-value pairs or vector embeddings |
| Instructions | User-specific prompts and behavior rules | SQL table with instruction text and priority |
| Sub-agents | Specialized agents the user can invoke | SQL table with agent metadata and tool grants |
| MCP servers | Model Context Protocol servers for external integrations | SQL table with connection strings and credential refs |

This structure means you can build SaaS-grade multi-tenant agent applications where each user customizes their agent’s behavior, tools, and memory. The agent does not share state across users unless you explicitly grant access.
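A minimal sketch of per-user isolation with explicit grants, assuming in-memory maps in place of the SQL tables. The `Workspace` shape, the `sharedWith` field, and `loadWorkspace` are hypothetical; the real table layout is not shown in the source.

```typescript
// Hypothetical record shape for one user's workspace.
interface Workspace {
  userId: string;
  skills: string[];
  instructions: string[];
  memory: Map<string, string>;
  sharedWith: Set<string>; // user IDs that were explicitly granted access
}

const workspaces = new Map<string, Workspace>();

function createWorkspace(userId: string): Workspace {
  const ws: Workspace = {
    userId,
    skills: [],
    instructions: [],
    memory: new Map(),
    sharedWith: new Set(),
  };
  workspaces.set(userId, ws);
  return ws;
}

// Every lookup is scoped by requester; reading another user's
// workspace requires an explicit grant.
function loadWorkspace(requesterId: string, ownerId: string): Workspace {
  const ws = workspaces.get(ownerId);
  if (!ws) throw new Error(`no workspace for ${ownerId}`);
  if (requesterId !== ownerId && !ws.sharedWith.has(requesterId)) {
    throw new Error(`${requesterId} has no access to the workspace of ${ownerId}`);
  }
  return ws;
}

const alice = createWorkspace("alice");
createWorkspace("bob");
alice.memory.set("tone", "concise");

// Alice grants Bob access; nobody else can read her workspace.
alice.sharedWith.add("bob");
const viaGrant = loadWorkspace("bob", "alice");
```

In a SQL backend the same rule would typically be a `WHERE user_id = ?` filter or a row-level security policy rather than an application-side check.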

Agent-to-Agent Protocol (A2A)

Agent-Native supports agent-to-agent communication. If you tag another agent from any app, the framework uses the A2A protocol to discover and invoke it. The protocol works like this:

  1. Discovery: The calling agent queries a registry (SQL table or external service) to find the target agent’s endpoint and capabilities.
  2. Invocation: The calling agent sends a structured request (JSON-RPC or similar) with the task, context, and credentials.
  3. Response: The target agent executes the task and returns structured output, which the calling agent merges into its own state.

This allows you to build multi-agent systems where agents specialize in different domains (e.g., one agent handles email, another handles analytics) and call each other as needed.
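The three steps can be sketched as follows, assuming JSON-RPC 2.0 message shapes (the source says only “JSON-RPC or similar”) and an in-memory registry. `AgentEntry`, `discover`, and `invoke` are hypothetical names, the `handle` function stands in for a network endpoint, and credentials are omitted for brevity.

```typescript
// JSON-RPC 2.0 request and response shapes.
interface RpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: Record<string, unknown>;
}

interface RpcResponse {
  jsonrpc: "2.0";
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
}

// Hypothetical registry entry: what an agent can do and how to reach it.
interface AgentEntry {
  name: string;
  capabilities: string[];
  handle: (req: RpcRequest) => RpcResponse; // stands in for an HTTP endpoint
}

const registry = new Map<string, AgentEntry>();

// 1. Discovery: look up the target agent and check its capabilities.
function discover(name: string, capability: string): AgentEntry | undefined {
  const entry = registry.get(name);
  return entry && entry.capabilities.includes(capability) ? entry : undefined;
}

// 2. Invocation and 3. Response.
let nextId = 1;
function invoke(name: string, method: string, params: Record<string, unknown>): RpcResponse {
  const id = nextId++;
  const target = discover(name, method);
  if (!target) {
    // -32601 is the JSON-RPC 2.0 "Method not found" error code.
    return { jsonrpc: "2.0", id, error: { code: -32601, message: `Method not found: ${name}.${method}` } };
  }
  return target.handle({ jsonrpc: "2.0", id, method, params });
}

registry.set("analytics", {
  name: "analytics",
  capabilities: ["summarize"],
  handle: (req) => ({
    jsonrpc: "2.0",
    id: req.id,
    result: { summary: `summary of ${String(req.params.metric)}` },
  }),
});

const ok = invoke("analytics", "summarize", { metric: "signups" });
const missing = invoke("analytics", "sendEmail", {});
```

Returning a structured error instead of throwing lets the calling agent decide whether to retry, pick another agent, or report the failure to the user.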

Three Deployment Shapes

Agent-Native apps can be deployed in three configurations:

  • Headless API: The agent runs as a REST or GraphQL API. No UI. Other services call it programmatically.
  • Chat experience: The agent runs inside a chat interface. Users interact via natural language. The framework provides a reusable chat shell.
  • Full application: The agent and UI are both present. Users can click buttons or ask the agent to perform the same actions. State stays synchronized.

You can start with one shape and migrate to another without rewriting the agent logic. The same agent code runs in all three modes.
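One way to picture “same agent code, three shapes” is a single core function with a thin adapter per shape. This is a sketch under that assumption, not the framework’s actual wiring; `markDone` and the three adapters are hypothetical, and a regular expression stands in for the LLM that would normally choose the action in chat mode.

```typescript
interface TaskResult {
  taskId: string;
  status: string;
}

const taskStatus = new Map<string, string>([["123", "todo"]]);

// Core logic, written once.
function markDone(taskId: string): TaskResult {
  if (!taskStatus.has(taskId)) throw new Error(`unknown task ${taskId}`);
  taskStatus.set(taskId, "done");
  return { taskId, status: "done" };
}

// Headless API shape: structured request in, JSON out.
function apiAdapter(body: { taskId: string }): string {
  return JSON.stringify(markDone(body.taskId));
}

// Chat shape: text in, text out. The regex is a placeholder for LLM tool selection.
function chatAdapter(message: string): string {
  const match = /mark task (\S+) as done/i.exec(message);
  if (!match) return "Sorry, I did not understand that.";
  const result = markDone(match[1]);
  return `Task ${result.taskId} is now ${result.status}.`;
}

// Full application shape: a button handler calls the same function.
function onMarkDoneClick(taskId: string): TaskResult {
  return markDone(taskId);
}
```

Moving from one shape to another then means swapping the adapter, while `markDone` and the state it touches stay unchanged.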

Reusable Integrations

Agent-Native includes a shared integration layer called Dispatch. When you connect a provider (e.g., Gmail, Stripe, Notion), you configure it once in Dispatch. The framework stores:

  • Account metadata: API endpoints, rate limits, supported operations.
  • Credential refs: Pointers to secrets in a vault (not the secrets themselves).

Other apps (Brain, Analytics, Mail, Dispatch itself) can request access to the shared account metadata and credential refs. This eliminates the need to re-authenticate or re-configure the same integration in multiple apps.
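The split between metadata, credential refs, and secrets can be sketched as below. All names (`Integration`, `requestIntegration`, `resolveSecret`, the `vault://` ref format, and the example endpoint) are hypothetical, and a map stands in for the external vault.

```typescript
// Hypothetical integration record: holds a pointer to the secret, never the secret.
interface Integration {
  provider: string;
  apiBase: string;
  credentialRef: string;
  allowedApps: string[];
}

// Stand-in for an external secrets vault.
const vault = new Map<string, string>([["vault://gmail/acct-1", "s3cr3t-token"]]);

const integrations = new Map<string, Integration>([
  [
    "gmail",
    {
      provider: "gmail",
      apiBase: "https://api.example.com/gmail", // placeholder endpoint
      credentialRef: "vault://gmail/acct-1",
      allowedApps: ["mail", "brain"],
    },
  ],
]);

// An app asks the shared layer for an integration configured once.
function requestIntegration(app: string, provider: string): Integration {
  const integration = integrations.get(provider);
  if (!integration) throw new Error(`provider ${provider} is not connected`);
  if (!integration.allowedApps.includes(app)) {
    throw new Error(`${app} has no access to ${provider}`);
  }
  return integration;
}

// The secret is resolved only at call time and is never stored with the metadata.
function resolveSecret(ref: string): string {
  const secret = vault.get(ref);
  if (secret === undefined) throw new Error(`unknown credential ref ${ref}`);
  return secret;
}

const forMail = requestIntegration("mail", "gmail");
const token = resolveSecret(forMail.credentialRef);
```

Because apps only ever hold the ref, rotating a token in the vault requires no change to any app’s stored configuration.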

Code Example: Shared Action

Here is how you define an action that works from both UI and agent:

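import { defineAction } from '@agent-native/core';
import { z } from 'zod';
import { eq } from 'drizzle-orm';
// Assumed local modules: a Drizzle table definition and a broadcast helper.
import { tasks } from './schema';
import { broadcast } from './realtime';

export const updateTaskStatus = defineAction({
  name: 'updateTaskStatus',
  // Input is validated against this schema before execute() runs
  schema: z.object({
    taskId: z.string(),
    status: z.enum(['todo', 'in-progress', 'done']),
  }),
  async execute({ taskId, status }, { db, user }) {
    // Update SQL state
    await db.update(tasks)
      .set({ status, updatedAt: new Date() })
      .where(eq(tasks.id, taskId));

    // Broadcast to all connected clients (CRDT sync)
    await broadcast({
      type: 'task.updated',
      taskId,
      status,
      userId: user.id,
    });

    return { success: true };
  },
});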

The UI can call this action when the user clicks a button:

<button onClick={() => updateTaskStatus({ taskId: '123', status: 'done' })}>
  Mark Done
</button>

The agent can call the same action when the user says “mark task 123 as done”:

await agent.call('updateTaskStatus', { taskId: '123', status: 'done' });

Both paths execute the same function, update the same database, and broadcast the same CRDT event.

Observability and Failure Modes

Agent-Native apps expose observability hooks for:

  • Action logs: Every action execution is logged with timestamp, user, input, output, and duration.
  • CRDT events: Every state change is logged with the actor (human or agent) and the merge result.
  • Agent traces: Tool calls, LLM requests, and sub-agent invocations are traced end-to-end.
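A minimal sketch of the action-log idea: a wrapper that records timestamp, user, input, output, and duration for every call. `withLogging`, `ActionLogEntry`, and the in-memory `actionLog` array are hypothetical and are not the framework’s hook API.

```typescript
interface ActionLogEntry<I, O> {
  action: string;
  userId: string;
  input: I;
  output?: O;
  error?: string;
  startedAt: string; // ISO 8601 timestamp
  durationMs: number;
}

// Stand-in for a log table or tracing backend.
const actionLog: ActionLogEntry<unknown, unknown>[] = [];

function withLogging<I, O>(name: string, fn: (input: I) => O): (userId: string, input: I) => O {
  return (userId, input) => {
    const started = Date.now();
    const entry: ActionLogEntry<I, O> = {
      action: name,
      userId,
      input,
      startedAt: new Date(started).toISOString(),
      durationMs: 0,
    };
    try {
      const output = fn(input);
      entry.output = output;
      return output;
    } catch (err) {
      entry.error = err instanceof Error ? err.message : String(err);
      throw err;
    } finally {
      // Runs on success and failure, so failed actions are logged too.
      entry.durationMs = Date.now() - started;
      actionLog.push(entry);
    }
  };
}

const double = withLogging("double", (n: number) => n * 2);
const answer = double("user-1", 21);
```

Logging in `finally` matters: the entries you most need during an incident are the ones for actions that threw.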

Likely failure modes:

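  • CRDT merge conflicts: CRDT merging always converges, but the merged result can still be semantically wrong when two actors make incompatible changes (e.g., one deletes a paragraph while the other rewrites it). The framework logs a conflict event. You must decide how to resolve it (e.g., prefer human edits, show a diff, ask the user).
  • SQL lock contention and deadlocks: When the agent and UI update the same row simultaneously, the database serializes the writes, so a read-modify-write cycle can silently overwrite the other actor’s change. Transactions that lock several rows in different orders can deadlock. Use a consistent lock order, row-level locking, or optimistic concurrency control.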
  • Agent timeout: If the agent takes too long to respond, the UI should show a loading state and allow the user to cancel. The framework does not enforce a timeout by default.
  • Credential expiry: If a credential ref points to an expired token, the integration will fail. The framework should catch this and prompt the user to re-authenticate.
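Optimistic concurrency control, one of the mitigations named above, can be sketched with a version column. This is a generic pattern, not something the source shows Agent-Native doing; `TaskRow`, `updateIfUnchanged`, and the in-memory `table` are hypothetical.

```typescript
interface TaskRow {
  id: string;
  status: string;
  version: number;
}

const table = new Map<string, TaskRow>([["123", { id: "123", status: "todo", version: 1 }]]);

// Equivalent in spirit to:
//   UPDATE tasks SET status = ?, version = version + 1
//   WHERE id = ? AND version = ?
// The write succeeds only if nobody changed the row since it was read.
function updateIfUnchanged(id: string, expectedVersion: number, status: string): boolean {
  const row = table.get(id);
  if (!row || row.version !== expectedVersion) return false; // stale read
  table.set(id, { ...row, status, version: row.version + 1 });
  return true;
}

// Both actors read version 1, then race to write.
const uiRead = table.get("123")!;
const agentRead = table.get("123")!;

const uiWrote = updateIfUnchanged("123", uiRead.version, "in-progress");
const agentWrote = updateIfUnchanged("123", agentRead.version, "done");

// The agent lost the race: it must re-read and decide whether to retry.
const fresh = table.get("123")!;
const agentRetried = !agentWrote && updateIfUnchanged("123", fresh.version, "done");
```

No locks are held between read and write, so this approach cannot deadlock; the cost is that the loser has to handle a rejected write.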

Security Boundaries

Agent-Native provides security primitives at three levels:

  1. User isolation: Each user’s workspace is isolated in SQL using row-level security or tenant IDs. Agents cannot access other users’ data unless explicitly granted.
  2. Credential vault: Secrets are stored in a separate vault (e.g., HashiCorp Vault, AWS Secrets Manager). The framework only stores credential refs, not the secrets themselves.
  3. Action permissions: Each action declares which roles can execute it. The framework checks permissions before executing.

You must configure these boundaries yourself. The framework provides the primitives but does not enforce a default security model.
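Since the configuration is left to you, here is a minimal sketch of the action-permission check. `GuardedAction`, `Caller`, `execute`, and the role names are hypothetical and do not reflect the framework’s real declaration syntax.

```typescript
type Role = "viewer" | "editor" | "admin";

interface Caller {
  userId: string;
  roles: Role[];
  kind: "human" | "agent"; // an agent acts with the roles of the user it serves
}

interface GuardedAction<I, O> {
  name: string;
  allowedRoles: Role[];
  run: (input: I) => O;
}

// The permission check happens before the action body runs,
// and applies equally to human and agent callers.
function execute<I, O>(action: GuardedAction<I, O>, caller: Caller, input: I): O {
  const permitted = caller.roles.some((role) => action.allowedRoles.includes(role));
  if (!permitted) {
    throw new Error(`${caller.userId} (${caller.kind}) may not run ${action.name}`);
  }
  return action.run(input);
}

const deleteTask: GuardedAction<{ taskId: string }, { deleted: string }> = {
  name: "deleteTask",
  allowedRoles: ["admin"],
  run: ({ taskId }) => ({ deleted: taskId }),
};

const admin: Caller = { userId: "ana", roles: ["admin"], kind: "human" };
const agentForEditor: Caller = { userId: "ben", roles: ["editor"], kind: "agent" };

const allowed = execute(deleteTask, admin, { taskId: "123" });
```

Routing the agent through the same guard as the UI means a prompt cannot talk the agent into an action its user is not allowed to perform.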

When to Use Agent-Native

Use Agent-Native when:

  • You want agents and humans to collaborate on the same data in real time.
  • You need multi-tenant agent applications with per-user customization.
  • You want to deploy the same agent as an API, a chat interface, or a full application.
  • You need fine-grained control over state, permissions, and integrations.

Avoid Agent-Native when:

  • You are building a simple chatbot that does not need persistent state.
  • You do not want to manage SQL databases or CRDT merging.
  • You need a fully managed agent platform with built-in hosting and observability.

Technical Verdict

Agent-Native solves the state synchronization problem that most agent frameworks ignore. By sharing one database and one state between UI and agent, it eliminates the drift that happens when chat interfaces and application state live in separate systems. The CRDT merging is the key innovation: it allows humans and agents to edit the same document simultaneously without one side overwriting the other.

The trade-off is complexity. You must manage SQL schemas, CRDT logic, and security boundaries yourself. If you are building a simple chatbot, this is overkill. If you are building a collaborative application where agents and humans work together on the same data, this is the right foundation.

The framework is backend-agnostic and supports any SQL database and any hosting target. This is a strong signal that the team is building for flexibility, not lock-in.