NemoClaw: How NVIDIA Built a Sandboxed Runtime for Always-On AI Agents

NVIDIA released NemoClaw as a reference stack for running persistent AI agents inside hardened sandboxes. The project addresses the gap between agent demos and production deployments by wrapping OpenClaw and Hermes agents in NVIDIA OpenShell containers with managed inference routing, network policy enforcement, and blueprint-based lifecycle control.

NemoClaw is not an agent framework. It is infrastructure plumbing that sits between your host system and the agent runtime. It provides isolation, state persistence, and a single CLI for onboarding, starting, stopping, and upgrading agents without rewriting agent code.

What Problem NemoClaw Solves

Most agent frameworks assume you run them in a trusted environment. They call APIs, execute shell commands, read files, and modify state with the same privileges as the user who launched them. This works for demos but breaks down when you want to leave an agent running overnight, hand it credentials, or deploy it in a regulated environment.

NemoClaw provides:

Sandbox isolation via OpenShell containers with restricted syscalls and filesystem access
Routed inference so agents cannot bypass your model gateway or call arbitrary endpoints
Network policy to limit egress and prevent data exfiltration
Blueprint lifecycle to manage agent state, configuration, and upgrades without manual container wrangling
Host-side state persistence so agents survive restarts without losing context

The stack supports OpenClaw (default) and Hermes agents out of the box. You switch between them by setting the NEMOCLAW_AGENT environment variable before install, or by using the nemohermes alias.

Architecture: Blueprints, Plugins, and Routing Layers

NemoClaw uses a three-layer architecture:

Host-side CLI manages blueprints, provisions sandboxes, and persists state outside the container
OpenShell sandbox runs the agent with restricted capabilities and enforced network policy
Plugin layer inside the sandbox handles agent-specific setup, inference routing, and tool execution

Blueprint Lifecycle

A blueprint is a declarative specification for an agent deployment. It defines:

Which agent binary to run (OpenClaw or Hermes)
Inference endpoint configuration (local model, API gateway, or routed proxy)
Network policy rules (allowed egress domains, blocked ports)
Filesystem mounts (read-only config, writable state directory)
Resource limits (CPU, memory, GPU allocation)
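
As a rough illustration, the fields listed above could be modeled with a type like the one below. This is a sketch, not the real schema: apart from agent.type and the network keys that appear elsewhere in this article, every field name, the agent type strings, and the /workspace/config path are assumptions made for the example.

```typescript
// Hypothetical blueprint shape. Field names are illustrative, not the real schema.
type AgentType = "openclaw" | "hermes";

interface Blueprint {
  agent: { type: AgentType };
  inference: { endpoint: string; allowedModels: string[] };
  network: {
    egress: { allowedDomains: string[]; blockedPorts: number[] };
    dns: { resolver: "host" | "sandbox" };
  };
  mounts: { configPath: string; statePath: string };
  resources: { cpus: number; memoryMb: number; gpus: number };
}

// Returns a list of problems; an empty list means the basic checks passed.
function validateBlueprint(bp: Blueprint): string[] {
  const problems: string[] = [];
  if (bp.inference.allowedModels.length === 0) {
    problems.push("inference.allowedModels is empty, so no model call would be permitted");
  }
  for (const port of bp.network.egress.blockedPorts) {
    if (!Number.isInteger(port) || port < 1 || port > 65535) {
      problems.push(`network.egress.blockedPorts contains invalid port ${port}`);
    }
  }
  if (bp.resources.cpus <= 0 || bp.resources.memoryMb <= 0 || bp.resources.gpus < 0) {
    problems.push("resources.cpus and resources.memoryMb must be positive; gpus cannot be negative");
  }
  return problems;
}

const exampleBlueprint: Blueprint = {
  agent: { type: "openclaw" },
  inference: { endpoint: "http://127.0.0.1:8000/v1", allowedModels: ["nim/example-model"] },
  network: {
    egress: { allowedDomains: ["api.openai.com", "github.com"], blockedPorts: [25, 3389] },
    dns: { resolver: "host" },
  },
  mounts: { configPath: "/workspace/config", statePath: "/workspace/state" },
  resources: { cpus: 2, memoryMb: 4096, gpus: 1 },
};
```

Checking a blueprint before provisioning keeps obvious mistakes, such as an empty model allowlist, from surfacing only after the sandbox is already running.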

When you run nemoclaw start, the CLI:

Reads the blueprint from ~/.nemoclaw/blueprints/<agent-name>.yaml
Provisions an OpenShell sandbox with the specified constraints
Mounts host-side state into /workspace/state inside the container
Injects environment variables for inference routing and API keys
Starts the agent process and attaches logging
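
The mount and environment steps in this sequence can be pictured as a pure function from a few inputs to a launch description. The sketch below uses the paths mentioned in this article; the type names, the function, and the EXAMPLE_INFERENCE_URL variable are invented for illustration and are not part of the NemoClaw CLI.

```typescript
// Illustrative only: models the blueprint lookup, state mount, and environment
// injection as data. The EXAMPLE_* variable name is a placeholder, not a real setting.
interface LaunchInput {
  agentName: string;
  homeDir: string; // for example "/home/alice"
  inferenceUrl: string; // where the in-sandbox proxy forwards requests
  apiKeys: Record<string, string>;
}

interface Mount {
  hostPath: string;
  sandboxPath: string;
  readOnly: boolean;
}

interface LaunchSpec {
  blueprintPath: string;
  mounts: Mount[];
  env: Record<string, string>;
}

function buildLaunchSpec(input: LaunchInput): LaunchSpec {
  const base = `${input.homeDir}/.nemoclaw`;
  return {
    blueprintPath: `${base}/blueprints/${input.agentName}.yaml`,
    mounts: [
      {
        hostPath: `${base}/state/${input.agentName}`,
        sandboxPath: "/workspace/state",
        readOnly: false,
      },
    ],
    // API keys are copied in as-is, alongside the routing variable.
    env: { ...input.apiKeys, EXAMPLE_INFERENCE_URL: input.inferenceUrl },
  };
}
```

Keeping this step free of side effects means the launch description can be logged or compared before anything is provisioned.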

When you run nemoclaw stop, the CLI:

Sends SIGTERM to the agent process
Waits for graceful shutdown (default 30 seconds)
Persists any in-memory state to the host-side state directory
Tears down the sandbox but keeps the blueprint and logs

This differs from standard container orchestration in that the blueprint is versioned. On upgrade, the CLI diffs the old and new blueprints, migrates state, and restarts the sandbox with minimal downtime.
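
To make the diffing step concrete, here is a minimal sketch that compares two blueprint versions and reports which dotted paths changed. It treats blueprints as plain nested objects and compares arrays as whole values. How the actual CLI computes its diff is not described here, so this is one plausible approach rather than the real algorithm.

```typescript
// Sketch of a blueprint diff. Objects are walked key by key; leaves and arrays
// are compared by their serialized form.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

interface Change {
  path: string;
  before: Json | undefined;
  after: Json | undefined;
}

function isPlainObject(value: Json | undefined): value is { [key: string]: Json } {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function diffBlueprints(before: Json | undefined, after: Json | undefined, path = ""): Change[] {
  if (isPlainObject(before) && isPlainObject(after)) {
    // Union of keys from both sides, sorted so the output order is stable.
    const addedKeys = Object.keys(after).filter((key) => !(key in before));
    const keys = Object.keys(before).concat(addedKeys).sort();
    const changes: Change[] = [];
    for (const key of keys) {
      const childPath = path === "" ? key : `${path}.${key}`;
      changes.push(...diffBlueprints(before[key], after[key], childPath));
    }
    return changes;
  }
  return JSON.stringify(before) === JSON.stringify(after) ? [] : [{ path, before, after }];
}
```

A list of changed paths lets an upgrade routine decide what to do next: a change under network might only need a sandbox restart, while a change to the state layout would call for a migration.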

Routed Inference

Routed inference means the agent does not call model APIs directly. Instead, it sends requests to a local proxy running inside the sandbox. The proxy:

Validates the request against a schema (model name, max tokens, temperature)
Checks the blueprint’s allowed model list
Routes the request to the configured backend (NVIDIA NIM, OpenAI, local vLLM)
Logs the request and response for observability
Returns the result to the agent

This gives you a single point of control for model access. You can swap backends, enforce rate limits, or inject prompt guards without changing agent code.

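// Inside the NemoClaw plugin (simplified; the type shapes below are illustrative)
interface InferenceRequest { model: string; prompt: string; maxTokens?: number; temperature?: number }
interface InferenceResponse { text: string }
interface InferenceBackend { complete(request: InferenceRequest): Promise<InferenceResponse> }
interface BlueprintConfig { allowedModels: string[] }

export class InferenceRouter {
  constructor(
    private config: BlueprintConfig,
    private nimBackend: InferenceBackend,
    private defaultBackend: InferenceBackend,
  ) {}

  async route(request: InferenceRequest): Promise<InferenceResponse> {
    // Reject models outside the blueprint allowlist before touching any backend
    if (!this.config.allowedModels.includes(request.model)) {
      throw new Error(`Model ${request.model} not in blueprint allowlist`);
    }

    const backend = this.selectBackend(request.model);
    const response = await backend.complete(request);

    await this.logRequest(request, response);
    return response;
  }

  private selectBackend(model: string): InferenceBackend {
    // Route based on model name prefix or explicit mapping
    if (model.startsWith("nim/")) {
      return this.nimBackend;
    }
    return this.defaultBackend;
  }

  private async logRequest(request: InferenceRequest, response: InferenceResponse): Promise<void> {
    // Placeholder: a real router would write a structured log entry here
    console.log(JSON.stringify({ model: request.model, responseChars: response.text.length }));
  }
}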

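The router runs inside the sandbox, and the sandbox’s egress rules limit which destinations the agent process can reach. A compromised agent therefore cannot call endpoints outside the network policy, although the router itself remains part of the attack surface (see Failure Modes and Attack Surface).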
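
The simplified router above skips the first step in the proxy’s list, schema validation. A minimal sketch of that step follows. The field names, the limit values, and the result type are assumptions for illustration; the real schema is not shown in this article.

```typescript
// Sketch of request validation for model name, max tokens, and temperature.
interface InferenceRequestShape {
  model: string;
  maxTokens: number;
  temperature: number;
}

interface RequestLimits {
  maxTokensCeiling: number;
  minTemperature: number;
  maxTemperature: number;
}

type ValidationResult =
  | { ok: true; request: InferenceRequestShape }
  | { ok: false; errors: string[] };

function validateRequest(input: unknown, limits: RequestLimits): ValidationResult {
  if (typeof input !== "object" || input === null) {
    return { ok: false, errors: ["request must be an object"] };
  }
  const fields = input as Record<string, unknown>;
  const model = fields["model"];
  const maxTokens = fields["maxTokens"];
  const temperature = fields["temperature"];
  const errors: string[] = [];

  if (typeof model !== "string" || model.trim() === "") {
    errors.push("model must be a non-empty string");
  }
  if (
    typeof maxTokens !== "number" || !Number.isInteger(maxTokens) ||
    maxTokens < 1 || maxTokens > limits.maxTokensCeiling
  ) {
    errors.push(`maxTokens must be an integer between 1 and ${limits.maxTokensCeiling}`);
  }
  if (
    typeof temperature !== "number" || !Number.isFinite(temperature) ||
    temperature < limits.minTemperature || temperature > limits.maxTemperature
  ) {
    errors.push(`temperature must be between ${limits.minTemperature} and ${limits.maxTemperature}`);
  }
  // The typeof checks are repeated so the compiler can narrow the three values.
  if (
    errors.length > 0 || typeof model !== "string" ||
    typeof maxTokens !== "number" || typeof temperature !== "number"
  ) {
    return { ok: false, errors };
  }
  return { ok: true, request: { model, maxTokens, temperature } };
}
```

Collecting every error instead of stopping at the first one gives the agent, or whoever reads the logs, a complete picture of why a request was rejected.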

Network Policy Enforcement

NemoClaw uses OpenShell’s network namespace isolation to enforce egress rules. The blueprint specifies:

Allowed domains (e.g., api.openai.com, github.com)
Blocked ports (e.g., no outbound SMTP on port 25)
DNS resolution policy (use host resolver or sandbox-local DNS)

The OpenShell runtime translates these rules into iptables or nftables rules inside the sandbox. The agent sees a normal network stack but cannot reach disallowed destinations.

Example blueprint snippet:

network:
  egress:
    allowedDomains:
      - api.openai.com
      - github.com
    blockedPorts:
      - 25  # SMTP
      - 3389  # RDP
  dns:
    resolver: host

This does not prevent all exfiltration (an agent could encode data in allowed HTTP requests), but it raises the bar for deliberate exfiltration and makes accidental leaks less likely.
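
The decision these rules encode can be modeled in a few lines. The sketch below is only a model for reasoning about a policy, since real enforcement happens in the OpenShell runtime. It makes two assumptions that this article does not confirm: domains must match an allowlist entry exactly, and a blocked port takes precedence over an allowed domain.

```typescript
// Models the egress decision only. Not how OpenShell enforces it.
interface EgressPolicy {
  allowedDomains: string[];
  blockedPorts: number[];
}

type EgressDecision = { allowed: true } | { allowed: false; reason: string };

function checkEgress(policy: EgressPolicy, host: string, port: number): EgressDecision {
  // Assumption: blocked ports win even when the domain is on the allowlist.
  if (policy.blockedPorts.indexOf(port) !== -1) {
    return { allowed: false, reason: `port ${port} is blocked` };
  }
  // Normalize case and a trailing dot so "GitHub.com." matches "github.com".
  const normalized = host.toLowerCase().replace(/\.$/, "");
  // Assumption: exact match only, no wildcard or subdomain matching.
  if (policy.allowedDomains.indexOf(normalized) === -1) {
    return { allowed: false, reason: `${normalized} is not in allowedDomains` };
  }
  return { allowed: true };
}

const examplePolicy: EgressPolicy = {
  allowedDomains: ["api.openai.com", "github.com"],
  blockedPorts: [25, 3389],
};
```

A model like this is handy in tests: you can assert that a blueprint change does not accidentally open a destination before the change reaches a running sandbox.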

State Management and Persistence

NemoClaw persists agent state in three places:

State Type	Location	Lifecycle
Blueprint config	`~/.nemoclaw/blueprints/`	Survives sandbox restarts, manually versioned
Agent working state	`~/.nemoclaw/state/<agent-name>/`	Mounted into sandbox, survives restarts
Logs and telemetry	`~/.nemoclaw/logs/<agent-name>/`	Rotated daily, retained for 7 days by default

The agent writes to /workspace/state inside the sandbox. This directory is bind-mounted from the host, so writes persist even if the sandbox crashes. The CLI handles state migration during blueprint upgrades by:

Stopping the old sandbox
Running a migration script defined in the new blueprint
Starting the new sandbox with the migrated state

This is similar to database migrations but for agent memory and context.
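
The database-migration analogy can be sketched as a chain of versioned steps applied in order. The state shape, the schemaVersion field, and the idea of numbered steps are assumptions made for this example; the article only says that the new blueprint defines a migration script.

```typescript
// Sketch of versioned state migration. All names here are illustrative.
interface AgentState {
  schemaVersion: number;
  data: Record<string, unknown>;
}

interface Migration {
  from: number;
  to: number;
  apply(data: Record<string, unknown>): Record<string, unknown>;
}

function migrateState(state: AgentState, migrations: Migration[], targetVersion: number): AgentState {
  // Work on a copy so a failed migration leaves the caller's state untouched.
  let current: AgentState = { schemaVersion: state.schemaVersion, data: { ...state.data } };
  while (current.schemaVersion < targetVersion) {
    const fromVersion = current.schemaVersion;
    const step = migrations.find((m) => m.from === fromVersion);
    if (step === undefined) {
      throw new Error(`no migration registered from version ${fromVersion}`);
    }
    if (step.to <= step.from) {
      throw new Error(`migration from version ${step.from} must move to a higher version`);
    }
    current = { schemaVersion: step.to, data: step.apply(current.data) };
  }
  return current;
}

const exampleMigrations: Migration[] = [
  { from: 1, to: 2, apply: (d) => ({ ...d, notes: [] }) },
  { from: 2, to: 3, apply: (d) => ({ ...d, lastRunAt: null }) },
];
```

Because the function throws before returning anything when a step is missing, a caller can keep the old state directory intact and refuse to start the new sandbox.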

Multi-Agent Support Without Code Changes

NemoClaw supports OpenClaw and Hermes agents through a plugin system. Each plugin implements:

setup(): Install dependencies, configure environment
start(): Launch the agent process
stop(): Graceful shutdown
healthCheck(): Readiness probe
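
Rendered as a TypeScript interface, the four methods might look like the following. The method names come from the list above, but the return types are assumptions, and the runtime check is just one way a loader could reject a module that does not implement all four.

```typescript
// Hypothetical rendering of the plugin contract; the real signatures may differ.
interface AgentPlugin {
  setup(): Promise<void>;
  start(): Promise<void>;
  stop(): Promise<void>;
  healthCheck(): Promise<boolean>;
}

const REQUIRED_METHODS = ["setup", "start", "stop", "healthCheck"] as const;

// Runtime check a loader could run before trusting a dynamically loaded module.
function isAgentPlugin(candidate: unknown): candidate is AgentPlugin {
  if (typeof candidate !== "object" || candidate === null) return false;
  const record = candidate as Record<string, unknown>;
  return REQUIRED_METHODS.every((name) => typeof record[name] === "function");
}

// Minimal in-memory plugin used only to exercise the check.
class NoopPlugin implements AgentPlugin {
  private running = false;

  async setup(): Promise<void> {
    // Nothing to install in this example.
  }

  async start(): Promise<void> {
    this.running = true;
  }

  async stop(): Promise<void> {
    this.running = false;
  }

  async healthCheck(): Promise<boolean> {
    return this.running;
  }
}
```

The check only confirms that the methods exist. It cannot verify that they behave correctly, so the health check still matters once the agent is running.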

The CLI loads the plugin based on the NEMOCLAW_AGENT environment variable or the blueprint’s agent.type field. You can add a new agent by:

Writing a plugin that implements the interface
Adding the plugin to ~/.nemoclaw/plugins/
Creating a blueprint that references the plugin

No changes to the CLI or core runtime are required. This abstraction lets infrastructure teams standardize on NemoClaw while letting agent developers iterate independently.
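
Plugin selection could work roughly as sketched below. The precedence between the two sources is not stated in this article, so the sketch assumes that NEMOCLAW_AGENT overrides the blueprint’s agent.type and that OpenClaw is the fallback. The registry class and the lowercase type names are hypothetical.

```typescript
// Sketch of plugin selection. The precedence order and the lowercase names
// "openclaw" and "hermes" are assumptions, not confirmed behavior.
interface LoadedPlugin {
  name: string;
}

function resolveAgentType(
  env: Record<string, string | undefined>,
  blueprintAgentType: string | undefined,
): string {
  const fromEnv = (env["NEMOCLAW_AGENT"] ?? "").trim().toLowerCase();
  if (fromEnv !== "") return fromEnv;
  const fromBlueprint = (blueprintAgentType ?? "").trim().toLowerCase();
  if (fromBlueprint !== "") return fromBlueprint;
  return "openclaw"; // OpenClaw is described as the default agent
}

class PluginRegistry {
  private factories = new Map<string, () => LoadedPlugin>();

  register(agentType: string, factory: () => LoadedPlugin): void {
    this.factories.set(agentType, factory);
  }

  create(agentType: string): LoadedPlugin {
    const factory = this.factories.get(agentType);
    if (factory === undefined) {
      const known = Array.from(this.factories.keys()).join(", ");
      throw new Error(`No plugin registered for "${agentType}" (known: ${known})`);
    }
    return factory();
  }
}

const registry = new PluginRegistry();
registry.register("openclaw", () => ({ name: "openclaw" }));
registry.register("hermes", () => ({ name: "hermes" }));
```

Under this design, adding a third agent amounts to one more register call, which is the property that keeps the core runtime unchanged.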

Failure Modes and Attack Surface

NemoClaw reduces but does not eliminate agent risk. Known failure modes:

Prompt injection: The inference router does not inspect prompt content, so an agent can still be tricked into leaking data through allowed API calls
Resource exhaustion: An agent can consume all allocated CPU or memory, requiring manual intervention
State corruption: If an agent writes malformed data to /workspace/state, the next restart may fail
Blueprint misconfiguration: Overly permissive network rules or model allowlists weaken isolation
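
The state corruption case can be softened on the plugin side by loading state defensively. The sketch below assumes state is stored as JSON text, which this article does not confirm, and falls back to a known-good value instead of failing the restart. The CounterState shape is purely an example.

```typescript
// Defensive state loading. Assumption: state is serialized as JSON.
type LoadResult<T> =
  | { status: "ok"; state: T }
  | { status: "recovered"; state: T; error: string };

function loadStateOrFallback<T>(
  raw: string,
  isValid: (value: unknown) => value is T,
  fallback: T,
): LoadResult<T> {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (isValid(parsed)) {
      return { status: "ok", state: parsed };
    }
    return { status: "recovered", state: fallback, error: "state did not match the expected shape" };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { status: "recovered", state: fallback, error: `state is not valid JSON: ${message}` };
  }
}

// Example state shape and validator, invented for this sketch.
interface CounterState {
  completedTasks: number;
}

function isCounterState(value: unknown): value is CounterState {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as Record<string, unknown>)["completedTasks"] === "number"
  );
}
```

Returning the error alongside the fallback lets the caller log what was lost rather than silently resetting the agent.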

The remaining attack surface includes:

The OpenShell sandbox itself (kernel exploits, container escapes)
The inference router (bugs in request validation or backend selection)
Host-side state directory (if an attacker gains write access, they can inject malicious state)

NemoClaw assumes you trust the agent code enough to run it in a sandbox. It does not provide defense-in-depth against a fully adversarial agent.

Deployment Shape

NemoClaw runs on a single host. It is not a distributed system. Typical deployment:

One NemoClaw instance per physical or virtual machine
One agent per NemoClaw instance (running several agents concurrently in one instance is experimental)
Inference backend can be local (vLLM, Ollama) or remote (NVIDIA NIM, OpenAI)
State and logs stored on local disk or network-attached storage

For multi-host deployments, you run multiple NemoClaw instances and coordinate them with an external orchestrator (Kubernetes, Nomad, or a custom control plane). NemoClaw does not provide clustering or leader election.

Observability Hooks

NemoClaw logs to stdout and writes structured logs to ~/.nemoclaw/logs/. Each log entry includes:

Timestamp
Agent name
Event type (start, stop, inference request, error)
Request ID (for tracing inference calls)

You can forward logs to an external system (Loki, Elasticsearch, CloudWatch) by tailing the log files or configuring a sidecar.
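
If you tail the files yourself, a small parser makes the request ID useful for tracing. The sketch below assumes each log line is a single JSON object with the keys shown. The article lists the fields but not the on-disk format, so treat the key names and event strings as placeholders.

```typescript
// Assumption: one JSON object per line. Key names and event strings are illustrative.
type EventType = "start" | "stop" | "inference_request" | "error";

interface LogEntry {
  timestamp: string;
  agent: string;
  event: EventType;
  requestId?: string; // expected on inference events
}

const EVENT_TYPES: EventType[] = ["start", "stop", "inference_request", "error"];

// Parses one line, returning null for anything that is not a well-formed entry.
function parseLogLine(line: string): LogEntry | null {
  let raw: unknown;
  try {
    raw = JSON.parse(line);
  } catch {
    return null;
  }
  if (typeof raw !== "object" || raw === null) return null;
  const rec = raw as Record<string, unknown>;
  const timestamp = rec["timestamp"];
  const agent = rec["agent"];
  const event = rec["event"];
  const requestId = rec["requestId"];
  if (typeof timestamp !== "string" || typeof agent !== "string") return null;
  const eventType = EVENT_TYPES.find((e) => e === event);
  if (eventType === undefined) return null;
  if (requestId === undefined) return { timestamp, agent, event: eventType };
  if (typeof requestId !== "string") return null;
  return { timestamp, agent, event: eventType, requestId };
}

// Collects every entry that belongs to one inference call.
function entriesForRequest(lines: string[], requestId: string): LogEntry[] {
  const matches: LogEntry[] = [];
  for (const line of lines) {
    const entry = parseLogLine(line);
    if (entry !== null && entry.requestId === requestId) {
      matches.push(entry);
    }
  }
  return matches;
}
```

Skipping malformed lines instead of throwing keeps a log shipper running when a line is only partially written.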

The CLI also exposes a nemoclaw status command that shows:

Agent uptime
Last inference request timestamp
Resource usage (CPU, memory, GPU)
Health check status

There is no built-in metrics endpoint, but you can scrape resource usage from the OpenShell runtime or instrument the plugin layer.
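
One way to instrument the plugin layer is to render a status snapshot in the Prometheus text exposition format and serve or write it wherever your scraper expects. The sketch below covers only the rendering. The snapshot fields and the example_agent_* metric names are invented for illustration, and how you obtain the values (from the OpenShell runtime or elsewhere) is left open.

```typescript
// Renders a status snapshot in the Prometheus text exposition format.
// Snapshot fields and metric names are assumptions made for this sketch.
interface StatusSnapshot {
  agent: string;
  uptimeSeconds: number;
  cpuPercent: number;
  memoryBytes: number;
  healthy: boolean;
}

// Label values must escape backslash, double quote, and newline.
function escapeLabelValue(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
}

function toPrometheusText(snapshot: StatusSnapshot): string {
  const labels = `{agent="${escapeLabelValue(snapshot.agent)}"}`;
  const gauges: [string, number][] = [
    ["example_agent_uptime_seconds", snapshot.uptimeSeconds],
    ["example_agent_cpu_percent", snapshot.cpuPercent],
    ["example_agent_memory_bytes", snapshot.memoryBytes],
    ["example_agent_healthy", snapshot.healthy ? 1 : 0],
  ];
  const lines: string[] = [];
  for (const [name, value] of gauges) {
    lines.push(`# TYPE ${name} gauge`);
    lines.push(`${name}${labels} ${value}`);
  }
  return lines.join("\n") + "\n";
}
```

All four values are exposed as gauges because each can go down as well as up. A counter would only suit a value that never decreases, such as total inference requests.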

When to Use NemoClaw vs. OpenShell Alone

Scenario	Use NemoClaw	Use OpenShell Alone
Running a supported agent (OpenClaw, Hermes)	Yes	No
Need managed inference routing	Yes	No
Want blueprint-based lifecycle management	Yes	No
Running a custom agent with unique setup	Maybe (write a plugin)	Yes
Need multi-host orchestration	No (use Kubernetes + OpenShell)	Yes
Want maximum control over sandbox config	No	Yes

NemoClaw is a convenience layer. If you need fine-grained control or are running an unsupported agent, use OpenShell directly and build your own lifecycle tooling.

Technical Verdict

Use NemoClaw when:

You are deploying OpenClaw or Hermes in production and need a reference stack for isolation and lifecycle management
You want to enforce network policy and inference routing without modifying agent code
You need state persistence and blueprint-based upgrades
You are running agents on a single host or a small number of hosts

Avoid NemoClaw when:

You need built-in multi-host orchestration (NemoClaw provides none; use Kubernetes with OpenShell containers)
You are running a custom agent that does not fit the plugin model
You need defense-in-depth against adversarial agents (NemoClaw assumes some trust)
You want to minimize dependencies (OpenShell alone is lighter)

NemoClaw is early-stage infrastructure. The plugin API and blueprint schema will likely change. Treat it as a reference implementation, not a stable platform.