NVIDIA released NemoClaw as a reference stack for running persistent AI agents inside hardened sandboxes. The project addresses the gap between agent demos and production deployments by wrapping OpenClaw and Hermes agents in NVIDIA OpenShell containers with managed inference routing, network policy enforcement, and blueprint-based lifecycle control.
NemoClaw is not an agent framework. It is infrastructure plumbing that sits between your host system and the agent runtime, providing isolation, state persistence, and a single CLI for onboarding, starting, stopping, and upgrading agents without rewriting code.
What Problem NemoClaw Solves
Most agent frameworks assume you run them in a trusted environment. They call APIs, execute shell commands, read files, and modify state with the same privileges as the user who launched them. This works for demos but breaks down when you want to leave an agent running overnight, hand it credentials, or deploy it in a regulated environment.
NemoClaw provides:
- Sandbox isolation via OpenShell containers with restricted syscalls and filesystem access
- Routed inference so agents cannot bypass your model gateway or call arbitrary endpoints
- Network policy to limit egress and prevent data exfiltration
- Blueprint lifecycle to manage agent state, configuration, and upgrades without manual container wrangling
- Host-side state persistence so agents survive restarts without losing context
The stack supports OpenClaw (default) and Hermes agents out of the box. You switch between them by setting an environment variable before install or using the nemohermes alias.
Architecture: Blueprints, Plugins, and Routing Layers
NemoClaw uses a three-layer architecture:
- Host-side CLI manages blueprints, provisions sandboxes, and persists state outside the container
- OpenShell sandbox runs the agent with restricted capabilities and enforced network policy
- Plugin layer inside the sandbox handles agent-specific setup, inference routing, and tool execution
Blueprint Lifecycle
A blueprint is a declarative specification for an agent deployment. It defines:
- Which agent binary to run (OpenClaw or Hermes)
- Inference endpoint configuration (local model, API gateway, or routed proxy)
- Network policy rules (allowed egress domains, blocked ports)
- Filesystem mounts (read-only config, writable state directory)
- Resource limits (CPU, memory, GPU allocation)
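To make the five fields above concrete, here is a minimal TypeScript sketch of a blueprint as a typed object with a small validator. Every field name, the validateBlueprint helper, the /workspace/config path, and the example model name are assumptions made for illustration; the actual blueprint schema may look quite different.

```typescript
// Hypothetical blueprint shape; field names are illustrative, not the real schema.
type AgentType = "openclaw" | "hermes";

interface Blueprint {
  agent: { type: AgentType };
  inference: { endpoint: string; allowedModels: string[] };
  network: { allowedDomains: string[]; blockedPorts: number[] };
  mounts: { path: string; readOnly: boolean }[];
  resources: { cpus: number; memoryMb: number; gpus: number };
}

// Returns human-readable problems; an empty list means the blueprint passed.
function validateBlueprint(bp: Blueprint): string[] {
  const errors: string[] = [];
  if (bp.inference.allowedModels.length === 0) {
    errors.push("inference.allowedModels must list at least one model");
  }
  for (const port of bp.network.blockedPorts) {
    if (!Number.isInteger(port) || port < 1 || port > 65535) {
      errors.push(`network.blockedPorts contains invalid port ${port}`);
    }
  }
  if (!bp.mounts.some((m) => !m.readOnly)) {
    errors.push("mounts must include a writable state directory");
  }
  if (bp.resources.cpus <= 0 || bp.resources.memoryMb <= 0 || bp.resources.gpus < 0) {
    errors.push("resources must be positive (gpus may be zero)");
  }
  return errors;
}

// Example values are made up for the sketch.
const exampleBlueprint: Blueprint = {
  agent: { type: "openclaw" },
  inference: { endpoint: "http://localhost:8000/v1", allowedModels: ["nim/example-model"] },
  network: { allowedDomains: ["github.com"], blockedPorts: [25, 3389] },
  mounts: [
    { path: "/workspace/config", readOnly: true },
    { path: "/workspace/state", readOnly: false },
  ],
  resources: { cpus: 2, memoryMb: 4096, gpus: 0 },
};
```

Validating before provisioning keeps a malformed blueprint from ever producing a half-configured sandbox.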
When you run nemoclaw start, the CLI:
- Reads the blueprint from ~/.nemoclaw/blueprints/<agent-name>.yaml
- Provisions an OpenShell sandbox with the specified constraints
- Mounts host-side state into /workspace/state inside the container
- Injects environment variables for inference routing and API keys
- Starts the agent process and attaches logging
When you run nemoclaw stop, the CLI:
- Sends SIGTERM to the agent process
- Waits for graceful shutdown (default 30 seconds)
- Persists any in-memory state to the host-side state directory
- Tears down the sandbox but keeps the blueprint and logs
This differs from standard container orchestration because the blueprint is versioned and the CLI handles upgrades by diffing the old and new blueprint, migrating state, and restarting the sandbox with minimal downtime.
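The upgrade path depends on comparing two blueprint versions. The sketch below shows one way such a diff could work, assuming blueprints are plain JSON-like objects. The diffBlueprint function and its dotted-path output are hypothetical and are not taken from the NemoClaw source.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

function isObject(v: Json): v is { [key: string]: Json } {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Returns the dotted paths whose values differ between the old and new blueprint.
function diffBlueprint(oldBp: Json, newBp: Json, prefix = ""): string[] {
  if (isObject(oldBp) && isObject(newBp)) {
    const keys = Array.from(new Set([...Object.keys(oldBp), ...Object.keys(newBp)])).sort();
    const changed: string[] = [];
    for (const key of keys) {
      const path = prefix ? `${prefix}.${key}` : key;
      if (!(key in oldBp) || !(key in newBp)) {
        changed.push(path); // key was added or removed
      } else {
        changed.push(...diffBlueprint(oldBp[key], newBp[key], path));
      }
    }
    return changed;
  }
  // Leaves and arrays are compared by their JSON serialization.
  return JSON.stringify(oldBp) === JSON.stringify(newBp) ? [] : [prefix];
}
```

A CLI could use the returned paths to decide whether a change needs a state migration (for example, a changed agent type) or only a sandbox restart (for example, a changed resource limit).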
Routed Inference
Routed inference means the agent does not call model APIs directly. Instead, it sends requests to a local proxy running inside the sandbox. The proxy:
- Validates the request against a schema (model name, max tokens, temperature)
- Checks the blueprint’s allowed model list
- Routes the request to the configured backend (NVIDIA NIM, OpenAI, local vLLM)
- Logs the request and response for observability
- Returns the result to the agent
This gives you a single point of control for model access. You can swap backends, enforce rate limits, or inject prompt guards without changing agent code.
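```typescript
// Inside the NemoClaw plugin (simplified)
// The interfaces below are placeholders so the sketch type-checks on its own.
interface InferenceRequest { model: string }
interface InferenceResponse { text: string }
interface InferenceBackend { complete(request: InferenceRequest): Promise<InferenceResponse> }
interface BlueprintConfig { allowedModels: string[] }

export class InferenceRouter {
  constructor(
    private config: BlueprintConfig,
    private nimBackend: InferenceBackend,
    private defaultBackend: InferenceBackend,
  ) {}

  async route(request: InferenceRequest): Promise<InferenceResponse> {
    if (!this.config.allowedModels.includes(request.model)) {
      throw new Error(`Model ${request.model} not in blueprint allowlist`);
    }
    const backend = this.selectBackend(request.model);
    const response = await backend.complete(request);
    await this.logRequest(request, response);
    return response;
  }

  private selectBackend(model: string): InferenceBackend {
    // Route based on model name prefix or explicit mapping
    if (model.startsWith("nim/")) {
      return this.nimBackend;
    }
    return this.defaultBackend;
  }

  private async logRequest(req: InferenceRequest, res: InferenceResponse): Promise<void> {
    // Placeholder: a real router would emit a structured log entry here
    console.log(JSON.stringify({ model: req.model, responseChars: res.text.length }));
  }
}
```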
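The router alone does not prevent bypass; the protection comes from pairing it with the network policy. The router runs inside the sandbox, and egress rules block destinations the blueprint does not allow, so even a compromised agent cannot reach arbitrary external APIs directly.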
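The class above covers the allowlist check and backend selection. The schema-validation step from the list could be sketched as a pure function like the one below. The field names, the MAX_TOKENS_CAP value, and the temperature range of 0 to 2 are assumptions chosen for illustration, not documented NemoClaw limits.

```typescript
// Hypothetical request shape covering the three fields named in the text.
interface ChatRequest {
  model: string;
  maxTokens: number;
  temperature: number;
}

const MAX_TOKENS_CAP = 8192; // assumed upper bound for this sketch

// Returns a list of problems; an empty list means the request may be routed.
function validateRequest(req: ChatRequest, allowedModels: string[]): string[] {
  const problems: string[] = [];
  if (req.model.trim() === "") {
    problems.push("model must be a non-empty string");
  } else if (!allowedModels.includes(req.model)) {
    problems.push(`model ${req.model} is not in the allowlist`);
  }
  if (!Number.isInteger(req.maxTokens) || req.maxTokens < 1 || req.maxTokens > MAX_TOKENS_CAP) {
    problems.push(`maxTokens must be an integer between 1 and ${MAX_TOKENS_CAP}`);
  }
  if (!Number.isFinite(req.temperature) || req.temperature < 0 || req.temperature > 2) {
    problems.push("temperature must be between 0 and 2");
  }
  return problems;
}
```

Returning every problem at once, instead of throwing on the first, makes the rejection easier to log and debug.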
Network Policy Enforcement
NemoClaw uses OpenShell’s network namespace isolation to enforce egress rules. The blueprint specifies:
- Allowed domains (e.g., api.openai.com, github.com)
- Blocked ports (e.g., no outbound SMTP on port 25)
- DNS resolution policy (use host resolver or sandbox-local DNS)
The OpenShell runtime translates these rules into iptables or nftables rules inside the sandbox. The agent sees a normal network stack but cannot reach disallowed destinations.
Example blueprint snippet:
```yaml
network:
  egress:
    allowedDomains:
      - api.openai.com
      - github.com
    blockedPorts:
      - 25    # SMTP
      - 3389  # RDP
  dns:
    resolver: host
```
This does not prevent all exfiltration (an agent could encode data in allowed HTTP requests), but it raises the bar and makes accidental leaks harder.
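The decision those egress rules encode can be modeled in a few lines. The sketch below is only a model of the policy logic, not of how OpenShell enforces it (the text above says enforcement happens through iptables or nftables). The isEgressAllowed helper is hypothetical, and allowing subdomains of a listed domain is an assumption of this sketch.

```typescript
interface EgressPolicy {
  allowedDomains: string[];
  blockedPorts: number[];
}

// Allow a connection only if the host matches the allowlist and the port is not blocked.
function isEgressAllowed(policy: EgressPolicy, host: string, port: number): boolean {
  // Normalize case and drop a trailing dot from fully qualified names.
  const h = host.toLowerCase().replace(/\.$/, "");
  const domainOk = policy.allowedDomains.some((d) => {
    const domain = d.toLowerCase();
    // Exact match, or a subdomain of an allowed domain (assumed behavior).
    return h === domain || h.endsWith(`.${domain}`);
  });
  return domainOk && !policy.blockedPorts.includes(port);
}
```

The leading dot in the suffix check matters: without it, a lookalike host such as evil-github.com would match github.com.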
State Management and Persistence
NemoClaw persists agent state in three places:
| State Type | Location | Lifecycle |
|---|---|---|
| Blueprint config | ~/.nemoclaw/blueprints/ | Survives sandbox restarts, manually versioned |
| Agent working state | ~/.nemoclaw/state/<agent-name>/ | Mounted into sandbox, survives restarts |
| Logs and telemetry | ~/.nemoclaw/logs/<agent-name>/ | Rotated daily, retained for 7 days by default |
The agent writes to /workspace/state inside the sandbox. This directory is bind-mounted from the host, so writes persist even if the sandbox crashes. The CLI handles state migration during blueprint upgrades by:
- Stopping the old sandbox
- Running a migration script defined in the new blueprint
- Starting the new sandbox with the migrated state
This is similar to database migrations but for agent memory and context.
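To illustrate the database-migration analogy, here is a minimal sketch of a versioned migration chain applied to agent state. The AgentState and Migration shapes, the migrateState function, and the example migrations are all hypothetical; the source does not describe the real migration script format.

```typescript
// Hypothetical persisted state: a schema version plus arbitrary data.
interface AgentState {
  version: number;
  data: Record<string, unknown>;
}

// A migration upgrades state from version `from` to `from + 1`.
interface Migration {
  from: number;
  apply(data: Record<string, unknown>): Record<string, unknown>;
}

// Applies migrations one version at a time until the target version is reached.
function migrateState(state: AgentState, migrations: Migration[], target: number): AgentState {
  let current = state;
  while (current.version < target) {
    const step = migrations.find((m) => m.from === current.version);
    if (!step) {
      throw new Error(`No migration from version ${current.version}`);
    }
    // Pass a shallow copy so a migration cannot mutate the previous state object.
    current = { version: current.version + 1, data: step.apply({ ...current.data }) };
  }
  return current;
}

const exampleMigrations: Migration[] = [
  // v1 -> v2: rename `notes` to `memory`
  { from: 1, apply: ({ notes, ...rest }) => ({ ...rest, memory: notes }) },
  // v2 -> v3: add an empty `tags` list
  { from: 2, apply: (data) => ({ ...data, tags: [] }) },
];
```

Failing loudly on a missing step is deliberate: starting a new sandbox against half-migrated state is the kind of corruption the stop, migrate, start sequence is meant to avoid.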
Multi-Agent Support Without Code Changes
NemoClaw supports OpenClaw and Hermes agents through a plugin system. Each plugin implements:
- setup(): Install dependencies, configure environment
- start(): Launch the agent process
- stop(): Graceful shutdown
- healthCheck(): Readiness probe
The CLI loads the plugin based on the NEMOCLAW_AGENT environment variable or the blueprint’s agent.type field. You can add a new agent by:
- Writing a plugin that implements the interface
- Adding the plugin to ~/.nemoclaw/plugins/
- Creating a blueprint that references the plugin
No changes to the CLI or core runtime are required. This abstraction lets infrastructure teams standardize on NemoClaw while letting agent developers iterate independently.
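A plugin contract with the four methods listed above, plus the lookup the CLI performs, might look like the following sketch. The AgentPlugin interface, the PluginRegistry class, the agent type strings, and the rule that the environment variable takes precedence over the blueprint field are assumptions; only the method names and the OpenClaw default come from the text.

```typescript
// Hypothetical plugin contract mirroring the four methods named in the text.
interface AgentPlugin {
  setup(): Promise<void>;
  start(): Promise<void>;
  stop(): Promise<void>;
  healthCheck(): Promise<boolean>;
}

class PluginRegistry {
  private plugins = new Map<string, AgentPlugin>();

  register(agentType: string, plugin: AgentPlugin): void {
    this.plugins.set(agentType, plugin);
  }

  // envAgent stands in for the NEMOCLAW_AGENT value, blueprintAgentType for agent.type.
  // Assumed precedence: environment variable, then blueprint, then the OpenClaw default.
  resolve(envAgent: string | undefined, blueprintAgentType: string | undefined): AgentPlugin {
    const agentType = envAgent ?? blueprintAgentType ?? "openclaw";
    const plugin = this.plugins.get(agentType);
    if (!plugin) {
      throw new Error(`No plugin registered for agent type "${agentType}"`);
    }
    return plugin;
  }
}
```

Because the registry depends only on the interface, adding a third agent type is a matter of calling register with a new implementation.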
Failure Modes and Attack Surface
NemoClaw reduces but does not eliminate agent risk. Known failure modes:
- Prompt injection: The inference router does not inspect prompt content, so an agent can still be tricked into leaking data through allowed API calls
- Resource exhaustion: An agent can consume all allocated CPU or memory, requiring manual intervention
- State corruption: If an agent writes malformed data to /workspace/state, the next restart may fail
- Blueprint misconfiguration: Overly permissive network rules or model allowlists weaken isolation
The remaining attack surface includes:
- The OpenShell sandbox itself (kernel exploits, container escapes)
- The inference router (bugs in request validation or backend selection)
- Host-side state directory (if an attacker gains write access, they can inject malicious state)
NemoClaw assumes you trust the agent code enough to run it in a sandbox. It does not provide defense-in-depth against a fully adversarial agent.
Deployment Shape
NemoClaw runs on a single host. It is not a distributed system. Typical deployment:
- One NemoClaw instance per physical or virtual machine
- One agent per NemoClaw instance (running several agents in one instance is experimental)
- Inference backend can be local (vLLM, Ollama) or remote (NVIDIA NIM, OpenAI)
- State and logs stored on local disk or network-attached storage
For multi-host deployments, you run multiple NemoClaw instances and coordinate them with an external orchestrator (Kubernetes, Nomad, or a custom control plane). NemoClaw does not provide clustering or leader election.
Observability Hooks
NemoClaw logs to stdout and writes structured logs to ~/.nemoclaw/logs/. Each log entry includes:
- Timestamp
- Agent name
- Event type (start, stop, inference request, error)
- Request ID (for tracing inference calls)
You can forward logs to an external system (Loki, Elasticsearch, CloudWatch) by tailing the log files or configuring a sidecar.
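If you tail the log files yourself, a small parser helps with filtering and tracing. The sketch below assumes one JSON object per line; the text only says the logs are structured, so the JSON Lines format, the field names, and the event spellings here are assumptions.

```typescript
type EventType = "start" | "stop" | "inference_request" | "error";
const EVENT_TYPES: EventType[] = ["start", "stop", "inference_request", "error"];

// Hypothetical entry shape built from the four fields listed above.
interface LogEntry {
  timestamp: string; // ISO 8601
  agent: string;
  event: EventType;
  requestId?: string; // present for inference calls
}

function formatLogLine(entry: LogEntry): string {
  return JSON.stringify(entry);
}

// Returns null for lines that are not valid entries instead of throwing.
function parseLogLine(line: string): LogEntry | null {
  try {
    const raw: unknown = JSON.parse(line);
    if (typeof raw !== "object" || raw === null) return null;
    const candidate = raw as Partial<LogEntry>;
    if (typeof candidate.timestamp !== "string" || typeof candidate.agent !== "string") return null;
    if (!EVENT_TYPES.includes(candidate.event as EventType)) return null;
    return candidate as LogEntry;
  } catch {
    return null;
  }
}

// Collects every entry that belongs to one inference call.
function traceRequest(lines: string[], requestId: string): LogEntry[] {
  return lines
    .map(parseLogLine)
    .filter((e): e is LogEntry => e !== null && e.requestId === requestId);
}
```

Skipping unparseable lines rather than failing keeps a forwarder running when a log file contains a partial write.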
The CLI also exposes a nemoclaw status command that shows:
- Agent uptime
- Last inference request timestamp
- Resource usage (CPU, memory, GPU)
- Health check status
There is no built-in metrics endpoint, but you can scrape resource usage from the OpenShell runtime or instrument the plugin layer.
When to Use NemoClaw vs. OpenShell Alone
| Scenario | Use NemoClaw | Use OpenShell Alone |
|---|---|---|
| Running a supported agent (OpenClaw, Hermes) | Yes | No |
| Need managed inference routing | Yes | No |
| Want blueprint-based lifecycle management | Yes | No |
| Running a custom agent with unique setup | Maybe (write a plugin) | Yes |
| Need multi-host orchestration | No (use Kubernetes + OpenShell) | Yes |
| Want maximum control over sandbox config | No | Yes |
NemoClaw is a convenience layer. If you need fine-grained control or are running an unsupported agent, use OpenShell directly and build your own lifecycle tooling.
Technical Verdict
Use NemoClaw when:
- You are deploying OpenClaw or Hermes in production and need a reference stack for isolation and lifecycle management
- You want to enforce network policy and inference routing without modifying agent code
- You need state persistence and blueprint-based upgrades
- You are running agents on a single host or a small number of hosts
Avoid NemoClaw when:
- You need multi-host orchestration (use Kubernetes with OpenShell containers)
- You are running a custom agent that does not fit the plugin model
- You need defense-in-depth against adversarial agents (NemoClaw assumes some trust)
- You want to minimize dependencies (OpenShell alone is lighter)
NemoClaw is early-stage infrastructure. The plugin API and blueprint schema will likely change. Treat it as a reference implementation, not a stable platform.