mech.app

Playwright CLI + SKILLS: Why Coding Agents Prefer CLI Over MCP for Browser Automation

Token-efficient CLI workflows vs. MCP for agent-driven browser automation. When to choose each, and how Playwright CLI exposes browser control.

Source: github.com

Microsoft recently released Playwright CLI with a SKILLS interface explicitly designed for coding agents like Claude Code and GitHub Copilot. The package sits alongside the company's existing Playwright MCP server and represents a clear architectural fork: two interfaces to the same browser automation engine, each optimized for a different constraint. The repository ranked #12 on GitHub Trending for TypeScript, signaling immediate developer interest in the approach.

The core trade-off is context window economics. According to the project documentation, “CLI invocations are more token-efficient: they avoid loading large tool schemas and verbose accessibility trees into the model context.” MCP, by contrast, maintains persistent browser state and rich page introspection. Choosing between them means trading token efficiency against state persistence.

The Token Budget Problem

Coding agents operate under tight context constraints. A typical agent workflow involves repository structure, open files, test suite context, build output, tool schemas, and reasoning traces.

When you add browser automation via MCP, you inject four more items into that context: full accessibility trees for the current page, tool definitions with parameter schemas, page state snapshots for self-healing logic, and session metadata. Playwright CLI sidesteps this by exposing browser control as stateless commands. The agent runs playwright-cli --help to discover the available commands, then issues discrete invocations:

playwright-cli open https://demo.playwright.dev/todomvc/
playwright-cli type "Buy groceries"
playwright-cli press Enter
playwright-cli screenshot

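Each command runs as its own short-lived process, and the agent carries no session handle, page snapshot, or accessibility tree in its context between invocations. Note that type and press in the sequence above only make sense if they act on the page that open loaded, so “stateless” is best read as a statement about the agent’s context rather than a guarantee that every command starts from a blank browser. The agent must still use stable selectors or chain commands that navigate to known states. This limits the CLI to workflows with predictable page structure but eliminates the need to track session state in the agent’s context window.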
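From the agent's side, issuing discrete invocations might look like the following minimal sketch. It assumes only what the text states (one process per command, success signalled by the exit code); the names run_steps, subprocess_runner, and TODO_STEPS are illustrative and not part of Playwright CLI. The runner is injectable so the logic can be exercised without the CLI installed.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple

# (exit code, stdout, stderr) for one CLI invocation.
Result = Tuple[int, str, str]
Runner = Callable[[Sequence[str]], Result]


def subprocess_runner(argv: Sequence[str]) -> Result:
    """Run one command as its own process and capture its output."""
    proc = subprocess.run(list(argv), capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr


def run_steps(
    steps: Sequence[Sequence[str]],
    runner: Runner = subprocess_runner,
    cli: str = "playwright-cli",
) -> List[Tuple[List[str], int]]:
    """Issue each step as a separate invocation and stop at the first failure."""
    results: List[Tuple[List[str], int]] = []
    for step in steps:
        exit_code, _stdout, _stderr = runner([cli, *step])
        results.append((list(step), exit_code))
        if exit_code != 0:
            break  # later steps assume this one succeeded
    return results


# The command sequence from the text, expressed as argument lists.
TODO_STEPS = [
    ["open", "https://demo.playwright.dev/todomvc/"],
    ["type", "Buy groceries"],
    ["press", "Enter"],
    ["screenshot"],
]
```

Calling run_steps(TODO_STEPS) would spawn four processes in order; the only thing the agent retains between them is the short list of exit codes.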

SKILLS vs. Tool Schemas and Discovery Mechanisms

Playwright CLI introduces a SKILLS installation model:

playwright-cli install --skills

This writes skill manifests to a local directory that coding agents can discover. Agents have three discovery paths:

1. Runtime introspection via --help

The agent executes playwright-cli --help and parses the output to learn available commands. Each command includes usage patterns and parameter descriptions. This is the zero-install path: agents can use Playwright CLI immediately without skill manifests.

2. Local skill manifests

The install --skills command writes structured skill descriptors that agents can read directly. The manifest format is not specified in the material reviewed here, but the intent is to provide a machine-readable alternative to parsing --help output.

3. Runtime tool registration

Some agent platforms allow CLI commands to be registered as callable tools within their execution environment. This allows the agent to invoke Playwright commands through its standard tool-calling interface rather than spawning subprocesses. The registration mechanism varies by agent platform (Claude Code uses a different protocol than GitHub Copilot), but the CLI surface remains consistent. This mode does not change the stateless execution model: each tool call still spawns an independent CLI process.
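The first and third discovery paths can be sketched together: parse help text into a command list, then wrap each command as a callable tool. This is a hedged illustration only. SAMPLE_HELP is an invented layout (the real --help output may be formatted differently), and parse_commands, register_tools, and the browser_ prefix are hypothetical names, not an API of Playwright CLI or of any agent platform.

```python
import re
from typing import Callable, Dict, Sequence, Tuple

Result = Tuple[int, str, str]
Runner = Callable[[Sequence[str]], Result]

# Invented sample; the real --help layout may differ.
SAMPLE_HELP = """\
Usage: playwright-cli <command> [options]

Commands:
  open <url>      open a page
  type <text>     type text
  press <key>     press a key
  screenshot      capture the page
"""


def parse_commands(help_text: str) -> Dict[str, str]:
    """Map command name -> description from an assumed 'Commands:' section."""
    commands: Dict[str, str] = {}
    in_section = False
    for line in help_text.splitlines():
        stripped = line.strip()
        if stripped.lower() == "commands:":
            in_section = True
            continue
        if not in_section:
            continue
        if not stripped:
            break  # a blank line ends the section in this assumed layout
        # Usage and description are assumed to be separated by 2+ spaces.
        parts = re.split(r"\s{2,}", stripped, maxsplit=1)
        name = parts[0].split()[0]
        commands[name] = parts[1] if len(parts) > 1 else ""
    return commands


def register_tools(
    commands: Dict[str, str], runner: Runner, cli: str = "playwright-cli"
) -> Dict[str, Callable[..., Result]]:
    """Expose each discovered command as a callable named browser_<command>."""
    def make_tool(command: str) -> Callable[..., Result]:
        def tool(*args: str) -> Result:
            # Each tool call is still one independent CLI invocation.
            return runner([cli, command, *args])
        return tool

    return {f"browser_{name}": make_tool(name) for name in commands}
```

The registry is a plain dictionary here; a real platform would substitute its own registration call, but the per-call execution model stays the same.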

Compare this to an MCP tool definition, which can carry a JSON Schema for every parameter, enum constraints for selector strategies, nested objects for viewport configuration, and return type schemas with accessibility metadata. Complex actions like page.evaluate() need their full parameter typing spelled out in that schema. CLI skills expose only the command surface, so the skill approach is more compact.
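To make the size difference concrete, the sketch below compares a schema-style tool definition with a one-line command description. Both samples are hypothetical: EXAMPLE_TOOL follows the general name/description/inputSchema shape of an MCP tool but is not copied from Playwright MCP, and EXAMPLE_CLI_LINE is an invented help line. The four-characters-per-token figure is a rough rule of thumb, not a real tokenizer.

```python
import json


def approx_tokens(text: str) -> int:
    """Crude estimate: roughly four characters per token."""
    return max(1, len(text) // 4)


# Hypothetical tool definition in the general shape MCP uses.
EXAMPLE_TOOL = {
    "name": "example_type_text",
    "description": "Type text into an element on the page.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "selector": {"type": "string", "description": "Target element."},
            "strategy": {"type": "string", "enum": ["css", "role", "text"]},
            "text": {"type": "string", "description": "Text to type."},
        },
        "required": ["selector", "text"],
    },
}

# Hypothetical one-line CLI description of a comparable action.
EXAMPLE_CLI_LINE = "type <text>  type text into the focused element"


def compare_cost(tool: dict, cli_line: str) -> dict:
    """Return the estimated token cost of each description."""
    return {
        "schema_tokens": approx_tokens(json.dumps(tool)),
        "cli_tokens": approx_tokens(cli_line),
    }
```

The absolute numbers mean little; the point is that the schema cost is paid per tool and repeats for every tool the server exposes.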

Architecture Comparison

| Dimension | Playwright CLI + SKILLS | Playwright MCP |
| --- | --- | --- |
| State model | Stateless per invocation | Persistent browser session |
| Context cost | Avoids loading schemas and accessibility trees | Full page tree in context |
| Discovery | --help, local manifests, runtime registration | MCP tool list with JSON schemas |
| Failure recovery | Agent retries from scratch | Self-healing via page introspection |
| Concurrency | Multiple CLI processes | Single MCP session per agent |
| Best for | High-throughput test generation, token-constrained agents | Exploratory automation, self-healing tests, stateful workflows |

When CLI Wins

Use Playwright CLI when your agent needs to generate test suites across dozens of pages, validate UI behavior in CI pipelines, take screenshots for visual regression, or operate within strict token budgets (Claude Code, Copilot).

The stateless model means you can parallelize browser actions across multiple CLI processes. Each invocation is independent. No session cleanup. No connection pooling.
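A minimal sketch of that parallelism, assuming invocations really are independent as described: each job is a single command handed to a thread pool, and each worker simply waits on its own subprocess. The function name run_parallel and the injectable runner are illustrative, not part of Playwright CLI.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple

Result = Tuple[int, str, str]
Runner = Callable[[Sequence[str]], Result]


def run_parallel(
    jobs: Sequence[Sequence[str]],
    runner: Runner,
    max_workers: int = 4,
    cli: str = "playwright-cli",
) -> List[Tuple[List[str], int]]:
    """Run independent single-command jobs concurrently, preserving input order."""
    def run_one(args: Sequence[str]) -> Tuple[List[str], int]:
        exit_code, _out, _err = runner([cli, *args])
        return list(args), exit_code

    # Threads are sufficient: the real work happens in the child processes.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, jobs))
```

In real use the runner would wrap subprocess.run; one failing job shows up as a non-zero exit code in its own result tuple and does not affect the others.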

Example workflow for testing a todo app:

# Agent issues these commands in sequence
playwright-cli open https://demo.playwright.dev/todomvc/
playwright-cli type "Buy groceries"
playwright-cli press Enter
playwright-cli type "Water flowers"
playwright-cli press Enter
playwright-cli screenshot

The agent parses command output (exit codes, stdout, stderr) to decide whether to retry, backtrack, or escalate. Exit code 0 signals success. Non-zero codes indicate failure. Stdout contains command results (screenshot paths, element text). Stderr contains error messages. The per-invocation execution model means failures are isolated. A bad selector in one command does not corrupt the session.
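The retry, backtrack, or escalate decision can be sketched as a small policy function over the exit code and stderr. The policy itself is an assumption made for illustration: the transient markers ("timeout", "timed out") and the attempt limit are hypothetical, and the CLI's actual error messages are not specified in the text.

```python
from enum import Enum
from typing import Sequence


class Action(Enum):
    CONTINUE = "continue"
    RETRY = "retry"
    BACKTRACK = "backtrack"
    ESCALATE = "escalate"


def decide(
    exit_code: int,
    stderr: str,
    attempt: int,
    max_attempts: int = 3,
    transient_markers: Sequence[str] = ("timeout", "timed out"),
) -> Action:
    """Pick the next action from one command's outcome."""
    if exit_code == 0:
        return Action.CONTINUE
    if attempt >= max_attempts:
        return Action.ESCALATE  # out of attempts: hand off to a human or planner
    message = stderr.lower()
    if any(marker in message for marker in transient_markers):
        return Action.RETRY  # same command again
    return Action.BACKTRACK  # e.g. re-navigate to a known state, then try another selector
```

Because the inputs are just an integer and a string, the whole decision costs a handful of tokens regardless of how large the page is.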

The --headed flag can be passed to the open command for visual debugging, though the CLI defaults to headless mode.

When MCP Wins

Use Playwright MCP when your agent needs to explore unfamiliar UIs with iterative refinement, implement self-healing selectors that adapt to DOM changes, maintain browser state across long reasoning loops, or introspect page structure for dynamic workflows.

MCP excels at workflows where the agent must reason about page structure. The persistent session means you can query the accessibility tree to find clickable elements, evaluate JavaScript in the page context, wait for network requests to complete, and maintain cookies and local storage across actions.

This is critical for autonomous agents that must adapt to changing UIs without human intervention.
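Querying an accessibility tree for clickable elements, as described above, amounts to a tree walk. The sketch below assumes a simplified tree of nested dictionaries with role, name, and children keys; that shape and the SAMPLE_TREE contents are hypothetical and do not reproduce the snapshot format Playwright MCP returns.

```python
from typing import Any, Dict, List, Sequence, Tuple

Node = Dict[str, Any]


def find_clickable(
    node: Node, roles: Sequence[str] = ("button", "link")
) -> List[Tuple[str, str]]:
    """Depth-first walk returning (role, name) for nodes with a clickable role."""
    found: List[Tuple[str, str]] = []
    if node.get("role") in roles:
        found.append((node["role"], node.get("name", "")))
    for child in node.get("children", []):
        found.extend(find_clickable(child, roles))
    return found


# Hypothetical, heavily simplified tree for a todo page.
SAMPLE_TREE: Node = {
    "role": "main",
    "children": [
        {"role": "textbox", "name": "What needs to be done?"},
        {"role": "list", "children": [
            {"role": "button", "name": "Delete"},
        ]},
        {"role": "link", "name": "All"},
    ],
}
```

The walk is cheap to run, but the tree it operates on is exactly the payload that occupies the model's context in the MCP approach.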

Deployment Shape

Playwright CLI is a global npm package:

npm install -g @playwright/cli@latest

Agents invoke it as a subprocess. There is no server process for the agent to start and no connection for it to manage. The CLI binary handles browser lifecycle internally, so the agent never launches or tears down a browser itself.

Playwright MCP runs as a persistent server. Agents connect via stdio or HTTP. The server maintains browser instances. Session state persists across tool calls. This requires explicit shutdown or timeout handling.
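The shutdown and timeout handling a persistent session needs can be sketched as a small idle guard. This is a generic pattern, not Playwright MCP's mechanism: IdleSessionGuard, its 300-second default, and the close callback are all hypothetical, and the clock is injectable so the behaviour can be checked without waiting.

```python
import time
from typing import Callable


class IdleSessionGuard:
    """Close a long-lived session after a period without tool calls."""

    def __init__(
        self,
        close: Callable[[], None],
        idle_timeout_s: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._close = close
        self._timeout = idle_timeout_s
        self._clock = clock
        self._last_used = clock()
        self.closed = False

    def touch(self) -> None:
        """Record activity; call this on every tool call."""
        self._last_used = self._clock()

    def reap_if_idle(self) -> bool:
        """Close the session if idle too long. Returns True if closed by this call."""
        if self.closed:
            return False
        if self._clock() - self._last_used >= self._timeout:
            self.shutdown()
            return True
        return False

    def shutdown(self) -> None:
        """Explicit shutdown; safe to call more than once."""
        if not self.closed:
            self._close()
            self.closed = True
```

None of this bookkeeping is needed on the CLI side, which is the operational difference the two deployment shapes come down to.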

For CI environments, CLI is simpler. No daemon to manage. Each test run spawns fresh browser instances and exits cleanly.

Technical Verdict

Use Playwright CLI if:

  • You are building coding agents that must balance browser automation with large codebases within limited context windows (Claude Code and GitHub Copilot explicitly support the SKILLS model)
  • Your workflows involve high-throughput test generation across multiple pages where independent invocations are sufficient
  • You need to parallelize browser actions without managing persistent sessions
  • Token efficiency is more important than rich page introspection

Avoid Playwright CLI if:

  • Your agent requires iterative reasoning over page structure with self-healing selectors
  • Workflows depend on maintaining browser state (cookies, local storage) across multiple actions
  • You need to introspect accessibility trees or evaluate JavaScript in page context
  • Your use case benefits from persistent MCP sessions despite the token cost

CLI is optimal for token-constrained agents operating in high-throughput scenarios. MCP is optimal for stateful workflows requiring continuous browser context and adaptive page introspection. The architectural choice depends on whether your agent’s bottleneck is context window size or workflow state complexity.
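The verdict can be restated as a small decision function. The Workload fields and the recommend function are hypothetical names chosen for this sketch, and the ordering of the checks (state and introspection needs take priority over token pressure) is one reasonable reading of the criteria above rather than a rule from the project.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    needs_page_introspection: bool = False    # accessibility tree, in-page JavaScript
    needs_state_across_actions: bool = False  # cookies, local storage, long loops
    token_constrained: bool = False
    high_throughput: bool = False


def recommend(workload: Workload) -> str:
    """Return 'mcp', 'cli', or 'either' for a described workload."""
    if workload.needs_page_introspection or workload.needs_state_across_actions:
        return "mcp"  # workflow state complexity is the bottleneck
    if workload.token_constrained or workload.high_throughput:
        return "cli"  # context window size is the bottleneck
    return "either"
```

For example, recommend(Workload(token_constrained=True, high_throughput=True)) returns "cli", while adding needs_state_across_actions=True flips the answer to "mcp".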