Team Runtimes for Coding Agents: Why tmux and Shared Filesystems Aren't Enough

Multi-agent coding systems need coordination primitives, process isolation, and access control that single-agent tools can't provide at scale.

Source: dev.to

When you move from one coding agent to three, the infrastructure breaks in ways that single-agent tools hide. The problem isn’t model capability. It’s concurrent file access, process isolation, permission boundaries, and observability when multiple agents edit the same repository simultaneously.

Basil Zakarov’s 16-minute technical breakdown on Dev.to addresses the gap between “AI pair programmer” and “AI team.” The shift exposes orchestration requirements that tmux sessions and shared directories can’t solve.

The Coordination Problem

Single-agent tools (Cursor, Copilot, Claude Code) assume one human, one agent, one workspace. Multi-agent systems need:

  • Concurrent file access control: preventing race conditions when two agents modify the same file
  • Process isolation: separating dev servers, test runners, and build processes across agents
  • Permission boundaries: defining what each agent can touch and who can observe or join sessions
  • Coordination primitives: locks, queues, or pub/sub for task handoffs and state synchronization

tmux gives you session persistence. It doesn’t give you file locking, process namespaces, or access control lists.

Maturity Levels for Team Runtimes

Zakarov outlines a progression from personal to team infrastructure:

| Level | Description | Key Limitation |
|---|---|---|
| 0A: Personal Local | Agents run on developer laptops in local terminals | No session durability, no team visibility |
| 0B: Personal Server | Remote dev servers with tmux/zellij, tools like Agent Deck | No permission separation, no cross-developer coordination |
| 1: Small Team | OS-level permission separation, centralized management across servers | Manual setup, no standardized execution environment |
| 2: Managed Team | Infrastructure-as-code, standardized environments, observability | Requires dedicated ops investment |

The jump from 0B to 1 is where most teams get stuck. You need more than session multiplexing. You need identity, access control, and resource isolation.

Architecture: SSH + tmux + sudo as a Runtime

Zakarov’s blueprint uses standard Unix tools to build a minimal team runtime:

Execution environment:

  • Dedicated dev servers (not production boxes)
  • One tmux session per agent, named by project and developer
  • OS-level users for permission boundaries (one user per developer or per agent role)

Access control:

  • SSH keys for authentication
  • sudo rules for session attachment (seniors can join junior sessions)
  • File permissions to isolate workspaces
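
A sudo rule for session attachment could look roughly like the sketch below. The group name `seniors`, the account `junior1`, and the tmux path are assumptions for illustration; the source does not show its actual sudoers entries.

```shell
#!/bin/bash
# Sketch: print a sudoers drop-in line that lets members of a senior group
# attach to tmux sessions owned by one developer account.
# Assumptions (hypothetical): group and user names, tmux at /usr/bin/tmux.

attach_rule() {
  local senior_group="$1" owner="$2" tmux_bin="${3:-/usr/bin/tmux}"
  # Format: %group host=(run-as user) NOPASSWD: command
  printf '%%%s ALL=(%s) NOPASSWD: %s attach-session -t *\n' \
    "$senior_group" "$owner" "$tmux_bin"
}

attach_rule seniors junior1
```

The printed line would go into a file under /etc/sudoers.d/ and should be checked with `visudo -cf <file>` before use. A senior would then attach with `sudo -u junior1 tmux attach-session -t <session>`. Note that a sudoers wildcard matches any trailing arguments, so this rule is deliberately permissive and may need tightening.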

Observability:

  • tmux list-sessions for runtime inventory
  • Shared logging directory with per-agent output streams
  • Optional: centralized log aggregation (syslog, Loki)

Coordination:

  • File-based locks for critical sections (crude but functional)
  • Shared state directory with atomic writes (write to a temporary file, then rename it into place on the same filesystem)
  • Message queues for task handoffs (Redis, RabbitMQ, or filesystem-based)
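
The file-based lock and the atomic write can be sketched in a few lines of shell. This is a minimal illustration, not the source's implementation: the `STATE_DIR` variable, the function names, and the retry count are all assumptions.

```shell
#!/bin/bash
# Sketch: a directory-based lock plus an atomic state write.
# Assumption (hypothetical): shared state lives under $STATE_DIR.

STATE_DIR="${STATE_DIR:-$(mktemp -d)}"

# mkdir either creates the directory or fails, so only one caller wins.
acquire_lock() {
  local lock="$STATE_DIR/$1.lock" tries="${2:-50}"
  while ! mkdir "$lock" 2>/dev/null; do
    tries=$((tries - 1))
    [ "$tries" -le 0 ] && return 1
    sleep 0.1
  done
}

release_lock() {
  rmdir "$STATE_DIR/$1.lock"
}

# Write to a temp file in the same directory, then rename it into place.
# Readers see either the old file or the new one, never a partial write.
write_state() {
  local key="$1" value="$2" tmp
  tmp="$(mktemp "$STATE_DIR/.$key.XXXXXX")" || return 1
  printf '%s\n' "$value" > "$tmp"
  mv -f "$tmp" "$STATE_DIR/$key"
}

if acquire_lock build; then
  write_state build-owner "agent-a"
  release_lock build
fi
```

The crude part is crash handling: an agent that dies while holding the lock leaves the lock directory behind, so a real setup would likely record the holder's PID or expire locks after a timeout.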

This isn’t elegant. It’s operational. You get session durability, permission separation, and team visibility without building a custom orchestration layer.

Code Snippet: Session Launch with Isolation

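#!/bin/bash
# launch-agent.sh: start an isolated agent session.
# Run as root (or via sudo): it changes ownership and switches users.
set -euo pipefail

if [ "$#" -ne 3 ]; then
  echo "usage: $0 AGENT_NAME PROJECT DEVELOPER" >&2
  exit 1
fi

AGENT_NAME="$1"
PROJECT="$2"
DEVELOPER="$3"
SESSION="${PROJECT}-${AGENT_NAME}"

# Reject names that are unsafe in paths, shell commands, or tmux targets
# (tmux treats "." and ":" as separators in session targets)
for name in "$AGENT_NAME" "$PROJECT" "$DEVELOPER"; do
  [[ "$name" =~ ^[A-Za-z0-9_-]+$ ]] || { echo "invalid name: $name" >&2; exit 1; }
done

# Create isolated workspace
WORKSPACE="/var/agents/${DEVELOPER}/${PROJECT}"
mkdir -p "$WORKSPACE"
chown "${DEVELOPER}:agents" "$WORKSPACE"
chmod 750 "$WORKSPACE"

# Launch tmux session as developer user
sudo -u "$DEVELOPER" tmux new-session -d -s "$SESSION" \
  -c "$WORKSPACE" \
  "agent-runner --project '$PROJECT' --workspace '$WORKSPACE'"

# Log session metadata
mkdir -p /var/log/agents
echo "$(date -Iseconds) ${DEVELOPER} ${PROJECT} ${AGENT_NAME}" \
  >> /var/log/agents/sessions.log

# Attachment is governed by sudo rules rather than a tmux option: a senior
# engineer allowed to run tmux as $DEVELOPER can join the session with
echo "attach with: sudo -u ${DEVELOPER} tmux attach-session -t ${SESSION}"

This script:

  • Validates its arguments and rejects names that are unsafe in paths or tmux session targets
  • Creates a per-developer, per-project workspace with restricted permissions
  • Launches the agent as the developer’s OS user (not root)
  • Logs session creation for audit trails
  • Prints the command a senior engineer uses to attach for mentoring or debugging, which works only if a matching sudo rule exists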

Failure Modes

Concurrent file edits: Two agents modify the same file. Without file-level locking or version control integration, last-write-wins. You lose work.
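
One form of version control integration is to give each agent its own git worktree, so concurrent edits land on separate branches and conflicts surface at merge time instead of being overwritten. The sketch below is an illustration only; the function name, the `agent/<name>` branch scheme, and the directory layout are assumptions not taken from the source.

```shell
#!/bin/bash
# Sketch: one git worktree and branch per agent.
# Assumptions (hypothetical): repo path, worktree root, and branch naming.

add_agent_worktree() {
  local repo="$1" agent="$2" root="$3"
  mkdir -p "$root"
  # -b creates branch agent/<name> starting from the repo's current HEAD
  # and checks it out in a separate directory.
  git -C "$repo" worktree add -b "agent/$agent" "$root/$agent"
}

# Example (paths are placeholders):
#   add_agent_worktree /var/agents/alice/webapp agent-a /var/agents/alice/worktrees
```

Each agent then edits only its own directory, and the usual merge or rebase step decides how overlapping changes are combined.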

Process namespace collisions: Agent A starts a dev server on port 3000. Agent B tries the same. Agent B’s server either fails to bind or, depending on the tool, quietly picks another port or interface, so requests can reach the wrong agent’s instance.
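
A cheap mitigation is to derive each agent's port from its session name rather than letting every agent default to 3000. The sketch below assumes a 3000-3999 range and a `PORT` environment variable that the dev server reads; both are illustrative choices, not part of the source.

```shell
#!/bin/bash
# Sketch: derive a stable per-agent port from the session name.
# Assumptions (hypothetical): port range 3000-3999 and the PORT variable.

agent_port() {
  local name="$1" base="${2:-3000}" span="${3:-1000}" sum
  # cksum prints "<crc> <bytes>"; keep the CRC and fold it into the range.
  sum="$(printf '%s' "$name" | cksum | cut -d ' ' -f 1)"
  echo $(( base + sum % span ))
}

PORT="$(agent_port "webapp-agent-a")"
export PORT
```

Two names can still hash to the same port, so this reduces collisions without ruling them out. A guaranteed allocation would need a small registry of assigned ports kept under a lock.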

Permission escalation: An agent running as a junior developer’s user tries to modify shared configuration. Without explicit ACLs, the outcome depends on whatever group permissions happen to be set, so the write is blocked or allowed by accident rather than by policy.

Session orphaning: Developer disconnects. tmux session persists. Agent keeps running, consuming resources and potentially modifying code without supervision.
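
Orphaned sessions can be flagged by checking for sessions with no attached client and no recent activity. The sketch below assumes a one-hour idle threshold and a hypothetical `find_orphans` helper; it reads session data from standard input so the filtering logic can be tried without a running tmux server.

```shell
#!/bin/bash
# Sketch: flag tmux sessions with no attached client and no recent activity.
# Assumption (hypothetical): a one-hour idle threshold.

# Reads lines of "<name> <attached-clients> <last-activity-epoch>" on stdin
# and prints the names of sessions that look orphaned.
find_orphans() {
  local max_idle="$1" now="$2" name attached activity
  while read -r name attached activity; do
    if [ "$attached" -eq 0 ] && [ $(( now - activity )) -gt "$max_idle" ]; then
      echo "$name"
    fi
  done
}

# With a running tmux server, feed it real data:
#   tmux list-sessions -F '#{session_name} #{session_attached} #{session_activity}' \
#     | find_orphans 3600 "$(date +%s)"
```

The output is a list of candidates for review, not an instruction to kill them: an agent may be idle because it is waiting on a long build. Session names containing spaces would need a different field separator.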

Observability gaps: Three agents debug the same stack trace. Without structured logging and session tagging, you can’t reconstruct which agent did what.
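
Session tagging can be as simple as writing one tab-separated line per agent action, with the developer, project, and agent name in fixed columns. The field order, the `log_event` helper, and the `AGENT_LOG` path below are assumptions for illustration.

```shell
#!/bin/bash
# Sketch: append one tagged, tab-separated line per agent action.
# Assumptions (hypothetical): the field order and the AGENT_LOG path
# (defaults to a temp file here so the sketch runs without root).

AGENT_LOG="${AGENT_LOG:-$(mktemp)}"

log_event() {
  local developer="$1" project="$2" agent="$3" action="$4"
  # Tabs and newlines in the action would break the record; replace them.
  action="${action//[$'\t\n']/ }"
  printf '%s\t%s\t%s\t%s\t%s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$developer" "$project" "$agent" "$action" \
    >> "$AGENT_LOG"
}

log_event alice webapp agent-a "edited src/server.js"
log_event alice webapp agent-b "ran npm test"

# Reconstruct what one agent did:
awk -F '\t' '$4 == "agent-a" { print $1, $5 }' "$AGENT_LOG"
```

With fixed columns, "which agent did what" becomes a one-line filter, and the same file can be shipped to a log aggregator later without changing the agents.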

Scaling Beyond SSH and tmux

When the team grows past 5-10 developers or 20-30 concurrent agents, the Unix-tools approach hits limits:

  • Dynamic resource allocation: Kubernetes or Nomad for container-based agent isolation
  • Centralized state management: etcd or Consul for coordination primitives
  • Structured observability: OpenTelemetry for distributed tracing across agent actions
  • Policy enforcement: OPA (Open Policy Agent) for fine-grained access control

At this scale, you’re building a platform. The primitives stay the same (isolation, coordination, observability), but the implementation shifts from shell scripts to declarative infrastructure.

Technical Verdict

Use this approach when:

  • You have 2-10 developers running multiple coding agents concurrently
  • You need session durability and team visibility without custom tooling
  • You can tolerate file-based coordination and manual session management
  • Your team already operates dev servers and understands SSH/tmux workflows

Avoid this approach when:

  • You need sub-second coordination or real-time conflict resolution
  • Agents must run in ephemeral, containerized environments
  • Compliance requires audit trails beyond filesystem logs
  • You’re scaling past 20 concurrent agents or need dynamic resource allocation

The gap between single-agent tools and multi-agent infrastructure is real. SSH, tmux, and sudo fill it for small teams. Beyond that, you need orchestration primitives that treat agents as first-class distributed processes, not just persistent terminal sessions.
