City-State vs. Federation: Two Governance Models for Multi-Agent Coding Systems

Deterministic kernel governance versus DAG TOML verification for coordinating multiple coding agents without merge chaos or conflicting writes.

Source: dev.to

When you run multiple coding agents against the same repository, you need a governance model. Not a workflow engine or a task queue, but an actual control plane that decides who gets to write what, when, and how conflicts get resolved.

Two architectures have emerged independently: dgov’s deterministic kernel model and DAG TOML verification stacks. One is a city-state (centralized kernel with controlled delegation), the other is a federation (autonomous agents with verification checkpoints). Both solve the same problem. Neither is a subset of the other.

The Problem Space

Multi-agent coding systems fail in predictable ways:

  • Conflicting writes: Agent A refactors a module while Agent B adds a feature that depends on the old structure
  • Invisible dependencies: Agent C’s test suite assumes Agent D’s database migration hasn’t run yet
  • Audit gaps: No one can reconstruct why a change was made or which agent made it
  • Rollback chaos: Reverting one agent’s work breaks three others’ assumptions

Traditional git workflows assume human coordination. Agents don’t read Slack. They need machine-enforceable boundaries.

City-State Architecture: dgov

dgov implements a deterministic kernel that sits between agents and the repository. Every agent action goes through the kernel, which enforces a charter and maintains an append-only ledger.

Core Mechanisms

Git worktrees as isolation boundaries. Each agent gets its own worktree. The kernel controls merges back to main. Agents never touch each other’s workspace directly.
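As a rough illustration of the isolation step, a kernel could provision one worktree per agent with the standard `git worktree add` command. The function names, the `agent/<id>` branch naming scheme, and the directory layout below are assumptions made for this sketch, not dgov's actual conventions.

```python
import subprocess
from pathlib import Path


def worktree_add_command(agent_id: str, worktrees_dir: str, base: str = "main") -> list[str]:
    """Build the git command that creates an isolated worktree for one agent.

    The branch name agent/<id> and the directory layout are illustrative
    assumptions, not dgov's actual conventions.
    """
    path = Path(worktrees_dir) / agent_id
    branch = f"agent/{agent_id}"
    # git worktree add -b <new-branch> <path> <start-point>
    return ["git", "worktree", "add", "-b", branch, str(path), base]


def create_worktree(repo_path: str, agent_id: str, worktrees_dir: str) -> None:
    """Run the command inside the main repository; raises if git fails."""
    cmd = worktree_add_command(agent_id, worktrees_dir)
    subprocess.run(cmd, cwd=repo_path, check=True)
```

Because every agent works on its own branch in its own directory, nothing reaches main until the kernel decides to merge that branch.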

File claims as write locks. Before an agent modifies a file, it must claim it in the plan. The kernel rejects overlapping claims. If Agent A claims src/auth.py, Agent B cannot touch it until A’s work is merged or abandoned.

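Append-only ledger. Every action (claim, write, merge, rollback) is logged to a hash chain: each entry records the hash of the entry before it, so any later edit to the history is detectable. This lets you reconstruct the entire decision history. The ledger is the source of truth, not git history alone.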
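The hash-chain idea can be sketched in a few lines. This is a minimal in-memory version under stated assumptions: the class name `HashChainLedger`, the JSON serialization, and the use of SHA-256 are choices made for illustration, and dgov's real ledger format may differ.

```python
import hashlib
import json


class HashChainLedger:
    """In-memory append-only log where each entry commits to its predecessor."""

    GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(prev_hash, record):
        # Canonical JSON (sorted keys) so the same record always hashes the same
        payload = json.dumps({"prev": prev_hash, "record": record}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def append(self, record):
        prev_hash = self._entries[-1]["hash"] if self._entries else self.GENESIS
        entry_hash = self._digest(prev_hash, record)
        self._entries.append({"prev": prev_hash, "record": record, "hash": entry_hash})
        return entry_hash

    def verify(self):
        """Recompute every hash; an edited or reordered entry breaks the chain."""
        prev_hash = self.GENESIS
        for entry in self._entries:
            if entry["prev"] != prev_hash:
                return False
            if self._digest(prev_hash, entry["record"]) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True
```

Calling `verify()` after loading a ledger is what makes the audit trail trustworthy: a rewritten entry changes its digest, which no longer matches the hash stored in the entries that follow.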

Fail-closed governor. The charter document (governor.md) specifies rules like “Plan first. Respect file claims. Fail closed.” If an agent tries something not explicitly allowed, the kernel blocks it.
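Fail-closed behavior amounts to a default-deny check. The sketch below assumes a hypothetical set of action names and a hypothetical `authorize` function; the real charter vocabulary in governor.md is not shown in the source.

```python
from dataclasses import dataclass

# Hypothetical action names; a real charter would define its own vocabulary.
ALLOWED_ACTIONS = frozenset({"plan", "claim", "write", "request_merge"})


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


def authorize(action: str, has_plan: bool) -> Decision:
    """Default-deny check: anything not explicitly permitted is blocked."""
    if action not in ALLOWED_ACTIONS:
        return Decision(False, f"action '{action}' is not in the allowlist")
    if action != "plan" and not has_plan:
        return Decision(False, "plan first: no approved plan on record")
    return Decision(True, "explicitly allowed")
```

The key design choice is that the final `return` is the only path that grants permission. An unknown action never falls through to an implicit "yes".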

Implementation Shape

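# Simplified dgov claim enforcement (an illustrative sketch, not dgov's actual source)
import threading
import time

import git  # GitPython

# AppendOnlyLedger and the result types (ClaimAccepted, ClaimRejected,
# MergeAccepted, MergeRejected) are assumed to be defined elsewhere;
# MergeAccepted is a hypothetical counterpart to MergeRejected.


class Kernel:
    def __init__(self, repo_path, ledger_path):
        self.repo = git.Repo(repo_path)
        self.ledger = AppendOnlyLedger(ledger_path)
        self.active_claims = {}  # file_path -> agent_id
        self._lock = threading.Lock()  # makes check-and-register atomic

    def claim_files(self, agent_id, file_paths, plan_hash):
        with self._lock:
            conflicts = [f for f in file_paths if f in self.active_claims]
            if conflicts:
                # Pass a copy so callers cannot mutate kernel state
                return ClaimRejected(conflicts, dict(self.active_claims))

            # Atomic claim registration: log first, then record ownership
            claim_id = self.ledger.append({
                "type": "claim",
                "agent": agent_id,
                "files": list(file_paths),
                "plan": plan_hash,
                "timestamp": time.time(),
            })

            for path in file_paths:
                self.active_claims[path] = agent_id

            return ClaimAccepted(claim_id)

    def merge_worktree(self, agent_id, worktree_id):
        # worktree_id is treated as the name of the agent's branch
        with self._lock:
            # Files changed on the agent's branch since it forked from main
            diff = self.repo.git.diff(f"main...{worktree_id}", name_only=True)
            modified = diff.splitlines()  # empty diff -> [] rather than [""]

            # Verify all modified files were claimed by this agent
            unclaimed = [f for f in modified
                         if self.active_claims.get(f) != agent_id]
            if unclaimed:
                return MergeRejected(unclaimed)

            # Merge into the checked-out branch (assumed to be main) and log it
            self.repo.git.merge(worktree_id)
            merge_hash = self.repo.head.commit.hexsha  # git merge prints text, not a hash
            self.ledger.append({
                "type": "merge",
                "agent": agent_id,
                "worktree": worktree_id,
                "hash": merge_hash,
            })

            # Release claims
            for path, owner in list(self.active_claims.items()):
                if owner == agent_id:
                    del self.active_claims[path]

            return MergeAccepted(merge_hash)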

The kernel is 20,000 lines of Python across 70 modules. It includes a benchmark suite and 70 test files. The design assumes you trust the kernel but not the agents.

Federation Architecture: DAG TOML Verification

The DAG TOML model treats agents as autonomous units that submit plans for verification before execution. No central kernel. Instead, a verification stack checks plans against a dependency graph and a set of validators.

Core Mechanisms

Plans as TOML claims. Each agent writes a plan file declaring what it will do, which files it will touch, and what dependencies it assumes. The plan is a contract.

DAG verification. A directed acyclic graph tracks dependencies between plans. If Plan B depends on Plan A’s output, the verifier ensures A completes before B starts. Cycles are rejected.

Validator plugins. Each plan type (refactor, feature, migration) has a validator that checks preconditions. Validators are pluggable. You can add a “no overlapping file writes” validator or a “database schema compatibility” validator.
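A "no overlapping file writes" validator might look like the following sketch. It assumes validators expose a `check(plan)` method returning an object with `passed` and `reason`, and that plans are dicts shaped like the parsed TOML plan file; the class names `ValidationResult` and `NoOverlappingWritesValidator` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ValidationResult:
    passed: bool
    reason: str = ""


class NoOverlappingWritesValidator:
    """Rejects a plan whose claimed files are already claimed by another active plan."""

    def __init__(self, active_plans):
        # active_plans: parsed plan dicts that are verified but not yet finished
        self.active_plans = active_plans

    def check(self, plan):
        wanted = set(plan["claims"]["files"])
        for other in self.active_plans:
            if other["plan"]["id"] == plan["plan"]["id"]:
                continue  # a plan never conflicts with itself
            overlap = wanted & set(other["claims"]["files"])
            if overlap:
                files = ", ".join(sorted(overlap))
                return ValidationResult(False, f"{files} already claimed by {other['plan']['id']}")
        return ValidationResult(True)
```

Note that this check is only as good as the list of active plans it is given. If that list is stale, two conflicting plans can both pass, which is the validator-gap failure mode of the federation model.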

Fleet control plane. A separate service coordinates agent execution based on verified plans. The control plane is dumb: it just schedules work. The intelligence is in the validators.

Implementation Shape

# Example plan file
[plan]
id = "refactor-auth-module"
agent = "agent-7"
type = "refactor"
dependencies = ["migration-add-oauth-table"]

[claims]
files = ["src/auth.py", "src/oauth.py"]
tests = ["tests/test_auth.py"]

[preconditions]
schema_version = "2.3.0"
feature_flags = ["oauth_enabled"]

[postconditions]
tests_pass = true
coverage_delta = 0  # Don't reduce coverage
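
The [preconditions] table in the plan above could be checked against the current environment before the agent is allowed to start. This is a minimal sketch: the function name `unmet_preconditions`, the shape of the `environment` dict, and the choice of an exact match on `schema_version` (rather than a minimum version) are all assumptions, since the source does not specify the comparison rules.

```python
def unmet_preconditions(plan, environment):
    """Return a list of human-readable reasons the plan cannot start yet."""
    pre = plan.get("preconditions", {})
    problems = []

    # Assumption: the schema version must match exactly
    required_version = pre.get("schema_version")
    if required_version is not None and environment.get("schema_version") != required_version:
        problems.append(
            f"schema_version is {environment.get('schema_version')!r}, plan needs {required_version!r}"
        )

    # Every flag the plan lists must be enabled in the environment
    enabled = set(environment.get("feature_flags", []))
    for flag in pre.get("feature_flags", []):
        if flag not in enabled:
            problems.append(f"feature flag {flag!r} is not enabled")

    return problems
```

An empty list means the plan may proceed. Returning reasons instead of a bare boolean gives a human reviewer something concrete to act on.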

The verification stack runs before the agent executes:

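# Simplified DAG verification
import networkx as nx
import toml

# VerificationPassed and VerificationFailed are result types assumed to be
# defined elsewhere.


class PlanVerifier:
    def __init__(self, dag, validators):
        self.dag = dag  # NetworkX DiGraph of already-verified plans
        self.validators = validators  # Dict: plan type -> validator plugin

    def verify_plan(self, plan_toml):
        plan = toml.loads(plan_toml)
        plan_id = plan["plan"]["id"]
        deps = plan["plan"].get("dependencies", [])

        # Check for cycles on a copy so a rejected plan never pollutes the DAG
        candidate = self.dag.copy()
        candidate.add_node(plan_id)
        candidate.add_edges_from((dep, plan_id) for dep in deps)

        if not nx.is_directed_acyclic_graph(candidate):
            return VerificationFailed("Cycle detected")

        # Run type-specific validators
        plan_type = plan["plan"]["type"]
        validator = self.validators.get(plan_type)
        if not validator:
            return VerificationFailed(f"No validator for {plan_type}")

        result = validator.check(plan)
        if not result.passed:
            return VerificationFailed(result.reason)

        # Record the plan in the shared DAG only after every check has passed
        self.dag.add_node(plan_id)
        self.dag.add_edges_from((dep, plan_id) for dep in deps)
        return VerificationPassed(plan_id)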

The fleet control plane is separate. It reads verified plans and schedules agents. If a plan fails verification, the agent never runs.
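
The scheduling step itself can stay simple. As a sketch, assuming the control plane receives a mapping from each verified plan id to the ids it depends on, Python's standard-library `graphlib` yields an order in which every dependency runs first. The function names and the `feature-oauth-login` plan id are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(dependencies):
    """Return plan ids in an order where every dependency runs first.

    dependencies maps each plan id to the ids it depends on.
    Raises graphlib.CycleError if the plans depend on each other in a loop.
    """
    return list(TopologicalSorter(dependencies).static_order())


def is_schedulable(dependencies):
    """True when the plans can be ordered, False when they form a cycle."""
    try:
        execution_order(dependencies)
        return True
    except CycleError:
        return False


plans = {
    "migration-add-oauth-table": [],
    "refactor-auth-module": ["migration-add-oauth-table"],
    "feature-oauth-login": ["refactor-auth-module"],  # hypothetical third plan
}
order = execution_order(plans)
```

For this linear chain the migration is scheduled first, the refactor second, and the feature last. A production scheduler would more likely use the incremental `get_ready()` and `done()` methods of `TopologicalSorter` so that independent plans can run in parallel.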

Architectural Trade-offs

Dimension | dgov (City-State) | DAG TOML (Federation)
Conflict prevention | Kernel enforces file claims at claim time | Validators check file overlap at verification time
Rollback model | Ledger-based, can replay or revert by hash | Plan-based, revert by removing plan from DAG
Agent autonomy | Low, kernel controls all writes | High, agents self-govern within verified plans
Audit trail | Append-only ledger with hash chain | Plan history plus execution logs
Failure mode | Kernel becomes bottleneck or single point of failure | Validator bugs let bad plans through
Coordination cost | Centralized, scales to kernel capacity | Distributed, scales to validator complexity
Trust boundary | Trust kernel, not agents | Trust validators, agents are untrusted
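
On the plan-based rollback row: removing a plan from the DAG is only safe once you know which other plans were built on top of it. A minimal sketch of that lookup, assuming the same mapping of plan id to dependency ids used for verification (the function name is hypothetical):

```python
from collections import deque


def plans_affected_by_revert(dependencies, reverted_id):
    """Return every plan that directly or transitively depends on reverted_id.

    dependencies maps each plan id to the list of plan ids it depends on.
    """
    # Invert the mapping: plan id -> plans that depend on it
    dependents = {}
    for plan_id, deps in dependencies.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(plan_id)

    # Breadth-first walk over dependents, skipping plans already visited
    affected = []
    seen = {reverted_id}
    queue = deque([reverted_id])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, []):
            if child not in seen:
                seen.add(child)
                affected.append(child)
                queue.append(child)
    return affected
```

The returned list is the set of plans that would need to be re-verified or reverted alongside the target, which speaks directly to the "rollback chaos" failure described earlier.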

When Each Model Fits

Use dgov (city-state) when:

  • You have a single repository with high contention
  • You need deterministic replay for compliance or debugging
  • Agent count is moderate (under 50 concurrent)
  • You can afford a central coordination point
  • File-level write locks are sufficient granularity

Use DAG TOML (federation) when:

  • You have multiple repositories or loosely coupled modules
  • Agents need to work independently and sync later
  • Agent count is high or unbounded
  • You need custom validation logic per plan type
  • You want agents to propose plans humans can review before execution

Complementary, Not Competing

The two models solve different parts of the problem. dgov prevents conflicts by controlling the write path. DAG TOML prevents conflicts by verifying plans before execution. You could run both: verify plans with DAG TOML, then execute them through a dgov kernel.

The real insight is that multi-agent coding systems need governance, not just orchestration. Workflow engines (Temporal, Prefect) handle task sequencing. These systems handle who gets to write what. That is a different problem.

Technical Verdict

Choose dgov if you need deterministic, auditable control over a single high-value repository and can tolerate a central coordination point. The kernel model gives you strong guarantees at the cost of a bottleneck.

Choose DAG TOML if you need flexible, distributed coordination across multiple repositories or agent teams. The verification model gives you autonomy at the cost of validator complexity.

Avoid both if your agents are not writing code or if you have fewer than three agents. The governance overhead is not worth it. Use a task queue.

The failure mode for dgov is kernel overload or kernel bugs that block all agents. The failure mode for DAG TOML is validator gaps that let conflicting plans through. Plan accordingly.
