mech.app

CAST: Declarative Access Control for AI Agents Without Prompt Engineering

Policy-as-code for agent tool calls. Enforce fine-grained permissions outside prompts with interceptors, context evaluation, and audit trails.

Source: github.com
Embedding access control rules in agent system prompts is a maintenance nightmare. You end up with sentences like “only allow database writes if the user is an admin and the table is not production” scattered across prompt templates. When you need to change a permission, you grep through YAML files, update three prompts, and hope the LLM still respects the constraint.

CAST separates authorization from reasoning. It intercepts tool calls before execution, evaluates declarative policies, and returns allow or deny decisions without touching your agent’s prompt. The policy engine sits between your orchestrator and the tool runtime, so you can version, test, and audit access rules independently.

The Prompt Engineering Tax

Most agent frameworks let you register tools as Python functions or API endpoints. The agent decides which tool to call based on its prompt and the user’s request. If you want to restrict who can call what, you have two bad options:

  1. Prompt-based guards: Add instructions like “never delete files for non-admin users” to the system prompt and hope the model obeys.
  2. Inline checks: Write if user.role != "admin": raise PermissionError inside every tool function.

Prompt-based guards fail silently. The agent might ignore the rule, misinterpret context, or hallucinate exceptions. Inline checks scatter authorization logic across dozens of tool implementations. Neither approach gives you a central audit log or a way to test policies in isolation.

How CAST Intercepts Tool Calls

CAST wraps your tool registry with a policy evaluation layer. When an agent requests a tool call, the framework:

  1. Captures the tool name, arguments, and execution context (user ID, session metadata, timestamp).
  2. Loads the relevant policy from a declarative rule set.
  3. Evaluates the policy against the context.
  4. On an allow decision, executes the tool and returns its result. On a deny decision, skips execution and returns the denial. Either way, the decision is logged.

The agent never sees the policy logic. It just gets a success or failure response. You can swap policies without retraining or re-prompting.
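The four steps above can be sketched in plain Python. This is a minimal illustration, not CAST's implementation: the Interceptor class, the Decision dataclass, and the use of Python predicates as policies are all assumptions made for the example, and the tool is a stand-in that records the call instead of deleting anything.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical policy type: a predicate over (arguments, context) that
# returns (allowed, reason). Real CAST policies are declarative text.
Policy = Callable[[Dict[str, Any], Dict[str, Any]], Tuple[bool, str]]


@dataclass
class Decision:
    allowed: bool
    reason: str = ""
    result: Any = None


class Interceptor:
    def __init__(self) -> None:
        self.tools: Dict[str, Callable[..., Any]] = {}
        self.policies: Dict[str, Policy] = {}
        self.audit: List[Dict[str, Any]] = []

    def register(self, name: str, fn: Callable[..., Any], policy: Policy) -> None:
        self.tools[name] = fn
        self.policies[name] = policy

    def call(self, name: str, args: Dict[str, Any], context: Dict[str, Any]) -> Decision:
        # Steps 1-2: capture the request and look up the policy for this tool.
        policy = self.policies.get(name)
        # Step 3: evaluate. A tool with no policy is denied by default.
        if policy is None:
            allowed, reason = False, "no policy registered"
        else:
            allowed, reason = policy(args, context)
        self.audit.append({
            "tool": name,
            "args": args,
            "user_id": context.get("user_id"),
            "decision": "allow" if allowed else "deny",
            "reason": reason,
        })
        # Step 4: run the tool only on allow.
        if not allowed:
            return Decision(False, reason)
        return Decision(True, reason, self.tools[name](**args))


def admin_only(args: Dict[str, Any], context: Dict[str, Any]) -> Tuple[bool, str]:
    if context.get("role") == "admin":
        return True, "role is admin"
    return False, "role is not admin"


executed: List[str] = []


def fake_delete(path: str) -> str:
    executed.append(path)  # stand-in: records the call instead of deleting
    return f"Deleted {path}"


interceptor = Interceptor()
interceptor.register("delete_file", fake_delete, admin_only)
denied = interceptor.call("delete_file", {"path": "/tmp/data.csv"},
                          {"user_id": "u1", "role": "user"})
granted = interceptor.call("delete_file", {"path": "/tmp/data.csv"},
                           {"user_id": "u2", "role": "admin"})
```

The point of the shape is that the tool function contains no authorization code at all: swapping admin_only for another predicate changes behavior without touching the tool or the prompt.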

import os

from cast import PolicyEngine, Tool, Context

# Define a tool
@Tool(name="delete_file")
def delete_file(path: str):
    os.remove(path)
    return f"Deleted {path}"

# Define a policy
policy = """
allow delete_file if:
  user.role == "admin"
  path not in ["/etc/passwd", "/var/log/system.log"]
"""

# Wrap the tool with the policy engine
engine = PolicyEngine(policies=[policy])
context = Context(user={"role": "user"}, session_id="abc123")

# Agent requests tool call
result = engine.execute("delete_file", {"path": "/tmp/data.csv"}, context)
# result.allowed = False, result.reason = "user.role != admin"

The policy engine evaluates the rule before delete_file runs. If the user lacks admin privileges, the tool never executes. The agent receives a structured denial with a reason.

Policy Language and Context Evaluation

CAST policies use a simple conditional syntax. Each rule specifies:

  • Tool name: Which function or API endpoint the rule applies to.
  • Conditions: Boolean expressions over context attributes.
  • Actions: Allow or deny, with optional logging or alerting.

Context attributes come from your orchestrator. You pass in user identity, resource ownership, time windows, rate limits, or any other metadata your tools need to make authorization decisions.

policies:
  - name: restrict_database_writes
    tool: execute_sql
    allow_if:
      - user.role in ["admin", "data_engineer"]
      - sql.operation != "DROP"
      - time.hour >= 9 and time.hour <= 17

  - name: limit_api_calls
    tool: call_external_api
    allow_if:
      - user.api_calls_today < 100
    on_deny:
      log: "Rate limit exceeded for user {user.id}"

The engine evaluates conditions in order. If all conditions pass, the tool executes. If any condition fails, the engine logs the denial and returns a structured error. You can chain policies with AND or OR logic, and you can define default-deny or default-allow behavior per tool.
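To make the evaluation order concrete, here is a rough sketch of in-order condition checking with a per-tool default. It does not parse the YAML above: each condition is hand-translated into a Python lambda, and the evaluate function, the RULES table, and the defaults mapping are hypothetical names introduced only for this illustration.

```python
from typing import Any, Callable, Dict, List, Tuple

# Each condition pairs the rule text (used in denial reasons) with a predicate.
Condition = Tuple[str, Callable[[Dict[str, Any]], bool]]

# Hand-translated from the restrict_database_writes example.
RULES: Dict[str, List[Condition]] = {
    "execute_sql": [
        ('user.role in ["admin", "data_engineer"]',
         lambda c: c["user"]["role"] in ("admin", "data_engineer")),
        ('sql.operation != "DROP"',
         lambda c: c["sql"]["operation"] != "DROP"),
        ("time.hour >= 9 and time.hour <= 17",
         lambda c: 9 <= c["time"]["hour"] <= 17),
    ],
}


def evaluate(tool: str, context: Dict[str, Any],
             rules: Dict[str, List[Condition]],
             defaults: Dict[str, str]) -> Tuple[bool, str]:
    conditions = rules.get(tool)
    if conditions is None:
        # No rule for this tool: fall back to its default, deny if unset.
        allowed = defaults.get(tool, "deny") == "allow"
        return allowed, "no rule; default " + ("allow" if allowed else "deny")
    for label, predicate in conditions:
        try:
            passed = predicate(context)
        except KeyError as exc:
            # A missing attribute is treated as a denial (fail closed).
            return False, f"missing context attribute: {exc}"
        if not passed:
            # Stop at the first failing condition and report it.
            return False, f"failed: {label}"
    return True, "all conditions passed"


base = {"user": {"role": "admin"}, "sql": {"operation": "SELECT"}, "time": {"hour": 10}}
ok = evaluate("execute_sql", base, RULES, {})
dropped = evaluate("execute_sql", {**base, "sql": {"operation": "DROP"}}, RULES, {})
```

Reporting the first failing condition is what makes the denial reason useful in logs; an engine that evaluated every condition could report all failures at the cost of extra work per call.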

Testing and Versioning Policies

Policies live in separate files from your agent code. You can version them in Git, run unit tests against them, and deploy them independently.

# test_policies.py
from cast import PolicyEngine, Context

def test_admin_can_delete():
    engine = PolicyEngine.from_file("policies.yaml")
    context = Context(user={"role": "admin"})
    result = engine.evaluate("delete_file", {"path": "/tmp/test"}, context)
    assert result.allowed

def test_user_cannot_delete():
    engine = PolicyEngine.from_file("policies.yaml")
    context = Context(user={"role": "user"})
    result = engine.evaluate("delete_file", {"path": "/tmp/test"}, context)
    assert not result.allowed

You can run these tests in CI before deploying new policies. If a policy change breaks an expected permission, the test fails before it reaches production. You can also simulate different contexts (time of day, user attributes, resource states) without invoking the actual tools.
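Simulating time of day works best when the clock is an input rather than something the rule reads for itself. The sketch below is an assumption-laden stand-in, not CAST code: within_business_hours is a hypothetical helper that mirrors the time.hour condition from the earlier YAML, and the cases probe both edges of the window.

```python
from datetime import datetime
from typing import List, Tuple


def within_business_hours(now: datetime) -> bool:
    # Mirrors "time.hour >= 9 and time.hour <= 17" from the example policy.
    return 9 <= now.hour <= 17


# (hour, expected decision) pairs on either side of each boundary.
CASES: List[Tuple[int, bool]] = [(8, False), (9, True), (17, True), (18, False)]


def run_cases() -> List[int]:
    """Return the hours whose decision did not match the expectation."""
    failures = []
    for hour, expected in CASES:
        simulated_now = datetime(2026, 6, 3, hour, 30)
        if within_business_hours(simulated_now) != expected:
            failures.append(hour)
    return failures


failures = run_cases()
```

Boundary cases like these surface a detail that is easy to miss when reading the rule: because the comparison is on the hour alone, a call at 17:59 is still allowed.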

Audit Trails and Observability

Every policy evaluation generates a structured log entry. The engine records:

  • Tool name and arguments
  • User identity and session metadata
  • Policy decision (allow or deny)
  • Reason for denial (which condition failed)
  • Timestamp and request ID

You can ship these logs to your observability stack (Datadog, Honeycomb, CloudWatch) and build dashboards around access patterns. You can alert on repeated denials, track which users hit rate limits, or audit who called sensitive tools.

{
  "timestamp": "2026-06-03T16:15:42Z",
  "tool": "delete_file",
  "user_id": "user_456",
  "session_id": "abc123",
  "decision": "deny",
  "reason": "user.role != admin",
  "policy": "restrict_file_operations",
  "request_id": "req_789"
}

This log format integrates with standard security information and event management (SIEM) tools. You can correlate agent activity with application logs, infrastructure metrics, and security alerts.
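An entry in that shape can be produced with the standard library alone. The following is a sketch rather than CAST's logger: the audit_entry function and its parameters are hypothetical, and the generated request ID format is an assumption chosen to resemble the example.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def audit_entry(tool: str, user_id: str, session_id: str, decision: str,
                reason: str, policy: str,
                request_id: Optional[str] = None,
                now: Optional[datetime] = None) -> Dict[str, Any]:
    # "now" is injectable so tests can pin the timestamp; it must be UTC.
    now = now or datetime.now(timezone.utc)
    return {
        "timestamp": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "tool": tool,
        "user_id": user_id,
        "session_id": session_id,
        "decision": decision,
        "reason": reason,
        "policy": policy,
        "request_id": request_id or f"req_{uuid.uuid4().hex[:8]}",
    }


entry = audit_entry(
    tool="delete_file", user_id="user_456", session_id="abc123",
    decision="deny", reason="user.role != admin",
    policy="restrict_file_operations", request_id="req_789",
    now=datetime(2026, 6, 3, 16, 15, 42, tzinfo=timezone.utc),
)
# One JSON object per line is easy for log shippers to ingest.
line = json.dumps(entry, sort_keys=True)
```

Keeping each entry a flat JSON object with stable key names is what lets downstream tools filter on fields like decision or policy without custom parsing.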

Architecture Trade-offs

Aspect            | Prompt-Based Guards      | Inline Checks           | CAST Policies
------------------|--------------------------|-------------------------|----------------------------
Centralization    | Scattered across prompts | Scattered across tools  | Single policy file
Auditability      | No structured logs       | Requires custom logging | Built-in audit trail
Testability       | Hard to unit test        | Requires mocking tools  | Policies test in isolation
Failure Mode      | Silent non-compliance    | Runtime exceptions      | Structured denials
Versioning        | Tied to prompt versions  | Tied to code deploys    | Independent policy deploys
Context Awareness | Limited to prompt text   | Full code access        | Declarative context rules

CAST adds latency (policy evaluation happens before every tool call) and operational complexity (you need to manage policy files and the engine runtime). But it eliminates the fragility of prompt-based guards and the sprawl of inline checks.

Deployment Shape

CAST runs as a sidecar or a shared service. In a sidecar model, each agent instance runs its own policy engine. Policies load from a config file or a remote policy store. The agent calls the local engine before executing tools.

In a shared service model, all agents call a central policy API. The API loads policies from a database or a Git repository, evaluates them, and returns decisions. This centralizes policy management but adds network latency and a new failure point.
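Because the shared service is a new failure point, the caller has to decide what happens when it cannot be reached. One option is to fail closed, sketched below. The decide wrapper and the fetch_decision callable are hypothetical: fetch_decision stands in for whatever client talks to the policy API, and the decision is modeled as a plain dict.

```python
from typing import Any, Callable, Dict

Decision = Dict[str, Any]


def decide(fetch_decision: Callable[[str, Dict[str, Any], Dict[str, Any]], Decision],
           tool: str, args: Dict[str, Any], context: Dict[str, Any],
           fail_open: bool = False) -> Decision:
    """Ask the policy service; on a network error, apply the configured default."""
    try:
        return fetch_decision(tool, args, context)
    except OSError as exc:
        # ConnectionError and TimeoutError are both subclasses of OSError.
        return {"allowed": fail_open,
                "reason": f"policy service unavailable: {exc}"}


def unreachable_service(tool: str, args: Dict[str, Any],
                        context: Dict[str, Any]) -> Decision:
    # Stand-in for a client call that times out.
    raise TimeoutError("timed out")


fallback = decide(unreachable_service, "delete_file",
                  {"path": "/tmp/data.csv"}, {"role": "admin"})
```

Failing closed means an outage blocks every tool call, including legitimate ones. Failing open keeps agents working but silently removes enforcement, so the choice is worth making per tool rather than globally.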

┌─────────────┐       ┌──────────────┐       ┌──────────┐
│   Agent     │──────>│ Policy Engine│──────>│   Tool   │
│ Orchestrator│       │   (CAST)     │       │ Registry │
└─────────────┘       └──────────────┘       └──────────┘
       │                      │
       │                      v
       │              ┌──────────────┐
       └─────────────>│  Audit Log   │
                      └──────────────┘

The policy engine sits between the orchestrator and the tool registry. The orchestrator sends tool call requests to the engine, which evaluates policies and forwards allowed calls to the registry. Denied calls return immediately with a reason.

Likely Failure Modes

Policy syntax errors: A typo in a policy file can break all tool calls. You need schema validation and linting in CI to catch these before deploy.
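A lint step for CI might look like the sketch below. It assumes the policy file has already been parsed into a Python dict (for example by a YAML loader) and that the allowed keys are exactly the ones appearing in the earlier example; the lint_policies function and both key sets are assumptions, not part of CAST.

```python
from typing import Any, List

# Keys taken from the example policy file; the real schema may differ.
REQUIRED_KEYS = {"name", "tool", "allow_if"}
OPTIONAL_KEYS = {"on_deny"}


def lint_policies(doc: Any) -> List[str]:
    """Return a list of human-readable problems; empty means the file passed."""
    if not isinstance(doc, dict) or not isinstance(doc.get("policies"), list):
        return ["top-level 'policies' must be a list"]
    errors: List[str] = []
    seen_names = set()
    for index, policy in enumerate(doc["policies"]):
        where = f"policies[{index}]"
        if not isinstance(policy, dict):
            errors.append(f"{where}: must be a mapping")
            continue
        keys = set(policy)
        for key in sorted(REQUIRED_KEYS - keys):
            errors.append(f"{where}: missing key '{key}'")
        for key in sorted(keys - REQUIRED_KEYS - OPTIONAL_KEYS):
            errors.append(f"{where}: unknown key '{key}'")
        conditions = policy.get("allow_if")
        if "allow_if" in policy and not (
            isinstance(conditions, list) and conditions
            and all(isinstance(c, str) for c in conditions)
        ):
            errors.append(f"{where}: 'allow_if' must be a non-empty list of strings")
        name = policy.get("name")
        if isinstance(name, str):
            if name in seen_names:
                errors.append(f"{where}: duplicate name '{name}'")
            seen_names.add(name)
    return errors


typo_doc = {"policies": [{"name": "restrict_database_writes",
                          "tool": "execute_sql",
                          "alow_if": ['sql.operation != "DROP"']}]}
problems = lint_policies(typo_doc)
```

Rejecting unknown keys is the part that catches typos: a misspelled allow_if would otherwise look like a policy with no conditions at all.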

Context mismatch: If the orchestrator doesn’t pass the right context attributes, policies fail open or closed depending on your default. You need integration tests that verify context shape.
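One way to verify context shape is to list the attribute paths your policies read and check each incoming context against that list. The sketch assumes the context is a nested dict and that paths use the dotted form seen in the policy examples; missing_attributes is a hypothetical helper.

```python
from typing import Any, Dict, Iterable, List


def missing_attributes(context: Dict[str, Any],
                       required_paths: Iterable[str]) -> List[str]:
    """Return the dotted paths that cannot be resolved in the context."""
    missing = []
    for path in required_paths:
        node: Any = context
        for key in path.split("."):
            if not isinstance(node, dict) or key not in node:
                missing.append(path)
                break
            node = node[key]
    return missing


# Paths referenced by the restrict_database_writes example.
REQUIRED = ["user.role", "sql.operation", "time.hour"]

incomplete = {"user": {"role": "admin"}, "time": {}}
gaps = missing_attributes(incomplete, REQUIRED)
```

Running a check like this in an integration test, or at the engine boundary in production, turns a confusing policy result into an explicit "the orchestrator did not send sql.operation" error.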

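Performance bottlenecks: Evaluating complex policies on every tool call adds latency. You can cache decisions for rules whose inputs rarely change (role checks, for example, but not time windows or rate-limit counters), or move evaluation to a purpose-built policy engine such as Open Policy Agent (Rego) or Cedar.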
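Decision caching can be sketched as a small time-bounded cache. This is an illustration under assumptions: DecisionCache is a hypothetical class, the evaluate callable stands in for the policy engine, and arguments and context must be JSON-serializable so they can form the cache key. It is only safe when the decision depends on nothing outside that key.

```python
import json
import time
from typing import Any, Callable, Dict, Tuple


class DecisionCache:
    def __init__(self, ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_evaluate(self, tool: str, args: Dict[str, Any],
                        context: Dict[str, Any],
                        evaluate: Callable[[str, Dict[str, Any], Dict[str, Any]], Any]) -> Any:
        # Identical tool, arguments, and context map to the same key.
        key = json.dumps([tool, args, context], sort_keys=True)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]
        decision = evaluate(tool, args, context)
        self._entries[key] = (now + self._ttl, decision)
        return decision


ticks = [0.0]
evaluations = []


def slow_evaluate(tool: str, args: Dict[str, Any], context: Dict[str, Any]) -> bool:
    evaluations.append(tool)  # count how often the engine is really called
    return context.get("role") == "admin"


cache = DecisionCache(ttl_seconds=30.0, clock=lambda: ticks[0])
first = cache.get_or_evaluate("read_file", {"path": "/tmp/a"}, {"role": "admin"}, slow_evaluate)
second = cache.get_or_evaluate("read_file", {"path": "/tmp/a"}, {"role": "admin"}, slow_evaluate)
```

The TTL bounds how long a revoked permission can linger: a user who loses the admin role keeps cached allows for at most ttl_seconds.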

Policy drift: If policies and tools evolve separately, you can end up with orphaned rules or missing coverage. You need tooling to detect unused policies and tools without policies.
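Drift detection reduces to comparing two sets of names. The sketch below assumes you can list registered tool names and that each policy is a dict with name and tool keys, as in the YAML example; coverage_report is a hypothetical helper, and delete_file and send_email are illustrative entries.

```python
from typing import Any, Dict, Iterable, List


def coverage_report(tool_names: Iterable[str],
                    policies: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    tools = set(tool_names)
    covered = {policy["tool"] for policy in policies}
    return {
        # Registered tools that no policy mentions.
        "tools_without_policy": sorted(tools - covered),
        # Policies that point at a tool which no longer exists.
        "orphaned_policies": sorted(
            policy["name"] for policy in policies if policy["tool"] not in tools
        ),
    }


registered = ["execute_sql", "delete_file", "send_email"]
policies = [
    {"name": "restrict_database_writes", "tool": "execute_sql"},
    {"name": "limit_api_calls", "tool": "call_external_api"},
]
report = coverage_report(registered, policies)
```

Run in CI, a non-empty tools_without_policy list can fail the build under a default-allow setup, where an uncovered tool is an open door rather than a blocked one.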

Audit log volume: High-frequency agents generate massive log volumes. You need sampling, aggregation, or a separate audit pipeline to avoid overwhelming your observability stack.
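A common sampling shape is to keep every denial and only a fraction of allows. The sketch below is one possible approach, not a CAST feature: should_log is a hypothetical function, and it hashes the request ID so the keep-or-drop choice is deterministic for a given request.

```python
import hashlib
from typing import Any, Dict


def should_log(entry: Dict[str, Any], allow_sample_rate: float = 0.1) -> bool:
    # Denials are the security-relevant events, so they are never dropped.
    if entry["decision"] == "deny":
        return True
    # Map the request ID to a stable 64-bit bucket and compare it against
    # the sample rate using integers to avoid float rounding at the edges.
    digest = hashlib.sha256(entry["request_id"].encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")
    return bucket < int(allow_sample_rate * 2**64)


deny_entry = {"decision": "deny", "request_id": "req_789"}
allow_entry = {"decision": "allow", "request_id": "req_790"}
keep_deny = should_log(deny_entry, allow_sample_rate=0.0)
```

Sampling allows weakens the "who accessed what" guarantee, so in regulated settings the full stream may still need to go to cheaper storage while only the sampled stream reaches dashboards.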

Technical Verdict

Use CAST when you have multiple agents, multiple users, or compliance requirements that demand auditable access control. It makes sense for production systems where authorization logic changes frequently and needs to be tested independently of agent prompts.

Avoid it for single-user prototypes or agents that only call read-only tools. The overhead of policy management and evaluation is not worth it if you have no access control requirements. Stick with inline checks or prompt-based guards until you feel the pain of scattered authorization logic.

CAST shines in multi-tenant environments where different users need different permissions for the same tools. It also works well in regulated industries (healthcare, finance) where you need to prove who accessed what and why.