Production agents often need to evaluate the same set of arguments under different regimes: GDPR versus CCPA, internal audit versus external compliance, or development versus production security policies. Most orchestration frameworks treat context as static metadata. A new arXiv paper (2605.31581v1) formalizes what happens when the agent itself can strategically choose which evaluation lens applies, and when argument validity flips depending on the active context.
The work extends Dung’s classical argumentation theory with context-dependent argumentation frameworks (CDAFs). Instead of a single attack graph, you get a defeat function that changes which attacks succeed based on the active context. The agent’s action space is the relevance set: which context dimensions it chooses to activate. The result is a formal model for multi-regime reasoning where the agent controls the lens, not just the arguments.
Why This Matters for Orchestration
Most agent frameworks treat context as a prompt prefix or a configuration file. You load a policy, the agent reasons, you get a result. CDAFs expose a different pattern: the agent can switch evaluation regimes mid-execution, and the same argument graph produces different validity outcomes.
Concrete scenarios:
- A compliance agent evaluating a transaction under GDPR, then switching to CCPA when the user’s location changes.
- A security agent that applies strict rules in production but relaxed rules in staging, and can toggle between them based on deployment metadata.
- A financial agent that evaluates risk under normal market conditions, then switches to a crisis regime when volatility spikes.
The paper shows that an argument can be rejected under every full-relevance regime but accepted under partial activations. This is not a bug. It is a strategic lever. The agent can choose which context dimensions to activate to achieve a desired outcome.
Architecture: Defeat Functions and Relevance Sets
A CDAF replaces the static attack relation in Dung’s framework with a defeat function. The function takes a context and returns which attacks succeed.
Core components:
- Attack graph: Directed graph where nodes are arguments and edges are potential attacks.
- Defeat function: defeat(context) → set of successful attacks. This is the regime boundary.
- Relevance set (ρ): The subset of context dimensions the agent activates. This is the agent’s action space.
- Priority (π): A partial order over context dimensions. Determines which attacks win when multiple contexts apply.
The perspective-labeled specialization derives the defeat function from ρ and π. The agent selects which dimensions to activate, and the priority determines how conflicts resolve.
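To make the components concrete, here is a minimal Python sketch. It is not the paper's construction: the rule that derives defeat from ρ and π (an attack succeeds when the attacker's dimension is active and the target's dimension is not an active, higher-priority one), the per-argument `label` mapping, the integer `priority` ranks, and the choice of grounded semantics are all assumptions made for illustration.

```python
from dataclasses import dataclass

Attack = tuple[str, str]  # (attacker, target)


@dataclass
class CDAF:
    label: dict[str, str]       # argument -> context dimension it speaks from (assumed labeling)
    attacks: frozenset[Attack]  # potential attacks, before any context is applied
    priority: dict[str, int]    # dimension -> rank, higher wins (a total order for simplicity)

    def defeat(self, rho: frozenset[str]) -> set[Attack]:
        """Return the attacks that succeed when only the dimensions in rho are active."""
        successful = set()
        for attacker, target in self.attacks:
            a_dim, t_dim = self.label[attacker], self.label[target]
            if a_dim not in rho:
                continue  # assumed rule: attacks from inactive dimensions never succeed
            outranked = t_dim in rho and self.priority[t_dim] > self.priority[a_dim]
            if not outranked:
                successful.add((attacker, target))
        return successful

    def accepted(self, rho: frozenset[str]) -> set[str]:
        """Grounded extension of the graph induced by defeat(rho)."""
        attackers = {arg: set() for arg in self.label}
        for attacker, target in self.defeat(rho):
            attackers[target].add(attacker)
        accepted, rejected = set(), set()
        changed = True
        while changed:
            changed = False
            for arg in self.label:
                if arg in accepted or arg in rejected:
                    continue
                if attackers[arg] <= rejected:    # every attacker is already out
                    accepted.add(arg)
                    changed = True
                elif attackers[arg] & accepted:   # an accepted argument defeats it
                    rejected.add(arg)
                    changed = True
        return accepted


# Hypothetical two-argument graph with mutual attacks.
cdaf = CDAF(
    label={"process_data": "business", "gdpr_objection": "gdpr"},
    attacks=frozenset({("gdpr_objection", "process_data"),
                       ("process_data", "gdpr_objection")}),
    priority={"business": 1, "gdpr": 2},
)
partial = cdaf.accepted(frozenset({"business"}))
full = cdaf.accepted(frozenset({"business", "gdpr"}))
```

Under these assumed rules, activating only the business dimension accepts process_data, while activating both dimensions accepts gdpr_objection instead. The graph is identical in both runs; only the relevance set changed.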
State machine view:
Initial State
↓
Agent selects relevance set ρ
↓
Defeat function computes active attacks
↓
Argumentation semantics compute accepted arguments
↓
Agent observes outcome
↓
Agent may switch ρ (context transition)
↓
Re-evaluate
This is not a one-shot evaluation. The agent can switch contexts mid-conversation, and the validity of arguments changes.
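The loop above can be sketched as a small driver. This is an illustrative skeleton, not an API from the paper: `evaluate` stands in for "defeat function plus argumentation semantics", `choose_next` stands in for the agent's context decision, and `max_switches` is an assumed safety bound so the loop always terminates.

```python
from typing import Callable, Optional


def run_context_loop(
    initial_rho: frozenset,
    evaluate: Callable[[frozenset], set],
    choose_next: Callable[[frozenset, set], Optional[frozenset]],
    max_switches: int = 5,
) -> list:
    """Evaluate, let the agent observe, optionally switch, and re-evaluate."""
    history = []
    rho = initial_rho
    for _ in range(max_switches + 1):
        accepted = evaluate(rho)               # defeat function + semantics
        history.append((rho, accepted))        # the agent observes the outcome
        proposed = choose_next(rho, accepted)  # the agent may switch rho
        if proposed is None or proposed == rho:
            break                              # no transition requested
        rho = proposed                         # context transition, then re-evaluate
    return history


# Hypothetical stand-ins for the evaluator and the agent's policy.
def toy_evaluate(rho: frozenset) -> set:
    return {"allow_transfer"} if "ccpa" in rho else {"block_transfer"}


def toy_policy(rho: frozenset, accepted: set) -> Optional[frozenset]:
    # Pretend the user's location changed, so the agent moves from GDPR to CCPA once.
    return frozenset({"ccpa"}) if "gdpr" in rho else None


history = run_context_loop(frozenset({"gdpr"}), toy_evaluate, toy_policy)
```

Keeping the full history, rather than only the final state, is what lets you later ask which arguments changed status at each transition.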
Implementation Patterns
You can implement context activation as a tool call or as a prompt instruction. Each has different failure modes.
Tool Call Approach
The agent calls a switch_context(dimensions: list[str]) tool. The orchestrator updates the active context and re-evaluates the argument graph.
Pros:
- Explicit state transition. You can log, audit, and replay context switches.
- The orchestrator controls which contexts are valid. The agent cannot hallucinate a context.
- You can enforce rate limits or approval gates on context switches.
Cons:
- Latency. Every context switch requires a tool call round-trip.
- The agent must know the context schema. If you add a new dimension, you need to update the tool description.
Failure modes:
- Agent calls switch_context in a loop, never converging.
- Agent switches to an invalid context, tool call fails, conversation derails.
- Context switch invalidates earlier arguments, but the agent does not backtrack.
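A hedged sketch of an orchestrator-side handler that addresses these three failure modes. The class name `ContextSession`, the `max_switches` budget, and the `stale` flag are hypothetical; only the switch_context(dimensions) signature comes from the text above. Invalid requests return a structured error instead of raising, so a bad call does not derail the conversation.

```python
from dataclasses import dataclass, field


@dataclass
class ContextSession:
    schema: frozenset            # dimensions the orchestrator recognizes
    active: frozenset            # currently active relevance set
    max_switches: int = 3        # assumed budget to stop switch loops
    log: list = field(default_factory=list)

    def switch_context(self, dimensions: list) -> dict:
        requested = frozenset(dimensions)
        unknown = requested - self.schema
        if unknown:
            # Recoverable error: tell the agent what is valid instead of failing hard.
            return {"ok": False,
                    "error": f"unknown dimensions: {sorted(unknown)}",
                    "valid": sorted(self.schema)}
        if len(self.log) >= self.max_switches:
            return {"ok": False, "error": "switch budget exhausted"}
        previous, self.active = self.active, requested
        self.log.append({"from": sorted(previous), "to": sorted(requested)})
        # "stale" signals that earlier conclusions must be re-checked (backtracking).
        return {"ok": True, "active": sorted(requested), "stale": True}


session = ContextSession(schema=frozenset({"gdpr", "ccpa"}),
                         active=frozenset({"gdpr"}), max_switches=1)
rejected = session.switch_context(["hipaa"])   # not in the schema
applied = session.switch_context(["ccpa"])     # valid, uses the only budgeted switch
exhausted = session.switch_context(["gdpr"])   # over budget
```

The log doubles as the audit trail: every successful transition is a discrete, replayable record.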
Prompt Instruction Approach
You embed the context schema in the system prompt and instruct the model to declare which context applies.
Pros:
- No tool call overhead. The model can switch contexts inline.
- Flexible. The model can invent new context dimensions if the schema allows.
Cons:
- No enforcement. The model can hallucinate contexts or ignore the schema.
- Hard to audit. Context switches are buried in the conversation, not logged as discrete events.
- Prompt injection risk. A user can manipulate the context declaration.
Failure modes:
- Model declares a context that does not exist in your defeat function.
- Model forgets the active context after a few turns.
- Model switches contexts without justification, producing inconsistent reasoning.
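If you take the prompt route, a thin validation layer can recover some enforcement. The sketch below assumes the system prompt asks the model to emit a line of the form `CONTEXT: dim1, dim2`; that line format and the function name are assumptions, not something the paper prescribes. Only model output should be passed in, never raw user text, or the prompt injection risk above applies directly.

```python
import re
from typing import Optional

# Assumed declaration format: a line such as "CONTEXT: gdpr, production".
CONTEXT_LINE = re.compile(r"^CONTEXT:[ \t]*(.*)$", re.MULTILINE)


def parse_declared_context(
    model_output: str, schema: frozenset, previous: frozenset
) -> tuple:
    """Return (active_context, warning). Falls back to the previous context on any problem."""
    match = CONTEXT_LINE.search(model_output)
    if match is None:
        # The model forgot to declare a context: keep the last known one.
        return previous, "no declaration; kept previous context"
    declared = frozenset(
        part.strip().lower() for part in match.group(1).split(",") if part.strip()
    )
    unknown = declared - schema
    if unknown:
        # The model declared something the defeat function does not know about.
        return previous, f"rejected unknown dimensions: {sorted(unknown)}"
    return declared, None


schema = frozenset({"gdpr", "ccpa", "production"})
previous = frozenset({"gdpr"})
good = parse_declared_context("Reasoning...\nCONTEXT: CCPA, production\nAnswer.", schema, previous)
missing = parse_declared_context("Answer without a declaration.", schema, previous)
invented = parse_declared_context("CONTEXT: crisis_mode", schema, previous)
```

Emitting each parsed declaration as a discrete event, alongside any warning, also partly addresses the audit problem.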
Computational Costs of Re-Evaluation
When the agent switches contexts, you must re-evaluate the argument graph under the new defeat function. The paper defines the ACTIVATION-MANIPULATION decision problem: given a CDAF, a target argument, and a desired acceptance status, does there exist a relevance set that achieves the outcome?
Complexity bounds:
The paper records baseline complexity bounds but leaves tight bounds open. For practical orchestration, assume each re-evaluation is at least linear in the number of arguments and attacks, and note that a naive search over relevance sets enumerates up to 2^n subsets of n context dimensions. If your graph has thousands of nodes, re-evaluating on every context switch can dominate latency.
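For small numbers of dimensions, the decision problem can be explored by brute force. The sketch below is a naive enumeration, not an algorithm from the paper; `evaluate` is a hypothetical callable that returns the accepted arguments for a relevance set, and `toy_evaluate` hard-codes a two-argument example purely for illustration.

```python
from itertools import combinations
from typing import Callable, Iterator


def activation_witnesses(
    dimensions: set,
    evaluate: Callable[[frozenset], set],
    target: str,
    want_accepted: bool = True,
) -> Iterator[frozenset]:
    """Yield every relevance set under which target has the desired status.

    Enumerates all 2^n subsets, smallest first, so it is only viable for small n.
    """
    dims = sorted(dimensions)
    for size in range(len(dims) + 1):
        for combo in combinations(dims, size):
            rho = frozenset(combo)
            if (target in evaluate(rho)) == want_accepted:
                yield rho


def toy_evaluate(rho: frozenset) -> set:
    # Hypothetical outcomes: the GDPR objection wins whenever "gdpr" is active.
    if "gdpr" in rho:
        return {"gdpr_objection"}
    if "business" in rho:
        return {"process_data"}
    return {"process_data", "gdpr_objection"}  # nothing active, nothing defeated


witnesses = list(activation_witnesses({"business", "gdpr"}, toy_evaluate, "process_data"))
full_relevance = frozenset({"business", "gdpr"})
```

In this toy case the target is rejected under full relevance but accepted under two partial activations, which is the kind of gap an orchestrator may want to detect before an agent exploits it.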
Optimization strategies:
- Incremental evaluation: Only re-compute the subgraph affected by the context switch.
- Caching: Precompute defeat functions for common context combinations.
- Lazy evaluation: Defer re-evaluation until the agent queries a specific argument’s status.
Trade-off table:
| Strategy | Latency | Memory | Correctness Risk |
|---|---|---|---|
| Full re-evaluation | High | Low | None |
| Incremental | Medium | Medium | Dependency tracking bugs |
| Caching | Low | High | Stale cache on schema change |
| Lazy | Variable | Low | Inconsistent intermediate state |
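Caching and lazy evaluation combine naturally. The following is a minimal sketch under stated assumptions: `LazyEvaluator` and its methods are hypothetical names, `evaluate` is whatever full re-evaluation you already have, and including a schema version in the cache key is one possible way to avoid the stale-cache risk from the table.

```python
from typing import Callable


class LazyEvaluator:
    """Defers re-evaluation until a status is queried and caches results per context."""

    def __init__(self, evaluate: Callable[[frozenset], set], schema_version: int = 1):
        self._evaluate = evaluate
        self._schema_version = schema_version
        self._cache = {}
        self._rho = frozenset()
        self.evaluations = 0  # counts full re-evaluations, useful for metrics

    def switch(self, rho) -> None:
        # Lazy: switching only records the new context, nothing is recomputed yet.
        self._rho = frozenset(rho)

    def bump_schema(self, version: int) -> None:
        # Entries cached under the old version are never matched again.
        self._schema_version = version

    def is_accepted(self, argument: str) -> bool:
        key = (self._schema_version, self._rho)
        if key not in self._cache:
            self._cache[key] = frozenset(self._evaluate(self._rho))
            self.evaluations += 1
        return argument in self._cache[key]


# Hypothetical evaluator standing in for a full graph re-evaluation.
lazy = LazyEvaluator(lambda rho: {"allow"} if "staging" in rho else {"deny"})
lazy.switch({"staging"})
lazy.switch({"production"})           # two switches, zero evaluations so far
first = lazy.is_accepted("deny")      # triggers one evaluation
second = lazy.is_accepted("allow")    # served from cache
count_before_bump = lazy.evaluations
lazy.bump_schema(2)
third = lazy.is_accepted("deny")      # re-evaluated under the new schema version
```

The cost of laziness is that intermediate contexts (here, staging) are never evaluated, so any audit that needs their outcomes must request them explicitly.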
Security Boundaries and Manipulation
The agent’s ability to switch contexts is a strategic lever. If the agent can choose which regime applies, it can manipulate outcomes. The paper shows an example where the target argument is rejected under every full-relevance regime but accepted under a partial activation.
Security implications:
- Privilege escalation: An agent in a restricted context switches to a permissive context to bypass a policy.
- Audit evasion: An agent switches to a context with weaker logging requirements.
- Outcome steering: An agent selects the context that produces the desired result, regardless of ground truth.
Mitigation strategies:
- Context allowlists: Only permit specific context transitions. Encode this in the orchestrator, not the prompt.
- Approval gates: Require human or supervisor approval for high-stakes context switches.
- Monotonicity constraints: Ensure that switching contexts can only add restrictions, never remove them.
- Audit trails: Log every context switch with justification. Replay the conversation under different contexts to detect manipulation.
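Three of these mitigations can live in one orchestrator-side check. This is a sketch under assumptions: the function name, the idea of tagging some dimensions as `restrictive` (so monotonicity means "never deactivate them"), and the `high_stakes` set are illustrative choices, not definitions from the paper.

```python
from typing import Optional


def check_transition(
    old, new,
    allowlist: set,                      # permitted (old, new) pairs of frozensets
    restrictive: frozenset,              # dimensions that may be added but never dropped
    high_stakes: frozenset = frozenset(),
    approved_by: Optional[str] = None,
) -> tuple:
    """Return (allowed, reason) for a proposed context switch."""
    old, new = frozenset(old), frozenset(new)
    if (old, new) not in allowlist:
        return False, "transition is not on the allowlist"
    dropped = (old - new) & restrictive
    if dropped:
        return False, f"would deactivate restrictive dimensions: {sorted(dropped)}"
    if (old ^ new) & high_stakes and approved_by is None:
        return False, "approval required for high-stakes dimensions"
    return True, "ok"


# Hypothetical dimensions for a staging/production policy.
BASE = frozenset({"baseline"})
HARDENED = frozenset({"baseline", "production_hardening"})
allowlist = {(BASE, HARDENED), (HARDENED, BASE)}
restrictive = frozenset({"production_hardening"})
high_stakes = frozenset({"production_hardening"})

needs_approval = check_transition(BASE, HARDENED, allowlist, restrictive, high_stakes)
approved = check_transition(BASE, HARDENED, allowlist, restrictive, high_stakes, approved_by="oncall")
relaxing = check_transition(HARDENED, BASE, allowlist, restrictive, high_stakes, approved_by="oncall")
off_list = check_transition(BASE, {"debug"}, allowlist, restrictive)
```

Note that the monotonicity check overrides the allowlist here: relaxing from the hardened context is refused even though the pair is listed and approved.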
Observability Challenges
Standard agent observability focuses on tool calls, token counts, and latency. CDAFs introduce a new dimension: context transitions and their impact on argument validity.
What to instrument:
- Context switch events: Timestamp, old context, new context, triggering condition.
- Argument validity changes: Which arguments flipped from accepted to rejected (or vice versa) after a switch.
- Relevance set size: How many dimensions the agent activated. A sudden drop may indicate manipulation.
- Re-evaluation latency: Time spent recomputing the defeat function and the accepted arguments after a switch.
Failure detection:
- Context thrashing: Agent switches contexts repeatedly without converging.
- Invalid context: Agent attempts to activate a context not in the schema.
- Inconsistent reasoning: Agent’s conclusions contradict the active context’s defeat function.
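A sketch of what such instrumentation might look like. The event fields mirror the list above; the record type, the helper names, and the thrashing heuristic (the same relevance set entered more than `max_visits` times within a recent window) are assumptions chosen for illustration.

```python
import time
from collections import Counter
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ContextSwitchEvent:
    timestamp: float
    old: frozenset
    new: frozenset
    trigger: str
    newly_accepted: frozenset
    no_longer_accepted: frozenset
    relevance_size: int
    reeval_seconds: float


def record_switch(old, new, trigger: str, accepted_before: set,
                  evaluate: Callable[[frozenset], set]) -> ContextSwitchEvent:
    """Re-evaluate under the new context and capture what changed."""
    new = frozenset(new)
    started = time.perf_counter()
    accepted_after = set(evaluate(new))
    elapsed = time.perf_counter() - started
    return ContextSwitchEvent(
        timestamp=time.time(), old=frozenset(old), new=new, trigger=trigger,
        newly_accepted=frozenset(accepted_after - set(accepted_before)),
        no_longer_accepted=frozenset(set(accepted_before) - accepted_after),
        relevance_size=len(new), reeval_seconds=elapsed,
    )


def is_thrashing(events: list, window: int = 6, max_visits: int = 2) -> bool:
    """Flag a run where some relevance set is re-entered too often recently."""
    recent = [event.new for event in events[-window:]]
    return any(count > max_visits for count in Counter(recent).values())


# Hypothetical evaluator and an agent that flips between two contexts.
def toy_evaluate(rho: frozenset) -> set:
    return {"allow"} if "staging" in rho else {"deny"}


events, current, accepted = [], frozenset({"production"}), {"deny"}
for target in ["staging", "production"] * 3:
    event = record_switch(current, {target}, "agent_request", accepted, toy_evaluate)
    events.append(event)
    current, accepted = event.new, toy_evaluate(event.new)
```

Shipping these events to the same sink as tool-call traces lets you correlate a flipped conclusion with the switch that caused it.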
Deployment Shape
CDAFs fit naturally into orchestrators that already manage state machines or workflow graphs. The defeat function is a policy engine. The relevance set is a configuration object. The agent’s context selection is a decision node.
Integration points:
- LangGraph: Add a context node that updates the defeat function. Use conditional edges to trigger re-evaluation.
- Temporal: Model context switches as signals. The workflow re-evaluates arguments in a new activity.
- Custom orchestrator: Implement the defeat function as a plugin. The agent calls a switch_context tool, the orchestrator updates the plugin state, and the next reasoning step uses the new function.
Deployment risks:
- Schema drift: Your context dimensions change, but the agent’s training data or prompt does not reflect the new schema.
- Multi-tenancy: Different tenants have different context schemas. The orchestrator must isolate defeat functions per tenant.
- Versioning: You update the defeat function logic, but existing conversations are mid-flight. Do you re-evaluate under the new logic or preserve the old?
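One possible answer to the multi-tenancy and versioning risks is to key defeat functions by tenant and version and to pin each conversation to the version that was current when it started. The sketch below shows that option only; `DefeatRegistry` and its methods are hypothetical names, and pinning is a design choice, not a recommendation from the paper.

```python
from typing import Callable

DefeatFn = Callable[[frozenset], set]


class DefeatRegistry:
    """Stores defeat functions per (tenant, version) and pins conversations to one."""

    def __init__(self) -> None:
        self._functions = {}   # (tenant, version) -> defeat function
        self._latest = {}      # tenant -> latest published version

    def publish(self, tenant: str, fn: DefeatFn) -> int:
        version = self._latest.get(tenant, 0) + 1
        self._functions[(tenant, version)] = fn
        self._latest[tenant] = version
        return version

    def start_conversation(self, tenant: str) -> tuple:
        # The returned pin is stored with the conversation state.
        return (tenant, self._latest[tenant])

    def defeat(self, pin: tuple, rho) -> set:
        # Lookups go through the pin, so tenants and versions cannot mix.
        return self._functions[pin](frozenset(rho))


registry = DefeatRegistry()
registry.publish("tenant_a", lambda rho: {("x", "y")} if "strict" in rho else set())
pin = registry.start_conversation("tenant_a")

# A new version ships while the conversation is mid-flight.
registry.publish("tenant_a", lambda rho: set())
old_logic = registry.defeat(pin, {"strict"})
new_logic = registry.defeat(registry.start_conversation("tenant_a"), {"strict"})
```

Pinning preserves consistency within a conversation at the cost of running old logic for longer; re-evaluating under the new version is the opposite trade and needs the argument-flip logging described earlier.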
Multi-Agent Variants
The paper leaves multi-agent variants open. In practice, you will have multiple agents with overlapping argument graphs and different context preferences.
Coordination patterns:
- Shared context: All agents see the same active context. One agent’s switch affects all others.
- Private context: Each agent maintains its own relevance set. Arguments are evaluated per-agent.
- Negotiated context: Agents propose context switches, and a coordinator resolves conflicts.
Failure modes:
- Context deadlock: Agent A wants context X, agent B wants context Y, neither can proceed.
- Context race: Two agents switch contexts simultaneously, producing inconsistent argument validity.
- Context leakage: Agent A’s context switch inadvertently affects agent B’s reasoning.
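For the shared-context pattern, the race in particular can be handled with optimistic concurrency: an agent may only switch the context if nobody else has switched it since the agent last read it. This is a generic compare-and-set sketch with hypothetical names, not a mechanism from the paper, which leaves multi-agent variants open.

```python
import threading


class SharedContext:
    """A shared relevance set guarded by a version counter (compare-and-set)."""

    def __init__(self, initial) -> None:
        self._lock = threading.Lock()
        self._rho = frozenset(initial)
        self._version = 0

    def read(self) -> tuple:
        with self._lock:
            return self._version, self._rho

    def propose(self, agent: str, new_rho, seen_version: int) -> bool:
        """Apply the switch only if the agent reasoned from the current version."""
        with self._lock:
            if seen_version != self._version:
                # Stale proposal: the agent must re-read and re-evaluate first.
                return False
            self._rho = frozenset(new_rho)
            self._version += 1
            return True


shared = SharedContext({"gdpr"})
version, _ = shared.read()

# Both agents read the same version, then both try to switch.
first = shared.propose("agent_a", {"ccpa"}, version)
second = shared.propose("agent_b", {"gdpr", "ccpa"}, version)
state = shared.read()
```

The losing agent is not blocked; it re-reads the context and decides again, which avoids both inconsistent validity and a hard deadlock. Private per-agent relevance sets remain the simpler option when leakage is the main concern.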
Technical Verdict
Use CDAFs when:
- You have well-defined evaluation regimes (compliance frameworks, security policies, business rules) and the agent must switch between them.
- Argument validity genuinely depends on external context, not just the arguments themselves.
- You can afford the computational cost of re-evaluation and the engineering cost of maintaining a defeat function schema.
- You need formal guarantees about which arguments are accepted under which contexts.
Avoid CDAFs when:
- Your context is static or changes infrequently. A simple configuration file is enough.
- Re-evaluation latency is unacceptable. If you have thousands of arguments and sub-second SLAs, incremental evaluation may not be fast enough.
- Your agent cannot reliably select contexts. If the model hallucinates or ignores the schema, the framework collapses.
- You do not have a formal argument graph. CDAFs assume you can enumerate arguments and attacks. If your reasoning is unstructured, this formalism does not apply.
The paper provides a formal foundation for multi-regime reasoning. The engineering challenge is mapping that formalism to your orchestration layer without introducing new failure modes. If you can solve that, you get agents that reason correctly under different lenses and can justify why they switched.