mech.app

Rocky's Branch-Replay SQL Engine: How Financial Agents Need Temporal Query Isolation

Rocky's git-like branching for SQL exposes a critical gap in financial agent infrastructure: reproducible query environments without polluting production.

Source: github.com
Financial agents that explore hypothetical trades, backtest strategies, or generate compliance reports face a plumbing problem: how do you let an agent run 15 chained SQL transformations, replay them at a specific timestamp, and audit every column’s lineage without polluting production state or losing the audit trail?

Rocky is a Rust SQL engine that treats queries like git commits. You branch, replay, and merge transformations while the engine tracks column-level lineage and applies per-environment masking. It ships as a static binary, a Dagster integration (dagster-rocky on PyPI), and a VS Code extension. It was built in one month, with releases shipped in public along the way, and it targets a gap in financial agent infrastructure: reproducible query environments.

The Branch Model for Query Isolation

Rocky’s core abstraction is the branch. Every SQL transformation creates a new branch pointer, not a mutated table. An agent exploring a portfolio rebalancing strategy can:

  1. Branch from main to agent/rebalance-2026-05-21
  2. Run 10 SQL transformations that join market data, risk models, and position snapshots
  3. Inspect the final state without touching production
  4. Replay the entire branch to yesterday’s market close to compare outcomes
  5. Merge back to main only if compliance checks pass

The isolation boundary is explicit. A Dagster pipeline calling Rocky sees branches as first-class resources. A human analyst in VS Code can inspect the same branch without locking or race conditions. The engine tracks which transformations ran, when, and on what input state.
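
To make the pointer idea concrete, here is a minimal in-memory sketch of branches as named pointers to immutable snapshots. It is an illustration under stated assumptions, not Rocky's storage layer: `Store`, `Snapshot`, `create_branch`, and `transform` are hypothetical names chosen for this example.

```rust
use std::collections::HashMap;
use std::rc::Rc;

// Hypothetical in-memory model, not Rocky's storage format.
// A snapshot is an immutable set of rows; a branch is a named pointer to one.
pub type Row = (String, f64); // (ticker, position size)
pub type Snapshot = Rc<Vec<Row>>;

pub struct Store {
    branches: HashMap<String, Snapshot>,
}

impl Store {
    /// Start with a single `main` branch holding `main_rows`.
    pub fn new(main_rows: Vec<Row>) -> Self {
        let mut branches = HashMap::new();
        branches.insert("main".to_string(), Rc::new(main_rows));
        Store { branches }
    }

    /// Point `name` at the snapshot `from` currently references. No rows are copied.
    pub fn create_branch(&mut self, name: &str, from: &str) -> Result<(), String> {
        let snapshot = self
            .branches
            .get(from)
            .cloned()
            .ok_or_else(|| format!("unknown branch: {from}"))?;
        self.branches.insert(name.to_string(), snapshot);
        Ok(())
    }

    /// Build a new snapshot from the current one and move only this branch's pointer.
    pub fn transform<F>(&mut self, branch: &str, f: F) -> Result<(), String>
    where
        F: Fn(&[Row]) -> Vec<Row>,
    {
        let current = self
            .branches
            .get(branch)
            .cloned()
            .ok_or_else(|| format!("unknown branch: {branch}"))?;
        let next = Rc::new(f(current.as_slice()));
        self.branches.insert(branch.to_string(), next);
        Ok(())
    }

    /// Read the rows a branch currently points at.
    pub fn rows(&self, branch: &str) -> Option<&[Row]> {
        self.branches.get(branch).map(|s| s.as_slice())
    }
}
```

Because a transformation never mutates a snapshot in place, halving every position on an agent branch leaves the rows visible through `main` unchanged.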

Replay semantics: Rocky replays a branch by re-executing every transformation from the branch point forward, using the data state at the target timestamp. If your market data feed has gaps or corrections, replay fails loudly rather than silently producing wrong results. You get a diff of missing or corrected rows, not a corrupted portfolio valuation.
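
The fail-loudly behavior can be sketched in a few lines. This is a simplified model, assuming a feed keyed by timestamp and a caller-supplied list of expected timestamps; `Feed`, `Step`, `ReplayError`, and `replay` are hypothetical names, not Rocky's API.

```rust
use std::collections::BTreeMap;

/// Hypothetical price feed: timestamp (epoch seconds) -> closing price.
pub type Feed = BTreeMap<u64, f64>;
/// A transformation step maps one series of prices to another.
pub type Step = fn(Vec<f64>) -> Vec<f64>;

#[derive(Debug, PartialEq)]
pub enum ReplayError {
    /// Expected timestamps that the feed does not contain.
    MissingRows(Vec<u64>),
}

/// Re-run every step on the feed as it looked at `as_of`.
/// `expected` lists the timestamps the branch needs; any that fall at or
/// before `as_of` but are absent from the feed abort the replay.
pub fn replay(
    feed: &Feed,
    expected: &[u64],
    steps: &[Step],
    as_of: u64,
) -> Result<Vec<f64>, ReplayError> {
    let needed: Vec<u64> = expected.iter().copied().filter(|t| *t <= as_of).collect();
    let missing: Vec<u64> = needed
        .iter()
        .copied()
        .filter(|t| !feed.contains_key(t))
        .collect();
    if !missing.is_empty() {
        // Report the gap instead of computing a result from partial data.
        return Err(ReplayError::MissingRows(missing));
    }
    let mut state: Vec<f64> = needed.iter().map(|t| feed[t]).collect();
    for step in steps {
        state = step(state);
    }
    Ok(state)
}

/// Example step: double every price.
pub fn double(v: Vec<f64>) -> Vec<f64> {
    v.into_iter().map(|x| x * 2.0).collect()
}

/// Example step: keep only values of at least 25.
pub fn drop_small(v: Vec<f64>) -> Vec<f64> {
    v.into_iter().filter(|x| *x >= 25.0).collect()
}
```

With a feed that is missing timestamp 3, a replay as of timestamp 2 succeeds, while a replay as of timestamp 4 returns `MissingRows([3])` rather than a valuation built on incomplete data.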

Column Lineage for Compliance Audits

When an agent chains 15 SQL transformations and one produces a compliance violation (say, a position that exceeds risk limits), you need to trace which input columns contributed to the violating output column.

Rocky’s lineage system tracks:

  • Source columns: Which raw tables and columns fed into each transformation
  • Transformation logic: The SQL expression that produced each output column
  • Governance metadata: An 8-field system that includes classification (PII, financial, public), per-environment masking rules, and access policies

If an agent’s final SELECT includes a customer_ssn column in a production report, Rocky flags it at compile time. The lineage graph shows that customer_ssn came from raw.customers, was classified as PII, and should have been masked in the prod environment.
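
A lineage check of this kind amounts to a graph walk from an output column back to its raw sources. The sketch below assumes a simple parent map per derived column; `Lineage`, `Class`, `sources_of`, and `pii_sources` are hypothetical names for illustration, and the intermediate column names are invented.

```rust
use std::collections::{BTreeSet, HashMap};

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Class {
    Pii,
    Financial,
    Public,
}

/// Hypothetical lineage graph: each derived column lists its direct parents.
/// Columns without a parent entry are treated as raw source columns.
pub struct Lineage {
    parents: HashMap<String, Vec<String>>,
    classes: HashMap<String, Class>,
}

impl Lineage {
    pub fn new() -> Self {
        Lineage { parents: HashMap::new(), classes: HashMap::new() }
    }

    /// Register a raw source column with its classification.
    pub fn source(&mut self, col: &str, class: Class) {
        self.classes.insert(col.to_string(), class);
    }

    /// Register a derived column and the columns it was computed from.
    pub fn derive(&mut self, col: &str, from: &[&str]) {
        let parents = from.iter().map(|c| c.to_string()).collect();
        self.parents.insert(col.to_string(), parents);
    }

    /// Walk parent edges until only raw source columns remain.
    pub fn sources_of(&self, col: &str) -> BTreeSet<String> {
        let mut sources = BTreeSet::new();
        let mut seen = BTreeSet::new();
        let mut stack = vec![col.to_string()];
        while let Some(current) = stack.pop() {
            if !seen.insert(current.clone()) {
                continue; // already visited
            }
            match self.parents.get(&current) {
                Some(ps) => stack.extend(ps.iter().cloned()),
                None => {
                    sources.insert(current);
                }
            }
        }
        sources
    }

    /// Raw PII columns that feed into `col`, directly or transitively.
    pub fn pii_sources(&self, col: &str) -> Vec<String> {
        self.sources_of(col)
            .into_iter()
            .filter(|s| self.classes.get(s) == Some(&Class::Pii))
            .collect()
    }
}
```

A compile-time check could then reject any prod query whose output columns have a non-empty `pii_sources` result and no masking rule.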

Per-environment masking: An agent testing a strategy on real PII in dev sees actual values. The same query deployed to prod sees hashed or redacted values. The SQL logic is identical; the masking layer injects transformations based on environment and column classification.
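
As a rough sketch of that masking layer, the function below picks a masking rule from the pair of environment and classification. The names `Env`, `Class`, and `mask` are assumptions for this example, and `DefaultHasher` stands in for whatever salted cryptographic hash a real deployment would use; it is not suitable for protecting PII.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Env {
    Dev,
    Prod,
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Class {
    Pii,
    Financial,
    Public,
}

/// Return the value a query would see for one cell, given the column's
/// classification and the environment the query runs in.
pub fn mask(value: &str, class: Class, env: Env) -> String {
    match (env, class) {
        (Env::Prod, Class::Pii) => {
            // Assumption: hashing is the prod rule for PII. DefaultHasher is
            // NOT cryptographic and only illustrates where the rule applies.
            let mut hasher = DefaultHasher::new();
            value.hash(&mut hasher);
            format!("h:{:016x}", hasher.finish())
        }
        // Every other combination passes the value through unchanged.
        _ => value.to_string(),
    }
}
```

The calling query never changes: the same column reference yields the raw value in dev and a 16-hex-digit token prefixed with `h:` in prod.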

Architecture: Typed Compiler and Adapter Layer

Rocky compiles SQL to a typed intermediate representation before execution. The compiler:

  • Validates column references and types across branches
  • Infers lineage by walking the AST of each transformation
  • Applies governance rules (classification, masking) at compile time
  • Generates execution plans that adapters translate to warehouse-specific SQL
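
The first of these steps, validating column references and types, can be illustrated with a toy typed IR. This is a sketch under assumptions: `Expr`, `Ty`, `CompileError`, and `type_of` are invented for the example, and the typing rules (text cannot be added, int plus float widens to float) are illustrative choices rather than Rocky's rules.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Ty {
    Int,
    Float,
    Text,
}

/// Hypothetical typed IR for a scalar expression.
pub enum Expr {
    Col(String),
    Lit(f64),
    Add(Box<Expr>, Box<Expr>),
}

#[derive(Debug, PartialEq)]
pub enum CompileError {
    UnknownColumn(String),
    TypeMismatch { left: Ty, right: Ty },
}

/// Resolve the type of `e` against `schema`, failing before any query runs.
pub fn type_of(e: &Expr, schema: &HashMap<String, Ty>) -> Result<Ty, CompileError> {
    match e {
        Expr::Col(name) => schema
            .get(name)
            .copied()
            .ok_or_else(|| CompileError::UnknownColumn(name.clone())),
        Expr::Lit(_) => Ok(Ty::Float),
        Expr::Add(l, r) => {
            let lt = type_of(l, schema)?;
            let rt = type_of(r, schema)?;
            match (lt, rt) {
                // Assumed rule: text never participates in arithmetic.
                (Ty::Text, _) | (_, Ty::Text) => {
                    Err(CompileError::TypeMismatch { left: lt, right: rt })
                }
                (Ty::Int, Ty::Int) => Ok(Ty::Int),
                // Assumed rule: mixing int and float widens to float.
                _ => Ok(Ty::Float),
            }
        }
    }
}
```

The same recursive walk over the expression tree is where a compiler could also collect the column names it visits, which is one way lineage inference might fall out of type checking.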

Adapters connect Rocky to Databricks, Snowflake, BigQuery, and DuckDB. Each adapter translates Rocky’s execution plan to the warehouse’s SQL dialect and handles result streaming. The adapter layer isolates Rocky’s branch semantics from warehouse-specific quirks (transaction isolation levels, temp table lifetimes, query cost accounting).
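
One way to picture the adapter boundary is a trait that owns everything dialect-specific, with the plan itself staying neutral. The sketch below uses identifier quoting as the dialect difference (DuckDB quotes identifiers with double quotes, BigQuery with backticks). `Plan`, `Adapter`, `DuckDb`, and `BigQuery` are hypothetical types for this example, and escaping of quote characters inside identifiers is omitted.

```rust
/// Hypothetical warehouse-neutral plan step: SELECT columns FROM table.
pub struct Plan {
    pub table: String,
    pub columns: Vec<String>,
}

/// Each adapter supplies its dialect's quoting; rendering is shared.
pub trait Adapter {
    /// Quote an identifier. Escaping embedded quote characters is omitted.
    fn quote(&self, ident: &str) -> String;

    /// Translate the neutral plan into SQL for this warehouse.
    fn render(&self, plan: &Plan) -> String {
        let cols: Vec<String> = plan.columns.iter().map(|c| self.quote(c)).collect();
        format!("SELECT {} FROM {}", cols.join(", "), self.quote(&plan.table))
    }
}

pub struct DuckDb;
pub struct BigQuery;

impl Adapter for DuckDb {
    fn quote(&self, ident: &str) -> String {
        format!("\"{}\"", ident)
    }
}

impl Adapter for BigQuery {
    fn quote(&self, ident: &str) -> String {
        format!("`{}`", ident)
    }
}
```

Branch logic that only ever produces a `Plan` does not need to know which warehouse will run it.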

Cost tracking: Rocky tracks per-model cost by instrumenting adapter calls. When an agent runs a branch with 10 transformations, you see the cost breakdown by model and warehouse. This matters for financial agents that run thousands of backtests: you need to know which transformations are expensive before you scale.
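
A per-model cost breakdown can be as simple as a ledger that each adapter call reports into. The sketch assumes cost is measured in bytes scanned; `CostLedger`, `record`, and `most_expensive` are hypothetical names, and the model names in the usage note are made up.

```rust
use std::collections::BTreeMap;

/// Hypothetical ledger: model name -> bytes scanned across all its queries.
#[derive(Default)]
pub struct CostLedger {
    by_model: BTreeMap<String, u64>,
}

impl CostLedger {
    /// Called once per adapter query with the cost the warehouse reported.
    pub fn record(&mut self, model: &str, bytes_scanned: u64) {
        *self.by_model.entry(model.to_string()).or_insert(0) += bytes_scanned;
    }

    /// Total cost of the branch run.
    pub fn total(&self) -> u64 {
        self.by_model.values().sum()
    }

    /// The model to look at first before scaling to thousands of backtests.
    pub fn most_expensive(&self) -> Option<(&str, u64)> {
        self.by_model
            .iter()
            .max_by_key(|(_, cost)| **cost)
            .map(|(model, cost)| (model.as_str(), *cost))
    }
}
```

Recording 500 and 300 bytes for a `join_market_data` model and 1200 for a `risk_model` model gives a total of 2000 and singles out `risk_model` as the most expensive step.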

Dagster Integration for Multi-Step Pipelines

The dagster-rocky package exposes Rocky branches as Dagster assets. A typical financial agent pipeline:

  1. Ingest: Dagster asset pulls market data from an API, writes to Rocky branch ingest/2026-05-21
  2. Transform: Dagster asset runs SQL transformations on the ingest branch, creates transform/2026-05-21
  3. Validate: Dagster asset runs compliance checks, creates validate/2026-05-21
  4. Merge: If validation passes, Dagster merges validate/2026-05-21 to main

Each step is a separate Dagster asset with explicit dependencies. If validation fails, the pipeline stops and the agent logs the lineage of the violating column. The branch never touches main.
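
The gating logic of that pipeline is independent of the orchestrator, so it can be sketched without Dagster. The example below models the transform, validate, and merge steps as plain functions; `Position`, `transform`, `validate`, and `run_pipeline` are hypothetical, and the exposure limit is an invented compliance rule.

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct Position {
    pub ticker: String,
    pub exposure: f64,
}

/// Transform step: drop positions with zero exposure.
pub fn transform(ingested: &[Position]) -> Vec<Position> {
    ingested.iter().filter(|p| p.exposure != 0.0).cloned().collect()
}

/// Validate step: fail on the first position whose exposure exceeds `limit`.
pub fn validate(rows: &[Position], limit: f64) -> Result<(), String> {
    match rows.iter().find(|p| p.exposure.abs() > limit) {
        Some(p) => Err(format!("{} exceeds exposure limit", p.ticker)),
        None => Ok(()),
    }
}

/// Run transform, validate, and merge in order.
/// `main_branch` is replaced only after validation succeeds.
pub fn run_pipeline(
    main_branch: &mut Vec<Position>,
    ingested: Vec<Position>,
    limit: f64,
) -> Result<(), String> {
    let transformed = transform(&ingested);
    validate(&transformed, limit)?; // early return leaves main_branch untouched
    *main_branch = transformed; // merge
    Ok(())
}
```

In an orchestrated version each function would be its own asset, but the property is the same: a validation error stops the run before the merge step executes.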

State management: Dagster tracks which branches exist and which have been merged. Rocky tracks the data state within each branch. The integration keeps these two state machines in sync: if a Dagster run fails, Rocky rolls back the branch; if a Rocky replay fails, Dagster marks the asset as failed.

Failure Modes and Observability

Rocky’s branch model introduces new failure modes:

| Failure Mode | Symptom | Mitigation |
| --- | --- | --- |
| Branch divergence | Agent branch and production branch have conflicting schemas | Rocky’s typed compiler catches schema mismatches at merge time |
| Replay gaps | Market data feed has missing timestamps | Replay fails with explicit diff of missing rows |
| Lineage explosion | 100-step transformation chain produces unreadable lineage graph | Rocky prunes intermediate columns that don’t contribute to final output |
| Adapter timeout | Warehouse query exceeds timeout during replay | Rocky streams results incrementally; adapter retries with backoff |
| Merge conflict | Two agents try to merge conflicting transformations to main | Rocky uses optimistic locking; second merge fails with conflict details |
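
The merge-conflict row relies on optimistic locking, which can be sketched as a version check at merge time. This is a generic illustration of the technique, assuming a single version counter on main; `MainBranch`, `MergeError`, and `merge` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
pub enum MergeError {
    /// Main moved since the branch was created.
    Conflict { expected: u64, actual: u64 },
}

/// Hypothetical main branch with a version counter bumped on every merge.
pub struct MainBranch {
    version: u64,
    tables: Vec<String>,
}

impl MainBranch {
    pub fn new() -> Self {
        MainBranch { version: 0, tables: Vec::new() }
    }

    /// The version an agent records when it branches from main.
    pub fn version(&self) -> u64 {
        self.version
    }

    pub fn tables(&self) -> &[String] {
        &self.tables
    }

    /// Merge succeeds only if main is still at the version the branch saw.
    pub fn merge(&mut self, based_on: u64, new_tables: Vec<String>) -> Result<u64, MergeError> {
        if based_on != self.version {
            return Err(MergeError::Conflict { expected: based_on, actual: self.version });
        }
        self.tables.extend(new_tables);
        self.version += 1;
        Ok(self.version)
    }
}
```

If two agents both branch at version 0, the first merge moves main to version 1 and the second fails with `Conflict { expected: 0, actual: 1 }`, leaving main unchanged.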

Observability: Rocky logs every branch operation (create, replay, merge) with timestamps, input state hashes, and output row counts. The Dagster integration surfaces these logs as asset metadata. You can trace an agent’s entire query history from a single Dagster run ID.

Security Boundaries

Rocky’s governance system enforces security at three layers:

  1. Compile-time classification: Columns are classified (PII, financial, public) in schema definitions. The compiler rejects queries that leak classified columns to unauthorized environments.
  2. Per-environment masking: The adapter layer injects masking transformations based on environment. An agent in dev sees real PII; the same agent in prod sees hashed values.
  3. Branch-level access control: Rocky can restrict which users or agents can create, replay, or merge branches. A junior analyst can create branches but not merge to main; a compliance agent can replay any branch but not modify it.
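
The branch-level rules in the third layer reduce to a small permission table. The sketch below encodes only the two examples given above and denies everything else by default, which is an assumption of this example; `Role`, `Action`, `allowed`, and `authorize` are hypothetical names.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Action {
    Create,
    Replay,
    Merge,
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Role {
    JuniorAnalyst,
    ComplianceAgent,
}

/// Permission table. Anything not listed is denied (assumed default).
pub fn allowed(role: Role, action: Action) -> bool {
    matches!(
        (role, action),
        (Role::JuniorAnalyst, Action::Create) | (Role::ComplianceAgent, Action::Replay)
    )
}

/// Guard to call before executing a branch operation.
pub fn authorize(role: Role, action: Action) -> Result<(), String> {
    if allowed(role, action) {
        Ok(())
    } else {
        Err(format!("{:?} may not perform {:?}", role, action))
    }
}
```

Under this table a junior analyst can create a branch but gets an error on merge, and a compliance agent can replay but cannot create or merge.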

Audit trail: Every branch operation is logged with the user or agent ID, the SQL transformations, and the input/output state hashes. If an agent produces a compliance violation, you can replay the exact branch state and inspect the lineage.
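
To show how state hashes make a replay checkable, here is a minimal audit-log sketch. It assumes rows are hashed in order with the standard library's `DefaultHasher`, which is fine for illustration but not a cryptographic commitment; `AuditEntry`, `state_hash`, `record`, and `matches_replay` are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical row shape: (ticker, quantity).
pub type Row = (String, i64);

/// Hash a table state. Row order matters. DefaultHasher is not cryptographic.
pub fn state_hash(rows: &[Row]) -> u64 {
    let mut hasher = DefaultHasher::new();
    rows.hash(&mut hasher);
    hasher.finish()
}

#[derive(Clone, Debug, PartialEq)]
pub struct AuditEntry {
    pub actor: String,
    pub operation: String,
    pub sql: String,
    pub input_hash: u64,
    pub output_hash: u64,
}

/// Append one entry describing a branch operation.
pub fn record(
    log: &mut Vec<AuditEntry>,
    actor: &str,
    operation: &str,
    sql: &str,
    input: &[Row],
    output: &[Row],
) {
    log.push(AuditEntry {
        actor: actor.to_string(),
        operation: operation.to_string(),
        sql: sql.to_string(),
        input_hash: state_hash(input),
        output_hash: state_hash(output),
    });
}

/// A replay reproduces the logged run if both hashes match.
pub fn matches_replay(entry: &AuditEntry, input: &[Row], output: &[Row]) -> bool {
    entry.input_hash == state_hash(input) && entry.output_hash == state_hash(output)
}
```

During an audit, re-running the logged SQL on the logged input and comparing hashes tells you whether the replay reproduced the original state or diverged from it.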

Code Example: Branching and Replay

```rust
// Fragment: assumes a `rocky` client handle and a `validate_compliance`
// helper are in scope, inside a function that returns a Result.

// Create a branch for an agent's exploratory query
let branch = rocky.create_branch("agent/rebalance-2026-05-21", "main")?;

// Run SQL transformations on the branch
rocky.execute(
    &branch,
    "CREATE TABLE positions_adjusted AS
     SELECT * FROM positions
     WHERE risk_score < 0.8"
)?;

// Replay the branch to yesterday's market close
let replay_branch = rocky.replay(
    &branch,
    Timestamp::parse("2026-05-20T16:00:00Z")?
)?;

// Compare outcomes
let diff = rocky.diff(&branch, &replay_branch)?;
println!("Position changes: {:?}", diff);

// Merge to main if validation passes
if validate_compliance(&branch)? {
    rocky.merge(&branch, "main")?;
}
```

The replay call re-executes every transformation from the branch point using the data state at the target timestamp. If the market data feed has gaps, replay returns an error with the missing row details.

Trade-Offs: When Branch Semantics Help and When They Hurt

| Scenario | Branch Model Helps | Branch Model Hurts |
| --- | --- | --- |
| Agent explores 50 hypothetical trades | Isolates exploration from production; easy rollback | Replay cost scales with transformation count |
| Compliance audit of agent’s past decisions | Full lineage and replay to exact timestamp | Storage cost for branch history |
| Multi-agent collaboration on shared dataset | Each agent gets isolated branch; explicit merge | Merge conflicts require manual resolution |
| Real-time trade execution | Branch overhead is acceptable for audit trail | Latency-sensitive paths may need direct writes |

Rocky’s branch model trades execution overhead for reproducibility. If your agent needs to replay a 100-step transformation chain, you pay the cost of re-executing every step. If your agent needs sub-millisecond trade execution, branching adds latency.

Technical Verdict

Use Rocky when:

  • Your financial agents need to explore hypothetical scenarios without polluting production data
  • You need to replay an agent’s query history to a specific timestamp for compliance audits
  • Your agents chain multiple SQL transformations and you need column-level lineage for debugging
  • You’re integrating with Dagster or another orchestrator that treats data transformations as assets

Avoid Rocky when:

  • Your agents need sub-millisecond query latency for real-time trade execution
  • Your transformation chains are short (1-3 steps) and lineage is obvious
  • Your warehouse already provides branch semantics (e.g., Snowflake Time Travel with custom tooling)
  • Your agents only read data and never write exploratory transformations

Rocky exposes a gap in financial agent infrastructure: most warehouses treat queries as stateless operations, but agents need stateful, reproducible query environments. The branch model gives you isolation, replay, and lineage at the cost of execution overhead. For agents that explore, backtest, and audit, that trade-off is worth it.