Trigger.dev V2: What a Temporal Alternative for TypeScript Reveals About Durable Execution Plumbing

Trigger.dev launched in February 2023 as a “developer-first Zapier alternative” and earned 745 Hacker News points. Eight months later, the team shipped V2 as a “Temporal alternative for TypeScript” (172 points). That pivot exposes the architectural gap between event-driven automation and durable execution, and it matters for anyone building agent systems that need guaranteed completion across retries, crashes, and long-running operations.

The shift was driven by user feedback. Developers wanted workflows that survive failures, not just webhook routing. That requirement reshapes state management, retry semantics, and execution guarantees.

What Changed Between V1 and V2

V1 architecture:

Event triggers (webhooks, schedules, API calls)
Stateless handlers
No built-in retry or durability
Zapier-style connector model

V2 architecture:

Durable task execution
Persistent state across retries
Execution guarantees (at-least-once, idempotency boundaries)
TypeScript-native workflow definitions

The V2 model positions Trigger.dev against Temporal, which uses event sourcing in Go with language-specific SDKs. Trigger.dev bets on TypeScript end-to-end, which simplifies the stack but constrains runtime flexibility.

Durable Execution Plumbing

Durable execution means a task can pause, crash, retry, and resume without losing progress. The system must:

Persist state at checkpoints so retries start from the last known good position
Guarantee idempotency so duplicate executions don’t corrupt state
Handle partial failures where some steps succeed and others fail
Provide observability into execution history and current state
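
The requirements above can be made concrete with a small in-memory sketch. It does not reflect how Trigger.dev or Temporal implement durability: runStep, the Checkpoints map, and the three step names are hypothetical, the steps are synchronous for brevity, and a real system would persist checkpoints in a database.

```typescript
// Minimal sketch of checkpoint-based durability (illustrative only).
// Assumption: checkpoints live in memory; a real system would persist them.
type Checkpoints = Map<string, unknown>;

// Run a step at most once per run: if a checkpoint exists for the key,
// return the stored result instead of executing the step again.
function runStep<T>(store: Checkpoints, key: string, step: () => T): T {
  if (store.has(key)) {
    return store.get(key) as T;
  }
  const result = step();
  store.set(key, result); // checkpoint only after the step succeeds
  return result;
}

// Hypothetical three-step pipeline; the parse step fails on its first call.
const calls = { download: 0, parse: 0, store: 0 };

function pipeline(store: Checkpoints): string {
  const file = runStep(store, "download", () => {
    calls.download += 1;
    return "raw-bytes";
  });
  const parsed = runStep(store, "parse", () => {
    calls.parse += 1;
    if (calls.parse === 1) throw new Error("transient parse failure");
    return file.toUpperCase();
  });
  return runStep(store, "store", () => {
    calls.store += 1;
    return `stored:${parsed}`;
  });
}

// Retry the whole pipeline against the same checkpoint store.
function runWithRetries(maxAttempts: number): { output: string; attempts: number } {
  const store: Checkpoints = new Map();
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return { output: pipeline(store), attempts: attempt };
    } catch (err) {
      if (attempt === maxAttempts) throw err;
    }
  }
  throw new Error("no attempts were made");
}

const result = runWithRetries(3);
console.log(result, calls);
```

The second attempt skips the download because its result is already checkpointed, so only the failed step runs again. The retry history (two attempts, one failure) is the kind of data an observability layer would surface.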

Trigger.dev implements this with:

Task definitions that declare retry policies, timeouts, and concurrency limits
Checkpointing at explicit wait points (such as waiting on another task or a timed delay) and at tool call boundaries for AI agents
Queue-based execution with visibility into pending, running, and failed tasks
Real-time connections for frontend apps to subscribe to task progress

Here’s a minimal task with retry semantics:

import { task } from "@trigger.dev/sdk/v3";

export const processDocument = task({
  id: "process-document",
  retry: {
    maxAttempts: 5,
    factor: 2,
    minTimeoutInMs: 1000,
    maxTimeoutInMs: 60000,
  },
  run: async (payload: { url: string }) => {
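    // Step 1: Download
    const file = await downloadFile(payload.url);

    // Step 2: Parse
    const parsed = await parseDocument(file);

    // Step 3: Store
    return await storeResult(parsed);
  },
});

Here downloadFile, parseDocument, and storeResult are placeholder helpers. If parseDocument throws, the attempt fails and the retry policy schedules another attempt with exponential backoff, up to five attempts in total. One caveat: with the v3 SDK imported here, a retry re-runs the run function from the top, so the file is downloaded again unless each step is idempotent or is moved into its own task that the parent waits on. A plain await is not a checkpoint by itself. The attempt history is stored, so you can inspect which attempt failed and why.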
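
To see what the retry block above implies for timing, here is a rough sketch of an exponential backoff schedule. The field names mirror the retry options in the example, but the formula is an assumption (plain exponential growth without jitter); the platform may randomize the actual delays.

```typescript
// Sketch of an exponential backoff schedule without jitter.
// Assumption: delay = minTimeoutInMs * factor^(retry - 1), capped at maxTimeoutInMs.
interface RetryPolicy {
  maxAttempts: number;
  factor: number;
  minTimeoutInMs: number;
  maxTimeoutInMs: number;
}

// Delay before retry number `retry` (1-based).
function delayBeforeRetry(policy: RetryPolicy, retry: number): number {
  const raw = policy.minTimeoutInMs * Math.pow(policy.factor, retry - 1);
  return Math.min(raw, policy.maxTimeoutInMs);
}

// maxAttempts counts the first run, so there are maxAttempts - 1 retries.
function backoffSchedule(policy: RetryPolicy): number[] {
  const delays: number[] = [];
  for (let retry = 1; retry < policy.maxAttempts; retry++) {
    delays.push(delayBeforeRetry(policy, retry));
  }
  return delays;
}

const schedule = backoffSchedule({
  maxAttempts: 5,
  factor: 2,
  minTimeoutInMs: 1000,
  maxTimeoutInMs: 60000,
});
console.log(schedule); // [1000, 2000, 4000, 8000]
```

Under these assumptions, five attempts spend about 15 seconds waiting in total. The 60-second cap only starts to matter from the seventh retry onward.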

State Persistence vs. Event Sourcing

Temporal uses event sourcing: every state transition is recorded as an immutable event, and replaying the event log reconstructs the workflow state. This gives strong consistency and audit trails, but self-hosting means operating the Temporal server (written in Go) plus external storage (Cassandra, PostgreSQL, or MySQL).

Trigger.dev uses checkpoint-based persistence: the system snapshots state at defined points and stores it in a managed database. This is simpler to operate but less granular. You lose inter-checkpoint visibility unless you add explicit logging.

Aspect	Temporal	Trigger.dev
State recovery	Replay full event log	Restore from last checkpoint
Audit granularity	Every state transition	Explicit checkpoint boundaries
Storage model	External (Cassandra, Postgres)	Managed (platform-provided)
Language runtime	Go core + SDKs	TypeScript end-to-end
Operational complexity	High (cluster, storage, workers)	Low (managed platform)
Execution visibility	Complete history	Checkpoint + logs
Cost model	Self-hosted infrastructure + ops team	Managed SaaS (see pricing page for current rates)

For agent systems, the trade-off is between operational overhead and execution transparency. If you need to debug why an LLM made a specific tool call, event sourcing gives you the full conversation history. Checkpoints give you “before” and “after” snapshots.
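
The difference between the two recovery models can be sketched in a few lines. The event and state types below are invented for illustration and do not mirror the storage format of either Temporal or Trigger.dev.

```typescript
// Illustrative contrast between event-sourced recovery and snapshot recovery.
// Assumption: WorkflowEvent and WorkflowState are hypothetical shapes.
type WorkflowEvent =
  | { type: "StepCompleted"; step: string; output: string }
  | { type: "StepFailed"; step: string; error: string };

interface WorkflowState {
  completed: Record<string, string>;
  failures: number;
}

// Event sourcing: fold the full log into the current state.
function replay(events: WorkflowEvent[]): WorkflowState {
  const state: WorkflowState = { completed: {}, failures: 0 };
  for (const event of events) {
    if (event.type === "StepCompleted") {
      state.completed[event.step] = event.output;
    } else {
      state.failures += 1;
    }
  }
  return state;
}

// Checkpointing: only the latest snapshot is kept, so restoring is a copy.
function restore(snapshot: WorkflowState): WorkflowState {
  return { completed: { ...snapshot.completed }, failures: snapshot.failures };
}

const log: WorkflowEvent[] = [
  { type: "StepCompleted", step: "download", output: "file-1" },
  { type: "StepFailed", step: "parse", error: "timeout" },
  { type: "StepCompleted", step: "parse", output: "doc-1" },
];

const fromLog = replay(log);
const fromSnapshot = restore({
  completed: { download: "file-1", parse: "doc-1" },
  failures: 1,
});

// Both paths recover the same state, but only the log can still answer
// "what error did parse hit before it succeeded?"
const parseErrors = log.filter((e) => e.type === "StepFailed" && e.step === "parse");
console.log(fromLog, fromSnapshot, parseErrors.length);
```

Restoring a snapshot is cheaper than replaying a long log, which is the operational side of the trade-off. The cost is that anything not written into the snapshot or the logs is unavailable afterwards.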

Execution Guarantees and Idempotency

Trigger.dev provides at-least-once execution: a task will run to completion or fail permanently, but it may run multiple times if retries occur. This requires idempotent operations.

Idempotency boundaries are where the system guarantees no duplicate side effects:

Tool calls in AI agents: Each tool invocation gets a unique ID. If the agent crashes mid-execution, replaying the workflow skips already-completed tool calls.
External API calls: Wrap calls in idempotency keys (e.g., Stripe’s Idempotency-Key header).
Database writes: Use upserts or conditional inserts based on task run IDs.

Example with idempotent payment processing:

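import { task } from "@trigger.dev/sdk/v3";
import Stripe from "stripe";

// Assumes STRIPE_SECRET_KEY is set; `db` is a placeholder for your own data layer
const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!);

export const processPayment = task({
  id: "process-payment",
  run: async ({ orderId, amount }: { orderId: string; amount: number }, { ctx }) => {
    // The run ID stays the same across retry attempts, so the key is stable
    const idempotencyKey = `${orderId}-${ctx.run.id}`;

    // Stripe won't charge twice with the same key. The key is a request option
    // (sent as the Idempotency-Key header), not a field in the charge body
    const charge = await stripe.charges.create(
      { amount, currency: "usd", source: "tok_visa" },
      { idempotencyKey },
    );

    // Database write uses orderId as unique constraint
    await db.orders.upsert({
      where: { id: orderId },
      update: { chargeId: charge.id, status: "paid" },
      create: { id: orderId, chargeId: charge.id, status: "paid" },
    });

    return { chargeId: charge.id };
  },
});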

If the task crashes after the Stripe charge but before the database write, the retry sends the same idempotency key, so Stripe returns the original charge instead of creating a second one, and the task then completes the database update.
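
The crash-then-retry scenario can be simulated without Stripe or a database. In this sketch FakePayments, the in-memory orders map, and processPaymentOnce are hypothetical stand-ins, and the "crash" is just a thrown error between the two side effects.

```typescript
// Simulation of a crash between the charge and the database write.
// Assumption: FakePayments mimics a provider that honors idempotency keys.
interface Charge {
  id: string;
  amount: number;
}

class FakePayments {
  private byKey = new Map<string, Charge>();
  chargesCreated = 0;

  // The same key returns the original charge instead of creating a new one.
  charge(amount: number, idempotencyKey: string): Charge {
    const existing = this.byKey.get(idempotencyKey);
    if (existing) return existing;
    this.chargesCreated += 1;
    const charge: Charge = { id: `ch_${this.chargesCreated}`, amount };
    this.byKey.set(idempotencyKey, charge);
    return charge;
  }
}

const payments = new FakePayments();
const orders = new Map<string, { chargeId: string; status: string }>();

function processPaymentOnce(
  orderId: string,
  runId: string,
  amount: number,
  crashAfterCharge: boolean,
): string {
  const charge = payments.charge(amount, `${orderId}-${runId}`);
  if (crashAfterCharge) throw new Error("worker crashed before the database write");
  orders.set(orderId, { chargeId: charge.id, status: "paid" }); // upsert keyed by orderId
  return charge.id;
}

// Attempt 1 crashes after charging; attempt 2 reuses the same run ID and key.
let firstAttemptFailed = false;
try {
  processPaymentOnce("order-42", "run_abc", 1999, true);
} catch {
  firstAttemptFailed = true;
}
const chargeId = processPaymentOnce("order-42", "run_abc", 1999, false);
console.log(firstAttemptFailed, chargeId, payments.chargesCreated); // true "ch_1" 1
```

The key design choice is that the idempotency key is derived from values that are identical on every attempt. A key that included an attempt number or a timestamp would defeat the deduplication.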

Queue vs. Workflow Orchestration

Trigger.dev routes tasks through queues with concurrency limits. This is different from Temporal’s workflow orchestration model, where workflows spawn child workflows and activities.

Trigger.dev’s queue model:

Tasks are independent units
Concurrency controlled per queue
No declarative parent-child workflow tree; any coordination between tasks is written in task code
Simpler mental model for isolated jobs

Temporal’s workflow model:

Workflows compose into trees
Parent workflows wait for children
Activities are leaf nodes (actual work)
More expressive for complex orchestration

For AI agents, the queue model works when each agent run is independent. If you need hierarchical orchestration (e.g., a coordinator agent spawning specialist agents), you either implement it in application code or use Temporal’s native workflow composition.

Observability and Failure Modes

Trigger.dev provides:

Real-time task monitoring: See running, queued, and failed tasks
Execution traces: Step-by-step logs with timestamps
Retry history: Which attempts failed and why
Real-time subscriptions: Frontend apps can listen to task progress via WebSockets

Common failure modes:

Non-idempotent side effects: Retries cause duplicate charges, emails, or API calls. Fix: Add idempotency keys.
Checkpoint bloat: Storing large objects at every checkpoint slows recovery. Fix: Store references, not full payloads.
Timeout mismatches: Task timeout shorter than LLM response time. Fix: Set timeouts above worst-case latency.
Concurrency limits: Queue backs up when tasks run slower than arrival rate. Fix: Scale workers or increase concurrency.
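
The concurrency failure mode lends itself to back-of-the-envelope arithmetic. The sketch below assumes a steady arrival rate and a fixed average task duration, which real workloads rarely have; QueueLoad and the function names are made up for illustration.

```typescript
// Rough queue model: steady arrivals, fixed average task duration.
// Assumption: every concurrency slot is always busy when work is waiting.
interface QueueLoad {
  arrivalsPerSecond: number;
  avgTaskSeconds: number;
  concurrencyLimit: number;
}

// Maximum tasks per second the queue can finish at the given concurrency.
function throughputPerSecond(load: QueueLoad): number {
  return load.concurrencyLimit / load.avgTaskSeconds;
}

// A positive result means the backlog grows by that many tasks per second.
function backlogGrowthPerSecond(load: QueueLoad): number {
  return Math.max(0, load.arrivalsPerSecond - throughputPerSecond(load));
}

// Smallest concurrency limit that keeps up with arrivals.
function requiredConcurrency(load: QueueLoad): number {
  return Math.ceil(load.arrivalsPerSecond * load.avgTaskSeconds);
}

// Example: 2 agent runs per second, 30 seconds each, 10 concurrent slots.
const load: QueueLoad = { arrivalsPerSecond: 2, avgTaskSeconds: 30, concurrencyLimit: 10 };
const growth = backlogGrowthPerSecond(load);
const needed = requiredConcurrency(load);
console.log(growth.toFixed(2), needed); // "1.67" 60
```

With these numbers the queue finishes one task every three seconds while two arrive every second, so it never catches up until concurrency reaches 60 or tasks get faster.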

To mitigate checkpoint bloat, store references instead of full payloads:

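import { task } from "@trigger.dev/sdk/v3";

// downloadFile, uploadToS3, and parseDocument are placeholder helpers
export const processLargeFile = task({
  id: "process-large-file",
  run: async (payload: { fileUrl: string }) => {
    // Download and upload to S3, store reference only
    const file = await downloadFile(payload.fileUrl);
    const s3Url = await uploadToS3(file);

    // Later steps receive the URL, not the file buffer
    const parsed = await parseDocument(s3Url);

    // Keep the returned output small: upload `parsed` too if it is large
    return { resultUrl: s3Url, parsed };
  },
});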

Deployment Shape

Trigger.dev is a managed platform. You deploy tasks by pushing code to the platform, which handles:

Worker provisioning
Queue management
State storage
Retry scheduling

Self-hosting is possible but requires running the full stack (API server, workers, database, queue). The managed option removes operational overhead but locks you into the platform. For self-hosting, expect to provision at minimum: a PostgreSQL instance, Redis for queues, and worker nodes with auto-scaling. The operational cost resembles running a small Kubernetes cluster with stateful services.

Temporal requires running a cluster (server, workers, storage). You control the infrastructure but pay the operational cost. A production Temporal deployment typically needs a three-node server cluster, Cassandra or PostgreSQL with replication, and separate worker pools per task queue.

Technical Verdict

Use Trigger.dev if:

You’re TypeScript-native and don’t need polyglot workflows
Tasks are mostly independent without deep parent-child orchestration
You want managed infrastructure and can accept platform lock-in
Checkpoint-level visibility is sufficient for debugging and compliance: you get before/after snapshots at explicit checkpoint boundaries rather than a record of every state transition. For most AI agent workflows (tool calling, multi-step reasoning), this granularity is adequate.
You’re building AI agents with tool calling and need fast iteration

Use Temporal if:

You need multi-language support (Go, Java, Python, .NET) across the activities and child workflows that make up one workflow
Workflows require complex hierarchies with parent-child coordination
Full event sourcing is critical for audit trails, deterministic replay, forensic debugging, or regulatory compliance. The event log records every state transition, whereas checkpoint-based recovery only captures snapshots at defined boundaries.
You have the team to operate distributed systems and want infrastructure control

Avoid Trigger.dev when:

Workflows span multiple services in different languages
You require on-premise deployment with zero external dependencies
Compliance mandates complete execution history at sub-checkpoint granularity
You need long-running processes (weeks or months) that must survive code deployments, platform updates, and schema migrations without losing in-flight state or requiring manual intervention

Trigger.dev V2 shows that durable execution doesn’t require event sourcing or a polyglot runtime. The TypeScript-native approach lowers the barrier for teams building agent systems that need retry guarantees without Temporal’s operational complexity. The checkpoint model trades granular execution history for simplicity, and for workflows such as tool calling, multi-step reasoning, and human-in-the-loop approvals, that is usually an acceptable trade.

The real insight is that durable execution is a spectrum. Trigger.dev sits between stateless webhooks (Zapier) and full workflow orchestration (Temporal), establishing a distinct position for TypeScript teams that need guaranteed completion without running a distributed systems cluster.