Trigger.dev V2: Durable Execution Plumbing for TypeScript Workflows

Trigger.dev launched in February 2023 as a “developer-first Zapier alternative” and earned 745 points on Hacker News. Eight months later, the team shipped V2 with a hard pivot: it now positions itself as a “Temporal alternative for TypeScript.” That shift from event-driven automation to durable execution reveals what agent builders actually need when workflows span hours, make dozens of API calls, and must survive transient failures without custom retry logic.

The V2 repositioning (172 points, October 2023) signals a real infrastructure gap. Developers writing multi-step agent orchestration don’t want visual workflow builders. They want execution guarantees that survive process crashes, network partitions, and rate-limit backoffs without losing state or duplicating side effects.

What Durable Execution Solves

Traditional job queues (Bull, BullMQ, Celery) handle discrete tasks. You enqueue work, a worker picks it up, and it either succeeds or fails. If it fails, you retry the entire task. State lives in Redis or Postgres, and you write custom logic to track progress across steps.

Durable execution engines flip this model. The workflow definition itself becomes the source of truth. The platform persists execution state after every step, so when a task crashes mid-flight, the runtime replays from the last checkpoint instead of starting over. This matters for agent workflows that:

Call LLMs with non-deterministic outputs you need to preserve
Hit rate-limited APIs where retrying the entire flow wastes quota
Run for hours or days (research agents, multi-stage data pipelines)
Require human-in-the-loop approvals that pause execution

Temporal pioneered this pattern in Go and Java. Trigger.dev brings it to TypeScript with a simpler mental model and tighter integration with the Node.js ecosystem.
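
The checkpoint-and-replay idea can be illustrated with a toy in-memory sketch. This is not how Trigger.dev or Temporal implement it: `Checkpoints`, `step`, `runWithOneRetry`, and `demo` are hypothetical names, and a real engine would persist step results durably rather than in a `Map`.

```typescript
// Hypothetical in-memory checkpoint store; a real engine persists this durably.
type Checkpoints = Map<string, unknown>;

// Runs fn once per key. On replay, the recorded result is returned instead.
async function step<T>(
  store: Checkpoints,
  key: string,
  fn: () => Promise<T>,
): Promise<T> {
  if (store.has(key)) return store.get(key) as T;
  const result = await fn();
  store.set(key, result); // checkpoint only after the step succeeds
  return result;
}

// Replays the workflow a single time if the first attempt throws.
async function runWithOneRetry<T>(workflow: () => Promise<T>): Promise<T> {
  try {
    return await workflow();
  } catch {
    return await workflow();
  }
}

async function demo(): Promise<{ calls: number; value: string }> {
  const store: Checkpoints = new Map();
  let calls = 0; // counts how often the "LLM" step really executes
  let failOnce = true;

  const workflow = async (): Promise<string> => {
    const draft = await step(store, "llm-draft", async () => {
      calls += 1;
      return "draft-v1"; // stands in for a non-deterministic LLM output
    });
    await step(store, "publish", async () => {
      if (failOnce) {
        failOnce = false;
        throw new Error("transient failure");
      }
      return "ok";
    });
    return draft;
  };

  const value = await runWithOneRetry(workflow);
  return { calls, value };
}
```

In this sketch the second attempt reuses the stored draft, so the expensive step executes once even though the workflow body runs twice.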

Execution Model and State Persistence

Trigger.dev runs tasks in isolated Node.js processes. To define a task, you pass a configuration object to task(), which returns a task you can export and trigger. The platform intercepts execution at step boundaries, persisting state after each step.

Here’s a minimal example showing the basic structure:

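// Assumes the @trigger.dev/sdk package; task() is exported from its v3 entry point.
import { task } from "@trigger.dev/sdk/v3";

// someAsyncOperation is a placeholder for your own async work.
export const simpleTask = task({
  id: "simple-task",
  run: async ({ input }: { input: string }) => {
    const result = await someAsyncOperation(input);
    return { output: result };
  },
});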

Now a more complex workflow with multiple steps:

export const processOrder = task({
  id: "process-order",
  run: async ({ orderId }: { orderId: string }) => {
    // Step 1: Fetch order (read-only, so safe to repeat on replay)
    const order = await db.orders.findUnique({ where: { id: orderId } });
    if (!order) throw new Error(`Order ${orderId} not found`);

    // Step 2: Charge payment. The idempotency key means a retried call
    // with the same key does not create a second charge.
    const charge = await stripe.charges.create(
      {
        amount: order.total,
        currency: "usd",
        source: order.paymentToken,
      },
      { idempotencyKey: `charge-${orderId}` },
    );

    // Step 3: Send confirmation (reached only after the charge succeeds)
    await sendEmail({
      to: order.email,
      subject: "Order confirmed",
      body: `Charged ${charge.amount}`,
    });

    return { chargeId: charge.id };
  },
});

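The runtime tracks which steps completed, but only at boundaries it can observe as checkpoints. If the Stripe call fails due to a network timeout, the retry resumes from the last persisted checkpoint; any work that ran after that checkpoint, including plain reads such as the database fetch, may run again. Preventing duplicate charges therefore also depends on making the charge itself idempotent, for example with an idempotency key.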

State persistence happens at the platform level, not in your code. You don’t serialize intermediate results to Redis or write checkpoint logic. The execution trace lives in Trigger.dev’s managed Postgres instance, indexed by task run ID.

Retry Semantics and Failure Modes

Trigger.dev exposes three retry strategies:

Strategy	Behavior	Use Case
`exponentialBackoff`	Doubles delay between retries (1s, 2s, 4s, 8s)	Transient API failures, rate limits
`linearBackoff`	Fixed delay between retries (5s, 5s, 5s)	Predictable retry cadence for flaky services
`custom`	User-defined delay function	Complex backoff with jitter or circuit breaking

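You configure retries per task. The configuration below expresses the backoff through factor (2 doubles the delay on each attempt, 1 keeps it constant), bounded by minTimeoutInMs and maxTimeoutInMs: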

export const fetchWithRetry = task({
  id: "fetch-with-retry",
  retry: {
    maxAttempts: 5,
    factor: 2,
    minTimeoutInMs: 1000,
    maxTimeoutInMs: 60000,
  },
  run: async ({ url }: { url: string }) => {
    const response = await fetch(url);
    if (!response.ok) throw new Error(`HTTP ${response.status}`);
    return response.json();
  },
});

When a task throws an exception, Trigger.dev checks the retry policy. If attempts remain, it schedules the next run with the calculated backoff. The execution trace shows each attempt, the error message, and the delay before retry.
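
To make the backoff arithmetic concrete, here is a minimal sketch of how the delays could be derived from the retry options above. It assumes the delay is `minTimeoutInMs * factor^n` capped at `maxTimeoutInMs`; `backoffDelays` is a hypothetical helper, not an SDK function, and the real scheduler may add jitter.

```typescript
interface RetryOptions {
  maxAttempts: number;
  factor: number;
  minTimeoutInMs: number;
  maxTimeoutInMs: number;
}

// Returns the wait before each retry. The first attempt has no delay,
// so maxAttempts attempts produce maxAttempts - 1 delays.
function backoffDelays(opts: RetryOptions): number[] {
  const delays: number[] = [];
  for (let retry = 0; retry < opts.maxAttempts - 1; retry++) {
    const raw = opts.minTimeoutInMs * Math.pow(opts.factor, retry);
    delays.push(Math.min(raw, opts.maxTimeoutInMs));
  }
  return delays;
}

// Same numbers as the fetch-with-retry task above.
const exponential = backoffDelays({
  maxAttempts: 5,
  factor: 2,
  minTimeoutInMs: 1000,
  maxTimeoutInMs: 60000,
}); // [1000, 2000, 4000, 8000]

// A factor of 1 gives the fixed-delay cadence from the table.
const fixed = backoffDelays({
  maxAttempts: 4,
  factor: 1,
  minTimeoutInMs: 5000,
  maxTimeoutInMs: 60000,
}); // [5000, 5000, 5000]
```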

Failure modes to watch:

Non-idempotent side effects: If your task sends an email in step 1 and crashes in step 2, the replay sends another email. Wrap non-idempotent calls in idempotency checks (use Stripe’s idempotency_key pattern).
State drift: If external state changes between retries (e.g., a user cancels an order), your task logic must handle stale data. Check timestamps or version fields.
Infinite retries: Without a maxAttempts cap, a task can retry forever. Set limits and monitor dead-letter queues.
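
The first two failure modes can be guarded against with a few lines of application code. The sketch below is illustrative only: `SideEffectLog`, `isStale`, and the `Order` shape are hypothetical, and a production version would keep the seen keys in durable storage rather than in memory.

```typescript
interface Order {
  id: string;
  version: number;
  status: "open" | "cancelled";
}

// Remembers which side effects already ran, keyed by an idempotency key.
class SideEffectLog {
  private seen = new Set<string>();

  // Runs effect only the first time a key is seen; returns whether it ran.
  runOnce(key: string, effect: () => void): boolean {
    if (this.seen.has(key)) return false;
    effect();
    this.seen.add(key); // recorded only after the effect succeeds
    return true;
  }
}

// Detects state drift between the snapshot a run started with and current state.
function isStale(snapshot: Order, current: Order): boolean {
  return current.version !== snapshot.version || current.status === "cancelled";
}

const log = new SideEffectLog();
const outbox: string[] = [];
const sendConfirmation = () => {
  outbox.push("Order confirmed");
};

log.runOnce("email:order-1", sendConfirmation); // sends
log.runOnce("email:order-1", sendConfirmation); // skipped on replay
```

Because the key is recorded only after the effect returns, an effect that throws is attempted again on the next replay instead of being silently dropped.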

Concurrency and Execution Isolation

Trigger.dev isolates tasks at the process level. Each task run spawns a new Node.js worker. This prevents memory leaks from long-running tasks and ensures one task’s crash doesn’t affect others.

Concurrency limits control how many instances of a task run simultaneously:

export const bulkProcess = task({
  id: "bulk-process",
  queue: {
    concurrencyLimit: 10, // Max 10 concurrent runs
  },
  run: async ({ items }: { items: string[] }) => {
    // Wrap the call so map's index and array arguments are not passed to processItem
    return Promise.all(items.map((item) => processItem(item)));
  },
});

The platform enforces limits at the queue level. If 15 instances trigger simultaneously, 10 run immediately and 5 wait in the queue. This prevents overwhelming downstream APIs or exhausting database connection pools.
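
The 15-triggered, 10-running, 5-waiting behaviour can be modelled with a small admission queue. This is a toy model of the idea, not the platform's scheduler; `ConcurrencyQueue` and its methods are hypothetical names.

```typescript
class ConcurrencyQueue {
  private limit: number;
  private running = 0;
  private waiting: string[] = [];

  constructor(limit: number) {
    this.limit = limit;
  }

  // Admits the run if a slot is free, otherwise parks it in FIFO order.
  trigger(runId: string): "running" | "queued" {
    if (this.running < this.limit) {
      this.running += 1;
      return "running";
    }
    this.waiting.push(runId);
    return "queued";
  }

  // Frees a slot and promotes the oldest waiting run, if any.
  complete(): string | undefined {
    if (this.running === 0) return undefined;
    this.running -= 1;
    const next = this.waiting.shift();
    if (next !== undefined) this.running += 1;
    return next;
  }

  stats(): { running: number; queued: number } {
    return { running: this.running, queued: this.waiting.length };
  }
}

const queue = new ConcurrencyQueue(10);
for (let i = 1; i <= 15; i++) queue.trigger(`run-${i}`);
const afterBurst = queue.stats(); // { running: 10, queued: 5 }
const promoted = queue.complete(); // "run-11" takes the freed slot
```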

For multi-tenant workloads, you can partition concurrency by key:

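export const tenantTask = task({
  id: "tenant-task",
  queue: {
    concurrencyLimit: 5,
  },
  run: async ({ tenantId }: { tenantId: string }) => {
    // Process tenant-specific work
  },
});

// Assumption (v3 SDK): the concurrency key is passed as a trigger option,
// not in the queue config. enqueueForTenant is an illustrative helper name.
export async function enqueueForTenant(tenantId: string) {
  return tenantTask.trigger({ tenantId }, { concurrencyKey: tenantId });
}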

Now each tenant gets up to 5 concurrent runs, but tenants don’t block each other.
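
Partitioning by key amounts to keeping one counter per tenant instead of one for the whole queue. A minimal sketch, with `KeyedLimiter` as a hypothetical name and no claim that the platform works this way internally:

```typescript
class KeyedLimiter {
  private limit: number;
  private running = new Map<string, number>();

  constructor(limit: number) {
    this.limit = limit;
  }

  // Starts a run for this key only if that key is below its own limit.
  tryStart(key: string): boolean {
    const current = this.running.get(key) ?? 0;
    if (current >= this.limit) return false;
    this.running.set(key, current + 1);
    return true;
  }

  // Releases one slot for the key when a run finishes.
  finish(key: string): void {
    const current = this.running.get(key) ?? 0;
    if (current > 0) this.running.set(key, current - 1);
  }
}

const limiter = new KeyedLimiter(5);
const acme = Array.from({ length: 6 }, () => limiter.tryStart("acme"));
// acme: five true values followed by one false
const globex = limiter.tryStart("globex"); // true: a saturated tenant does not block others
```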

Observability and Debugging

The Trigger.dev dashboard shows a real-time execution trace for every task run. You see:

Each step’s start time, duration, and return value
Retry attempts with error messages and backoff delays
Logs emitted during execution (via console.log or structured logging)
Parent-child relationships for tasks that trigger other tasks

For long-running workflows, this trace is critical. If an agent workflow stalls after 3 hours, you can inspect which API call timed out, what the response payload looked like, and whether the retry logic fired.

Trigger.dev also exposes OpenTelemetry spans. You can pipe traces to Datadog, Honeycomb, or Grafana for cross-service correlation. This matters when debugging workflows that call external APIs, databases, and LLMs in a single run.

Versioning and Deployment

When you update a task definition, Trigger.dev versions it automatically. Existing in-flight runs continue with the old code. New runs use the new version. This prevents mid-execution schema mismatches.

You deploy tasks by pushing code to your repository. Trigger.dev watches for changes and redeploys automatically (similar to Vercel or Netlify). For production workloads, you can pin specific versions or use blue-green deployments.

The platform doesn’t support hot-swapping code mid-execution. If a task runs for 24 hours and you deploy a bug fix, the running instance completes with the old code. Plan rollouts accordingly.
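
The pinning rule described above is simple to state in code: a run records the version that was current when it started and keeps it for its lifetime. `Deployments` below is a hypothetical model of that rule, not a Trigger.dev API.

```typescript
class Deployments {
  private current: string;
  private runVersions = new Map<string, string>();

  constructor(initialVersion: string) {
    this.current = initialVersion;
  }

  // A deploy only changes which version new runs pick up.
  deploy(version: string): void {
    this.current = version;
  }

  // Pins the run to whatever version is current at start time.
  startRun(runId: string): string {
    this.runVersions.set(runId, this.current);
    return this.current;
  }

  versionOf(runId: string): string | undefined {
    return this.runVersions.get(runId);
  }
}

const deployments = new Deployments("v1");
deployments.startRun("long-running"); // pinned to "v1"
deployments.deploy("v2"); // bug fix ships while the run is in flight
deployments.startRun("fresh"); // pinned to "v2"
```

Under this model the 24-hour run never sees the fix, so urgent changes mean cancelling and re-triggering rather than waiting for the deploy to take effect.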

Comparison to Temporal

Dimension	Trigger.dev	Temporal
Language support	TypeScript only	Go, Java, Python, TypeScript, .NET, PHP
State persistence	Managed Postgres	Pluggable (Cassandra, Postgres, MySQL)
Execution model	Process-per-task	Worker pools with sticky sessions
Isolation vs. latency	Higher isolation, 100-500ms startup overhead	Lower latency, requires careful resource management
Deployment	Managed cloud or self-hosted	Self-hosted (complex) or Temporal Cloud
Learning curve	Low (familiar async/await)	High (workflow vs. activity distinction)
Ecosystem maturity	Early (launched 2023)	Mature (launched 2019)

Temporal offers more control and language flexibility. Trigger.dev trades that for simplicity and faster onboarding. If you’re building in TypeScript and want managed infrastructure, Trigger.dev removes operational overhead. If you need polyglot workflows or run on-prem with strict data residency, Temporal fits better.

The process-per-task model in Trigger.dev prioritizes fault isolation over dispatch speed. Each task gets a clean runtime environment, which prevents cross-task memory leaks but adds 100-500ms of process startup overhead. Temporal’s worker pools reuse processes, cutting latency at the cost of requiring more careful resource management.
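
Whether that startup cost matters depends on how long the task itself runs. The arithmetic below uses the 100-500ms range quoted above as an input rather than a measured result; `startupShare` is a hypothetical helper.

```typescript
// Fraction of wall-clock time spent on process startup rather than useful work.
function startupShare(startupMs: number, workMs: number): number {
  return startupMs / (startupMs + workMs);
}

// Worst-case 500ms startup against a 200ms task and a one-hour task.
const shortTask = startupShare(500, 200); // about 0.71
const longTask = startupShare(500, 3_600_000); // about 0.00014
```

For sub-second work the startup dominates, while for the hour-scale agent workflows discussed here it is negligible.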

When to Use Trigger.dev

Good fit:

Multi-step agent workflows that call LLMs, APIs, and databases
Long-running tasks (hours to days) that must survive crashes
Teams already in the TypeScript ecosystem (Next.js, Remix, Node.js)
Prototyping durable workflows without managing Temporal clusters

Poor fit:

Sub-second latency requirements (process startup adds overhead)
Polyglot teams that need Go or Python workers
Workflows with strict data residency rules, unless you self-host (the managed cloud may not comply)
Simple job queues where Bull or BullMQ suffice

Technical Verdict

Use Trigger.dev if you’re building TypeScript agent workflows that span hours and need automatic state replay without custom retry logic. The process-per-task isolation and managed Postgres persistence remove operational complexity. You get durable execution guarantees with familiar async/await syntax and, on the managed cloud, no infrastructure to operate.

Avoid it if you require sub-second task dispatch (process startup adds 100-500ms overhead), need polyglot worker support beyond TypeScript, or have strict on-prem data residency requirements that managed cloud hosting cannot satisfy. For simple fire-and-forget jobs, Bull or BullMQ offer lower overhead. For complex polyglot orchestration with full control over state storage, Temporal remains the better choice despite its steeper learning curve.

The V2 pivot from Zapier-style automation to durable execution reflects real demand in the agent orchestration space. Trigger.dev delivers a pragmatic TypeScript-native implementation that prioritizes developer experience over configurability.