Trigger.dev V2: What a Temporal Alternative for TypeScript Reveals About Durable Execution Plumbing

Trigger.dev started as a webhook-driven Zapier alternative in February 2023 (745 HN points). By October 2023, the team shipped V2 as a Temporal competitor for TypeScript developers (172 points). The pivot exposes a gap in the durable execution market: most agent builders need retries, timeouts, and state persistence without running Go or Java infrastructure.

The shift from webhook automation to durable execution primitives reveals architectural choices that matter when you build multi-step agent workflows. Here is what the plumbing looks like.

Why Webhook Automation Is Not Durable Execution

V1 Trigger.dev connected APIs via webhooks. You defined triggers (GitHub push, Stripe payment) and actions (send Slack message, update database). The runtime handled HTTP retries and basic error handling.

V2 introduced task primitives that survive process crashes:

Sleep and delay: pause execution for hours or days without holding a connection
Automatic retries: exponential backoff with configurable limits
Timeouts: per-step and global execution boundaries
Fan-out/fan-in: parallel task execution with result aggregation
Human-in-the-loop pauses: wait for external approval before continuing

These primitives require state checkpointing. A webhook handler restarts from scratch on failure. A durable task resumes from the last checkpoint.

State Persistence Model

Trigger.dev uses a database-backed state machine instead of Temporal’s event sourcing. Each task execution writes state snapshots to Postgres at defined checkpoints:

Before and after each await boundary
Before and after tool calls in agent loops
At explicit checkpoint() calls

On failure, the runtime resumes from the last snapshot rather than replaying history. This avoids Temporal’s full event log replay but requires developers to understand where checkpoint boundaries fall.

Key difference: Temporal replays your code from the beginning using deterministic execution. Trigger.dev restores state from the database and continues forward. This makes debugging easier (you see actual state, not replayed history) but requires idempotent operations between checkpoints.
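
To make the restore-and-continue model concrete, here is a minimal in-memory sketch. It is not Trigger.dev's implementation: SnapshotStore, runWithSnapshots, and the demo steps are hypothetical names, and a Map stands in for the Postgres snapshot table.

```typescript
// Hypothetical sketch: a Map stands in for the Postgres snapshot table.
type Snapshot<S> = { nextStep: number; state: S };

class SnapshotStore<S> {
  private rows = new Map<string, Snapshot<S>>();

  load(runId: string): Snapshot<S> | undefined {
    return this.rows.get(runId);
  }

  save(runId: string, snapshot: Snapshot<S>): void {
    // Clone so later in-memory mutation cannot alter the stored snapshot.
    this.rows.set(runId, JSON.parse(JSON.stringify(snapshot)) as Snapshot<S>);
  }
}

type Step<S> = (state: S) => S;

// Runs steps in order and saves a snapshot after each one. On re-entry it
// restores the last snapshot and continues forward instead of starting over.
function runWithSnapshots<S>(
  store: SnapshotStore<S>,
  runId: string,
  initial: S,
  steps: Step<S>[],
): S {
  const restored = store.load(runId);
  const start = restored ? restored.nextStep : 0;
  let state = restored ? restored.state : initial;
  steps.slice(start).forEach((step, offset) => {
    state = step(state);
    store.save(runId, { nextStep: start + offset + 1, state });
  });
  return state;
}

// Demo: the second step crashes once, then succeeds on the resumed run.
const executed: string[] = [];
let crashOnce = true;
const demoSteps: Step<string[]>[] = [
  (s) => {
    executed.push("fetch");
    return [...s, "fetched"];
  },
  (s) => {
    if (crashOnce) {
      crashOnce = false;
      throw new Error("simulated crash");
    }
    executed.push("summarize");
    return [...s, "summarized"];
  },
];

const store = new SnapshotStore<string[]>();
let finalState: string[] = [];
try {
  finalState = runWithSnapshots(store, "run-1", [], demoSteps);
} catch {
  // A new worker picks the run up and resumes from the stored snapshot.
  finalState = runWithSnapshots(store, "run-1", [], demoSteps);
}
console.log(finalState, executed);
```

In this sketch the "fetch" step runs once even though the run was entered twice, because the second entry starts from the stored snapshot. A step that crashes partway through is executed again in full, which is why the work between checkpoints has to be idempotent.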

Retry Semantics and Failure Recovery

Trigger.dev exposes retry policies at the task level:

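// Import path as published for the v3 SDK; adjust it to your installed version
import { task } from "@trigger.dev/sdk/v3";

export const researchAgent = task({
  id: "research-agent",
  retry: {
    maxAttempts: 3,        // maximum number of attempts
    factor: 2,             // backoff multiplier between attempts
    minTimeoutInMs: 1000,  // shortest retry delay
    maxTimeoutInMs: 10000, // upper bound on any retry delay
  },
  run: async ({ topic }: { topic: string }) => {
    // Task logic; a thrown error triggers the retry policy above
  },
});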

Failures trigger retries with exponential backoff. The runtime tracks attempt count in the database. If all retries fail, the task moves to a dead-letter queue.
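
As a rough illustration of what such a policy implies, the sketch below computes the wait before each retry. The formula (minimum delay times factor raised to the retry index, capped at the maximum) is an assumption about how exponential backoff is typically computed, not a statement of Trigger.dev's exact algorithm, and it omits jitter. BackoffPolicy and backoffDelays are hypothetical names.

```typescript
// Assumed formula: delay = min(maxDelayMs, minDelayMs * factor^(retryIndex)).
// Real runtimes often add random jitter; this sketch leaves it out.
interface BackoffPolicy {
  maxAttempts: number;
  factor: number;
  minDelayMs: number;
  maxDelayMs: number;
}

// Returns the wait (in ms) before each retry. With maxAttempts total
// attempts there are maxAttempts - 1 waits.
function backoffDelays(policy: BackoffPolicy): number[] {
  const delays: number[] = [];
  for (let attempt = 1; attempt < policy.maxAttempts; attempt++) {
    const raw = policy.minDelayMs * Math.pow(policy.factor, attempt - 1);
    delays.push(Math.min(policy.maxDelayMs, raw));
  }
  return delays;
}

const threeAttempts = backoffDelays({
  maxAttempts: 3,
  factor: 2,
  minDelayMs: 1000,
  maxDelayMs: 10000,
});
console.log(threeAttempts); // [ 1000, 2000 ]

const sixAttempts = backoffDelays({
  maxAttempts: 6,
  factor: 2,
  minDelayMs: 1000,
  maxDelayMs: 10000,
});
console.log(sixAttempts); // [ 1000, 2000, 4000, 8000, 10000 ]
```

Under these assumptions the policy shown earlier waits about one second and then two seconds before giving up. The cap only matters from the fifth retry onward.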

Replay behavior: On retry, the runtime loads the last checkpoint and re-executes from that point. Side effects before the checkpoint do not repeat. Side effects after the checkpoint may repeat unless you add idempotency keys.
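
One way to guard a side effect that sits after the last checkpoint is to key it on the run and step. The sketch below is a minimal in-memory version: IdempotencyStore, sendEmail, and the key format are hypothetical, and a Map stands in for a durable table.

```typescript
// Hypothetical sketch: a Map stands in for a durable idempotency table.
class IdempotencyStore {
  private results = new Map<string, unknown>();

  // Runs `effect` only if `key` has not been recorded yet;
  // otherwise returns the result stored by the earlier call.
  runOnce<T>(key: string, effect: () => T): T {
    if (this.results.has(key)) {
      return this.results.get(key) as T;
    }
    const result = effect();
    this.results.set(key, result);
    return result;
  }
}

const idempotency = new IdempotencyStore();
let emailsSent = 0;
const sendEmail = (to: string): string => {
  emailsSent += 1;
  return `sent:${to}`;
};

// Key derived from run id plus step name, so a retry of the same step reuses it.
const stepKey = "run-42:notify-user";
const firstAttempt = idempotency.runOnce(stepKey, () => sendEmail("a@example.com"));
const retryAttempt = idempotency.runOnce(stepKey, () => sendEmail("a@example.com"));
console.log(firstAttempt === retryAttempt, emailsSent); // true 1
```

This only helps if the key record survives the crash. In practice that means writing it to durable storage, or passing the key to a downstream API that deduplicates on its side.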

Agent Loop Architecture

The following example runs an agent loop of up to 10 iterations with tool calls. Each iteration checkpoints before and after generateText():

for (let i = 0; i < 10; i++) {
  // Pass the accumulated conversation so earlier tool results reach the model
  const { text, toolCalls, steps } = await generateText({
    model: anthropic("claude-opus-4-20250514"),
    messages,
    tools: { search, browse, analyze },
    maxSteps: 5,
  });

  // No tool calls means the model has produced its final answer
  if (!toolCalls.length) {
    return { summary: text, stepsUsed: steps.length };
  }

  // Run each requested tool and append its result for the next iteration
  for (const call of toolCalls) {
    const result = await executeTool(call);
    messages.push({ role: "tool", content: result });
  }
}

If the task crashes during iteration 7, the runtime resumes at iteration 7 with the messages array restored from the checkpoint. The LLM call repeats, which may produce different tool calls due to non-determinism.

Observability: The dashboard shows each checkpoint, tool call, and retry. You can inspect the messages array at any point in the loop. This beats reading Temporal event logs for debugging agent behavior.

Deployment Models and Worker Architecture

Trigger.dev offers three deployment options:

Model	Worker Location	State Storage	Use Case
Cloud	Managed containers	Trigger.dev Postgres	Fast onboarding, no ops
Hybrid	Your infrastructure	Trigger.dev Postgres	Data locality, custom compute
Self-hosted	Your infrastructure	Your Postgres	Full control, air-gapped environments

Cloud deployment runs workers in ephemeral containers with automatic scaling. Hybrid keeps workers in your VPC but uses Trigger.dev for state and orchestration. Self-hosted gives you the full stack.

Worker lifecycle: Workers poll the Trigger.dev API for tasks. When a task arrives, the worker loads the checkpoint, executes the next step, writes a new checkpoint, and returns the result. This differs from Temporal’s long-lived worker pools.
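
The poll cycle described above can be sketched as a small simulation. None of these names are Trigger.dev APIs: pollOnce, the in-memory queue, and the Map of checkpoints are stand-ins chosen for illustration, and the "step" is just arithmetic.

```typescript
// Hypothetical sketch of one poll cycle; nothing here is a Trigger.dev API.
interface Checkpoint {
  step: number;
  data: number;
}

interface QueuedTask {
  runId: string;
  totalSteps: number;
}

const checkpoints = new Map<string, Checkpoint>(); // stands in for Postgres
const taskQueue: QueuedTask[] = [{ runId: "run-1", totalSteps: 3 }];

// One poll: take a task, load its checkpoint, run exactly one step,
// write the new checkpoint, and requeue the task if steps remain.
function pollOnce(): boolean {
  const task = taskQueue.shift();
  if (!task) return false; // queue is empty, nothing to do

  const current = checkpoints.get(task.runId) ?? { step: 0, data: 0 };
  const next: Checkpoint = { step: current.step + 1, data: current.data + 10 };
  checkpoints.set(task.runId, next);

  if (next.step < task.totalSteps) taskQueue.push(task);
  return true;
}

let polls = 0;
while (pollOnce()) polls++;
console.log(polls, checkpoints.get("run-1")); // 3 { step: 3, data: 30 }
```

Because the worker holds no state between polls, any worker can serve any poll, and a worker that dies between polls loses nothing that is not already in the checkpoint store.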

Concurrency and Queue Control

Trigger.dev exposes concurrency limits at the task and queue level:

Task-level: limit concurrent executions of a single task
Queue-level: limit concurrent executions across multiple tasks in a queue
Global: account-wide concurrency caps

Queues use priority ordering. High-priority tasks jump the queue. This matters for agent workflows where user-facing tasks need faster response than background research.
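
A minimal sketch of how priority ordering and a concurrency cap interact is shown below. LimitedPriorityQueue and the job names are hypothetical, and the convention that a higher number means higher priority is an assumption made for this example.

```typescript
// Assumption for this sketch: a higher `priority` number runs first.
interface Job {
  id: string;
  priority: number;
}

class LimitedPriorityQueue {
  private waiting: Job[] = [];
  private running = new Set<string>();
  private concurrencyLimit: number;

  constructor(concurrencyLimit: number) {
    this.concurrencyLimit = concurrencyLimit;
  }

  enqueue(job: Job): void {
    this.waiting.push(job);
    // Array.prototype.sort is stable, so equal priorities keep FIFO order.
    this.waiting.sort((a, b) => b.priority - a.priority);
  }

  // Starts as many waiting jobs as the concurrency limit allows.
  dispatch(): string[] {
    const started: string[] = [];
    while (this.running.size < this.concurrencyLimit && this.waiting.length > 0) {
      const job = this.waiting.shift() as Job;
      this.running.add(job.id);
      started.push(job.id);
    }
    return started;
  }

  // Frees a concurrency slot when a job finishes.
  complete(id: string): void {
    this.running.delete(id);
  }
}

const queue = new LimitedPriorityQueue(2);
queue.enqueue({ id: "background-research", priority: 1 });
queue.enqueue({ id: "nightly-report", priority: 1 });
queue.enqueue({ id: "user-chat-reply", priority: 10 });

const firstBatch = queue.dispatch();
console.log(firstBatch); // [ 'user-chat-reply', 'background-research' ]

queue.complete("user-chat-reply");
const secondBatch = queue.dispatch();
console.log(secondBatch); // [ 'nightly-report' ]
```

The user-facing job was enqueued last but starts first, and the third job waits until a slot frees up.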

Failure mode: If workers crash, tasks stay in the queue. New workers pick them up and resume from the last checkpoint. If the database goes down, all execution stops. There is no local state fallback.

Comparison to LangGraph and Inngest

Feature	Trigger.dev	LangGraph	Inngest
State model	Database snapshots	In-memory graph	Event log
Retry logic	Exponential backoff	Manual checkpoints	Automatic with steps
Observability	Dashboard + traces	Graph visualization	Event timeline
Deployment	Managed or self-hosted	Bring your own runtime	Managed cloud
TypeScript-native	Yes	Yes	Yes

LangGraph gives you graph-based state machines with explicit edges between nodes. You control checkpointing and branching. Trigger.dev abstracts the graph into task primitives with automatic checkpointing.

Inngest uses event-driven steps. Each step is a separate function invocation. Trigger.dev keeps the entire task in one function with checkpoints at await boundaries.

When to use which: Use LangGraph for complex agent graphs with conditional branching. Use Inngest for event-driven microservices. Use Trigger.dev for long-running tasks with retries and timeouts in a single function.

Security Boundaries and Secrets Management

Trigger.dev stores secrets in environment variables encrypted at rest. Workers decrypt secrets at runtime. There is no secret rotation API yet.

Isolation: Each task runs in a separate container (cloud) or process (self-hosted). Tasks cannot access each other’s memory or file systems. Database-level isolation prevents tasks from reading other tasks’ checkpoints.

Audit trail: The dashboard logs every task execution, checkpoint, and retry. You can trace which secrets were accessed during each step. This helps with compliance audits.

Observability and Debugging

The dashboard shows:

Real-time task execution status
Checkpoint history with state snapshots
Retry attempts and failure reasons
Tool call inputs and outputs
Execution timeline with step durations

You can replay a failed task from any checkpoint. This beats reading logs or event streams. The tracing view shows parent-child relationships for fan-out tasks.

Missing features: No distributed tracing integration (OpenTelemetry support is on the roadmap). No custom metrics API. You cannot export traces to Datadog or Honeycomb yet.

Likely Failure Modes

Database bottleneck: Every checkpoint writes to Postgres. High-frequency tasks (sub-second iterations) can saturate the database. The team recommends batching checkpoints or using longer sleep intervals.
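
Batching can be as simple as persisting every Nth iteration. The sketch below counts writes instead of touching a database; BatchedCheckpointer is a hypothetical helper, not part of the Trigger.dev SDK.

```typescript
// Hypothetical sketch: counts writes instead of touching a database.
class BatchedCheckpointer<S> {
  writes = 0;
  lastSaved: S | undefined = undefined;
  private pending = 0;
  private every: number;

  constructor(every: number) {
    this.every = every;
  }

  // Records progress, but only persists once `every` calls have accumulated.
  record(state: S): void {
    this.pending += 1;
    if (this.pending >= this.every) this.flush(state);
  }

  // Forces a write; call this at the end so the final state is not lost.
  flush(state: S): void {
    this.lastSaved = state;
    this.writes += 1;
    this.pending = 0;
  }
}

const checkpointer = new BatchedCheckpointer<number>(25);
for (let i = 1; i <= 110; i++) {
  checkpointer.record(i);
}
checkpointer.flush(110);
console.log(checkpointer.writes, checkpointer.lastSaved); // 5 110
```

Here 110 iterations produce 5 writes instead of 110. The trade-off is that a crash can lose up to 24 iterations of progress, so those iterations must be safe to repeat.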

Non-deterministic replays: If your task reads from an external API without caching, retries may produce different results. Add idempotency keys or cache responses between checkpoints.

Worker starvation: If all workers are busy with long-running tasks, new tasks queue up. Set concurrency limits and use priority queues to prevent starvation.

Checkpoint bloat: Large state objects (multi-megabyte messages arrays) slow down checkpoint writes and reads. Compress or paginate state between checkpoints.
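
One way to paginate state is to store message pages separately and keep only their keys in the checkpoint. The sketch below assumes a side store (a Map here, standing in for a table or blob store); paginate, loadPages, and the key format are hypothetical.

```typescript
interface Message {
  role: string;
  content: string;
}

// Stands in for a side table or blob store that holds the bulky pages.
const pageStore = new Map<string, Message[]>();

// Splits messages into fixed-size pages, stores them, and returns only the keys.
function paginate(runId: string, messages: Message[], pageSize: number): string[] {
  if (pageSize <= 0) throw new Error("pageSize must be positive");
  const keys: string[] = [];
  for (let start = 0; start < messages.length; start += pageSize) {
    const key = `${runId}:page-${keys.length}`;
    pageStore.set(key, messages.slice(start, start + pageSize));
    keys.push(key);
  }
  return keys;
}

// Rebuilds the full message list from the page keys held in the checkpoint.
function loadPages(keys: string[]): Message[] {
  return keys.reduce<Message[]>((all, key) => all.concat(pageStore.get(key) ?? []), []);
}

// Demo: 12 bulky tool results become 3 pages; the checkpoint holds 3 short keys.
const history: Message[] = [];
for (let i = 0; i < 12; i++) {
  history.push({ role: "tool", content: `result ${i}: ` + "x".repeat(1000) });
}

const pageKeys = paginate("run-9", history, 5);
const checkpointBytes = JSON.stringify({ pageKeys }).length;
const inlineBytes = JSON.stringify({ messages: history }).length;
const restoredHistory = loadPages(pageKeys);
console.log(pageKeys.length, restoredHistory.length, checkpointBytes < inlineBytes); // 3 12 true
```

A further refinement, not shown here, is to rewrite only the last page on each checkpoint, since earlier pages do not change once they are full.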

Technical Verdict

Use Trigger.dev when:

You need durable execution in TypeScript without running Temporal infrastructure
Your agent workflows have 10-1000 steps with retries and timeouts
You want a managed platform with automatic scaling and observability
You can tolerate database-backed state instead of in-memory execution

Avoid Trigger.dev when:

You need sub-second task latency (checkpoints add 50-200ms overhead)
Your workflows require complex branching logic better suited to graph-based orchestration
You already run Temporal and need cross-language workflow support
You need an air-gapped deployment but cannot self-host (the cloud and hybrid models depend on Trigger.dev for state and orchestration)

The V2 pivot shows what happens when you listen to users building real agent workflows. The shift from webhooks to durable execution primitives exposes the plumbing that matters: state checkpointing, retry semantics, and deployment flexibility. For TypeScript teams building multi-step agent tasks, Trigger.dev fills a gap between simple job queues and heavyweight workflow engines.