Trigger.dev V2: What a Temporal Alternative for TypeScript Reveals About Durable Execution Plumbing

Trigger.dev launched in February 2023 as a “developer-first Zapier alternative” (745 HN points). Eight months later, the team shipped V2 as a Temporal alternative for TypeScript (172 points). The pivot exposes a fundamental split in durable execution design: event-driven orchestration versus long-running task management.

The shift happened because users wanted something Zapier and Temporal both struggle to deliver: TypeScript-native workflows that survive crashes, retry intelligently, and don’t require learning a DSL or deploying a Go cluster.

Why the Pivot Matters

Temporal solves durable execution with event sourcing. Every workflow step emits an event. Replay reconstructs state. It works, but the server is written in Go and is operationally heavy to run, and the replay model demands a shift in thinking from developers used to writing plain async functions.

Trigger.dev V2 bets that TypeScript developers want:

Functions that look like normal async code
Automatic retries without explicit state machines
Observability that maps to their existing mental model
Deployment that doesn’t require Kubernetes expertise

The trade-off: you give up Temporal’s strict determinism guarantees and multi-language worker pools. You gain a faster dev loop and simpler operational surface.

State Persistence Model

Trigger.dev uses checkpoint-based persistence instead of event sourcing. When a task hits an await boundary, the runtime serializes the execution context and writes it to Postgres. If the worker crashes, the platform restarts the task from the last checkpoint.

Key differences from Temporal:

Aspect	Temporal	Trigger.dev V2
State model	Event sourcing, full replay	Checkpoint snapshots
Determinism	Strict (workflow code must replay deterministically)	Relaxed (allows non-deterministic code)
Worker language	Polyglot (Go, Java, Python, TypeScript)	TypeScript only
Replay cost	Grows with workflow length	Fixed per checkpoint
Version migration	Explicit workflow versioning	Code-level compatibility checks

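The checkpoint model means you can use Math.random() or Date.now() in your tasks. Temporal requires workflow code to be deterministic, because replay would otherwise produce different results; its TypeScript SDK handles this by swapping those calls for deterministic versions inside the workflow sandbox. Trigger.dev snapshots the actual values, so non-deterministic code works fine.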
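A minimal sketch of why snapshotting makes non-deterministic code safe, assuming a checkpoint is simply a map of recorded step results. The CheckpointStore class and its step method are hypothetical names for illustration, not Trigger.dev's API, and an in-memory Map stands in for Postgres.

```typescript
// Hypothetical illustration, not Trigger.dev's API. A Map stands in for Postgres.
class CheckpointStore {
  private results = new Map<string, unknown>();

  // Run fn once per key; later calls (for example after a restart) reuse the stored value.
  step<T>(key: string, fn: () => T): T {
    if (this.results.has(key)) {
      return this.results.get(key) as T;
    }
    const value = fn();
    this.results.set(key, value);
    return value;
  }
}

// A task body that uses non-deterministic calls, each wrapped in a recorded step.
function runTask(store: CheckpointStore): { id: number; startedAt: number } {
  const id = store.step("pick-id", () => Math.random());
  const startedAt = store.step("start-time", () => Date.now());
  return { id, startedAt };
}

const store = new CheckpointStore();
const firstRun = runTask(store);
// Simulate a crash and resume: the same store is reused, so the recorded values come back.
const resumedRun = runTask(store);
```

Because the resumed run reads the recorded values instead of calling Math.random() again, both runs see the same id and start time.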

The cost: a task that checkpoints every 30 seconds writes 120 snapshots to Postgres per hour of runtime, so a multi-hour task writes several hundred. Temporal records history events only when an activity is scheduled, starts, or completes, which can be cheaper for workflows with sparse state changes.
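The arithmetic behind that estimate, as a small sketch. The fixed 30-second interval is the assumption from the paragraph above, and snapshotCount is a hypothetical helper.

```typescript
// Number of snapshots written by a run that checkpoints on a fixed interval.
function snapshotCount(runtimeSeconds: number, intervalSeconds: number): number {
  if (intervalSeconds <= 0) {
    throw new RangeError("intervalSeconds must be positive");
  }
  return Math.floor(runtimeSeconds / intervalSeconds);
}

const perHour = snapshotCount(60 * 60, 30);        // one hour of runtime
const threeHours = snapshotCount(3 * 60 * 60, 30); // three hours of runtime
```

One hour at a 30-second interval gives 120 snapshots, and the count grows linearly with runtime.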

Retry and Timeout Primitives

Trigger.dev exposes retry logic as task-level configuration, not workflow DSL constructs:

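import { task } from "@trigger.dev/sdk/v3"; // entry point assumed; adjust for your SDK version
// Hypothetical application helpers, assumed to be defined in your own codebase.
import { extractFrames, analyzeFrames } from "./video";

export const processVideo = task({
  id: "process-video",
  retry: {
    maxAttempts: 5,        // give up after the fifth attempt
    factor: 2,             // double the delay after each failure
    minTimeoutInMs: 1000,  // shortest delay between attempts
    maxTimeoutInMs: 60000, // cap on the delay between attempts
    randomize: true        // add jitter so retries do not synchronize
  },
  run: async ({ videoUrl }: { videoUrl: string }) => {
    // A throw from either step fails the attempt and triggers a retry.
    const frames = await extractFrames(videoUrl);
    const analyzed = await analyzeFrames(frames);
    return analyzed;
  }
});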

If extractFrames throws, the entire task retries with exponential backoff. No explicit error handling required. The runtime tracks attempt count, last error, and backoff state in the checkpoint.
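To make the backoff schedule concrete, here is a sketch of how such a delay could be computed. The formula (minimum delay times factor to the power of the attempt index, capped at the maximum) is a common convention and an assumption here, not Trigger.dev's documented implementation; BackoffOptions, minDelayMs, maxDelayMs, and backoffDelay are hypothetical names.

```typescript
interface BackoffOptions {
  factor: number;     // multiplier applied after each failed attempt
  minDelayMs: number; // delay before the first retry
  maxDelayMs: number; // upper bound on any delay
  randomize: boolean; // whether to add jitter
}

// Delay before retry number `attempt` (1-based). `random` is injectable for testing.
function backoffDelay(
  attempt: number,
  opts: BackoffOptions,
  random: () => number = Math.random
): number {
  const base = Math.min(
    opts.maxDelayMs,
    opts.minDelayMs * Math.pow(opts.factor, attempt - 1)
  );
  if (!opts.randomize) {
    return base;
  }
  // Assumed jitter scheme: scale the delay by a factor in [1, 2), then re-apply the cap.
  return Math.min(opts.maxDelayMs, Math.round(base * (1 + random())));
}

const opts: BackoffOptions = { factor: 2, minDelayMs: 1000, maxDelayMs: 60000, randomize: false };
const delays = [1, 2, 3, 4].map((attempt) => backoffDelay(attempt, opts));
const capped = backoffDelay(7, opts); // 1000 * 2^6 exceeds the cap
```

Without jitter the first four delays are 1000, 2000, 4000, and 8000 ms, and by the seventh retry the delay is held at the 60000 ms cap.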

Timeout behavior:

Task-level timeout kills the entire execution
No sub-activity timeouts (you implement those yourself)
Heartbeat mechanism for long-running operations (send periodic signals to prove liveness)

Temporal separates workflow-level timeouts from several activity-level ones, including start-to-close, schedule-to-start, and heartbeat timeouts. Trigger.dev collapses these into a single task timeout plus manual heartbeat calls. That is simpler, but it gives you less granular control.
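The heartbeat idea can be sketched as a small liveness tracker. HeartbeatMonitor is a hypothetical class for illustration, not part of the Trigger.dev SDK, and the clock is passed in explicitly so the logic stays deterministic.

```typescript
// Hypothetical liveness tracker: a task is alive if it has sent a beat recently enough.
class HeartbeatMonitor {
  private readonly timeoutMs: number;
  private lastBeatMs: number;

  constructor(timeoutMs: number, startMs: number) {
    this.timeoutMs = timeoutMs;
    this.lastBeatMs = startMs;
  }

  // Called by the long-running operation to prove it is still making progress.
  beat(nowMs: number): void {
    this.lastBeatMs = nowMs;
  }

  // Called by the supervisor to decide whether to keep waiting or kill the run.
  isAlive(nowMs: number): boolean {
    return nowMs - this.lastBeatMs <= this.timeoutMs;
  }
}

const monitor = new HeartbeatMonitor(10_000, 0);
monitor.beat(8_000);
const aliveAt15s = monitor.isAlive(15_000); // 7 seconds since the last beat
const aliveAt20s = monitor.isAlive(20_000); // 12 seconds since the last beat
```

With a 10-second window, the run counts as alive at the 15-second mark and as stalled at the 20-second mark unless another beat arrives.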

Isolation and Deployment Shape

Trigger.dev runs tasks in isolated Docker containers, one per task execution. The platform manages:

Container lifecycle (spin up, health check, teardown)
Checkpoint writes during execution
Log streaming to the dashboard
Automatic retries on container crashes

Deployment flow:

You write tasks in your TypeScript repo
Run trigger.dev deploy (builds Docker image, pushes to registry)
Platform pulls image when task triggers
Container starts, loads checkpoint if resuming, runs task code
Container exits, checkpoint persists

This is closer to AWS Lambda’s execution model than Temporal’s long-lived worker pools. Trade-offs:

Cold start latency (1-3 seconds for container spin-up)
No worker affinity (each retry might run on different hardware)
Easier horizontal scaling (no worker pool tuning)
Higher per-execution cost (container overhead)

Temporal workers are long-lived processes that poll for tasks. You manage scaling, but you avoid cold starts. Trigger.dev optimizes for developer ergonomics over raw throughput.

Versioning and In-Flight Migrations

When you deploy new task code while old executions are running, Trigger.dev uses semantic versioning to route traffic:

Tasks in-flight continue on the version they started with
New triggers use the latest deployed version
You can force-upgrade in-flight tasks (risky, but available)
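The routing rules above can be sketched as follows. VersionRouter and its methods are hypothetical names, and the version strings are illustrative; this shows the pinning behavior described, not the platform's actual scheduler.

```typescript
interface Run {
  id: string;
  version: string;
}

// Hypothetical router: runs are pinned to the version that was latest when they started.
class VersionRouter {
  private latest: string;
  private runs = new Map<string, Run>();

  constructor(initialVersion: string) {
    this.latest = initialVersion;
  }

  // A new deploy only changes what future triggers receive.
  deploy(version: string): void {
    this.latest = version;
  }

  trigger(id: string): Run {
    const run: Run = { id, version: this.latest };
    this.runs.set(id, run);
    return run;
  }

  versionFor(id: string): string | undefined {
    return this.runs.get(id)?.version;
  }

  // Risky escape hatch: move an in-flight run onto the latest version.
  forceUpgrade(id: string): void {
    const run = this.runs.get(id);
    if (run) {
      run.version = this.latest;
    }
  }
}

const router = new VersionRouter("1.0.0");
router.trigger("run-a");
router.deploy("1.1.0");
router.trigger("run-b");
const pinned = router.versionFor("run-a");
const fresh = router.versionFor("run-b");
```

Here run-a stays on 1.0.0 after the deploy while run-b starts on 1.1.0, and only an explicit forceUpgrade moves run-a.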

The platform doesn’t enforce strict compatibility checks. If your new code changes the checkpoint schema, old tasks will fail on resume. You handle this with:

Defensive deserialization (check for missing fields)
Manual migration scripts (read old checkpoints, write new format)
Version pinning (keep old code deployed until tasks drain)
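A sketch of the first option, defensive deserialization, assuming checkpoints are stored as JSON. The CheckpointV2 shape, its field names, and parseCheckpoint are hypothetical; the real checkpoint format is not described in this article.

```typescript
// Hypothetical current schema. Older snapshots are assumed to lack framesDone.
interface CheckpointV2 {
  schemaVersion: 2;
  videoUrl: string;
  framesDone: number;
}

function parseCheckpoint(raw: string): CheckpointV2 {
  const data = JSON.parse(raw) as Record<string, unknown>;

  // Required field: fail loudly rather than resume with a broken state.
  const videoUrl = data.videoUrl;
  if (typeof videoUrl !== "string") {
    throw new Error("checkpoint is missing videoUrl");
  }

  // Optional field added in a later schema: fall back to a safe default.
  const framesDone = typeof data.framesDone === "number" ? data.framesDone : 0;

  return { schemaVersion: 2, videoUrl, framesDone };
}

const oldSnapshot = parseCheckpoint('{"videoUrl":"https://example.com/a.mp4"}');
const newSnapshot = parseCheckpoint(
  '{"schemaVersion":2,"videoUrl":"https://example.com/a.mp4","framesDone":12}'
);
```

The design choice is to default fields that can be safely recomputed and to throw on fields the task cannot run without, so a bad resume fails immediately instead of corrupting output.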

Temporal’s versioning is stricter. You branch logic on the code version that started the workflow, using workflow.GetVersion() in the Go SDK or patched() in the TypeScript SDK. More boilerplate, but safer for long-running workflows that span months.

Observability Hooks

The dashboard shows:

Task execution timeline (start, checkpoints, retries, completion)
Live logs streamed from container stdout
Trace spans for external API calls (if you instrument with OpenTelemetry)
Queue depth and concurrency limits

You can trigger tasks from the dashboard for testing. No need to replay production traffic or mock webhook payloads.

Missing compared to Temporal:

No built-in distributed tracing across workflows (you wire OpenTelemetry yourself)
No workflow history replay UI (checkpoints are opaque blobs)
Limited query capabilities (can’t inspect in-flight state without custom logging)

The observability model assumes you’re debugging individual task failures, not orchestrating complex multi-step workflows with branching logic.

When TypeScript-Native Matters

Trigger.dev’s value proposition is strongest when:

Your team writes TypeScript and doesn’t want to learn Temporal’s Go-flavored mental model
Tasks are measured in minutes or hours, not days or weeks
You need retries and durability but not strict determinism
You’re building AI agent loops, media processing pipelines, or scheduled jobs

It’s weaker when:

You need polyglot workers (Python ML models, Go data pipelines)
Workflows run for months and require complex versioning
You need sub-second latency (container cold starts hurt)
You’re already running Temporal and the operational cost is amortized

Architecture Comparison

Trigger.dev stack:

Postgres for checkpoint storage
Redis for queue management
Docker for task isolation
Node.js runtime (or Bun for 5x throughput, per their blog)
HTTP API for task triggers

Temporal stack:

Cassandra, MySQL, or PostgreSQL for event history
gRPC for worker communication
Custom workflow engine in Go
Polyglot SDKs (TypeScript, Python, Java, Go)
Separate frontend service for UI

Trigger.dev is simpler to self-host (fewer moving parts) but less flexible for heterogeneous workloads.

Failure Modes

Checkpoint corruption:

If Postgres writes fail mid-checkpoint, the task restarts from the previous checkpoint. You lose progress since the last successful write. Temporal’s event log is append-only, so partial writes are less catastrophic.

Container crashes:

If the Docker daemon dies, in-flight tasks restart from the last checkpoint. Checkpointed state survives, but work done since that checkpoint is repeated and you pay the retry cost. Temporal workers can likewise crash without losing workflow state, because it lives in the event history.

Queue backpressure:

If tasks trigger faster than containers can process them, the queue grows. Trigger.dev exposes concurrency limits per task type. You tune these manually. Temporal’s worker pools auto-scale based on task backlog (if you configure autoscaling).
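A sketch of how a per-task concurrency limit turns excess triggers into queue depth. ConcurrencyLimiter is a hypothetical class for illustration; in the stack described here the real queue would live in Redis rather than in process memory.

```typescript
// Hypothetical limiter: at most `limit` runs execute at once, the rest wait in FIFO order.
class ConcurrencyLimiter {
  private readonly limit: number;
  private running = 0;
  private queue: string[] = [];

  constructor(limit: number) {
    this.limit = limit;
  }

  // Returns true if the run started immediately, false if it was queued.
  enqueue(runId: string): boolean {
    if (this.running < this.limit) {
      this.running++;
      return true;
    }
    this.queue.push(runId);
    return false;
  }

  // Mark one run as finished. Returns the queued run that takes over its slot, if any.
  finish(): string | undefined {
    const next = this.queue.shift();
    if (next === undefined) {
      this.running = Math.max(0, this.running - 1);
    }
    return next;
  }

  get depth(): number {
    return this.queue.length;
  }
}

const limiter = new ConcurrencyLimiter(2);
const startedA = limiter.enqueue("run-a");
const startedB = limiter.enqueue("run-b");
const startedC = limiter.enqueue("run-c"); // both slots are busy, so this one waits
const depthWhileBusy = limiter.depth;
const promoted = limiter.finish();         // a slot frees up and run-c takes it
```

Raising the limit drains the queue faster at the cost of more concurrent containers, which is the manual tuning knob the paragraph above refers to.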

Poison messages:

If a task fails all retries, it moves to a dead-letter queue. You inspect the failure, fix the code, and manually retry. Temporal has similar DLQ semantics, but you can also write compensating workflows to handle failures programmatically.
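The dead-letter flow can be sketched like this. Job, runWithDeadLetter, and the in-memory deadLetters array are hypothetical; backoff delays are left out to keep the example focused on where a job ends up.

```typescript
interface Job {
  id: string;
  attempts: number;
  lastError?: string;
}

// Try the handler up to maxAttempts times; park the job in deadLetters if every attempt fails.
function runWithDeadLetter(
  job: Job,
  maxAttempts: number,
  handler: (job: Job) => void,
  deadLetters: Job[]
): boolean {
  while (job.attempts < maxAttempts) {
    job.attempts++;
    try {
      handler(job);
      return true;
    } catch (err) {
      // Keep the most recent failure so it can be inspected later.
      job.lastError = err instanceof Error ? err.message : String(err);
    }
  }
  deadLetters.push(job);
  return false;
}

const deadLetters: Job[] = [];
const poison: Job = { id: "job-1", attempts: 0 };
const poisonOk = runWithDeadLetter(
  poison,
  3,
  () => { throw new Error("bad payload"); },
  deadLetters
);

const flaky: Job = { id: "job-2", attempts: 0 };
const flakyOk = runWithDeadLetter(
  flaky,
  3,
  (job) => { if (job.attempts < 2) throw new Error("transient"); },
  deadLetters
);
```

The poison job exhausts its three attempts and lands in deadLetters with its last error attached, while the flaky job succeeds on its second attempt and never reaches the queue.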

Code Snippet: AI Agent with Tool Calling

Trigger.dev’s model shines for AI agent loops. The platform handles retries, state persistence, and observability while you focus on tool orchestration:

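import { task } from "@trigger.dev/sdk/v3"; // entry point assumed; adjust for your SDK version
import { generateText, type CoreMessage } from "ai"; // assumes AI SDK v4 (CoreMessage, maxSteps)
import { anthropic } from "@ai-sdk/anthropic";
// Hypothetical app code: tool definitions without execute handlers, plus a dispatcher.
import { search, browse, analyze, executeTool } from "./tools";

export const researchAgent = task({
  id: "research-agent",
  retry: {
    maxAttempts: 3,
    factor: 2,
    minTimeoutInMs: 5000
  },
  run: async ({ topic }: { topic: string }) => {
    const messages: CoreMessage[] = [
      { role: "user", content: `Research: ${topic}` }
    ];

    // Cap the loop so a confused model cannot run forever.
    for (let i = 0; i < 10; i++) {
      const { text, toolCalls, steps, response } = await generateText({
        model: anthropic("claude-opus-4-20250514"),
        system: "You are a research assistant with web access.",
        messages,
        tools: { search, browse, analyze },
        maxSteps: 5
      });

      // Checkpoint happens here automatically
      if (!toolCalls.length) {
        return { summary: text, stepsUsed: steps.length };
      }

      // Record the assistant turn so each tool result has a matching tool call.
      messages.push(...response.messages);

      for (const call of toolCalls) {
        const result = await executeTool(call);
        // Tool messages carry structured tool-result parts, not a bare value.
        messages.push({
          role: "tool",
          content: [
            { type: "tool-result", toolCallId: call.toolCallId, toolName: call.toolName, result }
          ]
        });
      }
    }

    throw new Error("Max iterations reached without conclusion");
  }
});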

Each await boundary triggers a checkpoint. If the LLM API times out or the container crashes, the task resumes with the same message history. You don’t write explicit state management code.

Technical Verdict

Use Trigger.dev when:

You’re a TypeScript shop building AI agents, media pipelines, or scheduled jobs
You want durable execution without learning Temporal’s event sourcing model
Your tasks run for minutes to hours, not days to months
You value fast iteration over strict determinism guarantees

Avoid it when:

You need polyglot workers (Python ML models, Go services)
Workflows span weeks or months and require complex versioning
You need sub-second task latency (container cold starts add 1-3 seconds)
You’re already running Temporal and the operational cost is sunk

The platform represents a pragmatic middle ground: more durable than AWS Lambda, simpler than Temporal, optimized for the TypeScript ecosystem. The checkpoint model trades strict determinism for developer ergonomics. For AI agent loops and background jobs, that’s often the right trade-off.