Trigger.dev V2: What a Temporal Alternative for TypeScript Reveals About Durable Execution Plumbing

Trigger.dev launched in February 2023 as a “developer-first open source Zapier alternative” and pulled 745 points on Hacker News. Eight months later, the team shipped V2 with a completely different pitch: a Temporal alternative for TypeScript developers. That pivot, from webhook glue to durable execution primitives, teaches more about what developers building agents actually need than any feature list.

The shift happened because early users kept asking for the same thing: long-running background jobs that survive restarts, retries that don’t lose state, and resumability without manual checkpoint management. They didn’t want another event router. They wanted workflow orchestration that felt native to TypeScript.

What Changed Between V1 and V2

V1 focused on triggering actions from external events. You connected webhooks, scheduled cron jobs, and chained API calls. The architecture assumed short-lived executions and stateless handlers.

V2 rebuilt the core around durable execution. Tasks can run for hours or days. If a worker crashes, the task resumes from the last checkpoint. Retries preserve context. The platform manages state persistence, so you write business logic instead of recovery code.

The technical difference is state management. V1 treated each invocation as independent. V2 tracks execution history, stores intermediate results, and reconstructs runtime context after failures. That’s the gap between event-driven automation and workflow orchestration.

Durable Execution Without Go

Temporal is the reference implementation for durable execution, but its server is written in Go, and even with its TypeScript SDK you still deploy and operate your own worker processes. Trigger.dev targets TypeScript developers who want similar guarantees from a TypeScript-native platform without managing that infrastructure.

The core primitives:

Task definitions wrap async functions with retry policies, timeout configs, and concurrency limits
Automatic checkpointing serializes state at await boundaries
Resumability reconstructs execution context from persisted history
Idempotency keys prevent duplicate work when retries overlap

Here’s what a durable task looks like:

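import { task } from "@trigger.dev/sdk/v3";
// fetchOrder, chargeCard, and createShipment are application helpers assumed to be defined elsewhere.

export const processOrder = task({
  id: "process-order",
  retry: {
    maxAttempts: 5,
    factor: 2,
    minTimeoutInMs: 1000,
    maxTimeoutInMs: 60000,
  },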
  run: async (payload: { orderId: string }) => {
    // This checkpoint survives worker restarts
    const order = await fetchOrder(payload.orderId);
    
    // Each await creates a resumption point
    const payment = await chargeCard(order.paymentMethod);
    
    // If this fails, retry starts here, not from the beginning
    const shipment = await createShipment(order.items);
    
    return { orderId: order.id, trackingNumber: shipment.tracking };
  },
});

The platform intercepts each await, stores the result, and marks a checkpoint. If the worker dies after chargeCard completes but before createShipment runs, the next execution skips the payment step and resumes at shipment creation.
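The skip-on-resume behavior described above can be modeled as a journal keyed by step name. What follows is a minimal, synchronous sketch and not Trigger.dev's implementation: `Journal`, `runStep`, and this toy `processOrder` are hypothetical names, and an in-memory `Map` stands in for persisted storage.

```typescript
// Hypothetical journal: step name -> recorded result. A real runtime would persist this.
type Journal = Map<string, unknown>;

// Run a step once; on later attempts return the recorded result instead of re-executing.
function runStep<T>(journal: Journal, name: string, fn: () => T): T {
  if (journal.has(name)) {
    return journal.get(name) as T;
  }
  const result = fn();
  journal.set(name, result); // "checkpoint": record the result before moving on
  return result;
}

// Counters so the example can show which side effects actually ran.
const calls = { charge: 0, ship: 0 };
let shipmentServiceUp = false;

function processOrder(journal: Journal, orderId: string): { orderId: string; tracking: string } {
  const payment = runStep(journal, "chargeCard", () => {
    calls.charge += 1;
    return { paymentId: `pay_${orderId}` };
  });
  const shipment = runStep(journal, "createShipment", () => {
    calls.ship += 1;
    if (!shipmentServiceUp) throw new Error("shipment service unavailable");
    return { tracking: `trk_${payment.paymentId}` };
  });
  return { orderId, tracking: shipment.tracking };
}

// First attempt: payment succeeds, shipment throws (standing in for a crash mid-task).
const journal: Journal = new Map();
let firstAttemptFailed = false;
try {
  processOrder(journal, "42");
} catch {
  firstAttemptFailed = true;
}

// Second attempt with the same journal: payment is read from the journal, not re-run.
shipmentServiceUp = true;
const finalResult = processOrder(journal, "42");
```

After both attempts the charge callback has run once and the shipment callback twice, which is the property the paragraph above describes. The sketch only works if step names are stable across attempts, since the name is the lookup key.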

State Persistence and Replay

Trigger.dev uses a Postgres-backed event log. Each task execution generates a sequence of events: TaskStarted, CheckpointCreated, StepCompleted, TaskFailed, TaskSucceeded. The worker reads this log to rebuild state.

When a task resumes:

1. Load the event log for the execution ID
2. Replay completed steps without re-executing side effects
3. Resume from the first incomplete step
4. Continue appending new events

This is similar to Temporal’s event sourcing model, but implemented in TypeScript with a simpler storage layer. The tradeoff is less flexibility for complex branching workflows in exchange for faster onboarding for developers who don’t need Temporal’s full workflow model.
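The resume procedure above amounts to folding the event log into two things: the recorded results and the first step that has none. Here is a rough sketch under assumptions: the event names come from the list earlier in this section, but the payload fields, `ResumeState`, and `replay` are hypothetical and do not reflect Trigger.dev's actual schema.

```typescript
// Event names follow the ones listed in the text; the payload fields are assumptions.
type TaskEvent =
  | { type: "TaskStarted" }
  | { type: "CheckpointCreated"; step: string }
  | { type: "StepCompleted"; step: string; result: unknown }
  | { type: "TaskFailed"; error: string }
  | { type: "TaskSucceeded" };

interface ResumeState {
  results: Map<string, unknown>; // recorded outputs of completed steps
  nextStep: string | null; // first step without a recorded result, or null if all are done
}

// Fold the log into the state a worker needs in order to resume.
function replay(events: TaskEvent[], stepOrder: string[]): ResumeState {
  const results = new Map<string, unknown>();
  for (const event of events) {
    if (event.type === "StepCompleted") {
      results.set(event.step, event.result);
    }
  }
  const nextStep = stepOrder.find((step) => !results.has(step)) ?? null;
  return { results, nextStep };
}

const stepOrder = ["fetchOrder", "chargeCard", "createShipment"];
const eventLog: TaskEvent[] = [
  { type: "TaskStarted" },
  { type: "StepCompleted", step: "fetchOrder", result: { id: "42" } },
  { type: "CheckpointCreated", step: "fetchOrder" },
  { type: "StepCompleted", step: "chargeCard", result: { paymentId: "pay_42" } },
  { type: "TaskFailed", error: "worker crashed" },
];

// The worker would skip fetchOrder and chargeCard and resume at createShipment.
const resumeState = replay(eventLog, stepOrder);
```

Replay here never calls the step functions, so side effects are not repeated. It only reads what earlier attempts recorded.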

Retry Logic and Failure Boundaries

Retries in durable execution systems have different semantics than simple HTTP retries. The system needs to distinguish between transient failures (network timeout, rate limit) and permanent failures (invalid input, resource not found).

Trigger.dev’s retry configuration:

Parameter	Purpose	Default
maxAttempts	Total number of attempts, including the initial one	3
factor	Exponential backoff multiplier	2
minTimeoutInMs	Initial retry delay in milliseconds	1000
maxTimeoutInMs	Maximum retry delay in milliseconds	60000
randomize	Add jitter to prevent thundering herd	true

The platform tracks which steps succeeded, so retries only re-run failed operations. If chargeCard succeeds but createShipment fails, the retry skips payment processing entirely.
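To make the backoff parameters concrete, here is one way they could combine into a delay schedule. This is an illustrative sketch: `BackoffOptions`, its field names, and `retryDelayMs` are hypothetical, and the jitter scheme (scaling by a factor between 1 and 2) is an assumption, not the platform's documented formula.

```typescript
interface BackoffOptions {
  minDelayMs: number; // delay before the first retry
  maxDelayMs: number; // upper bound on any single delay
  factor: number; // exponential multiplier
  randomize: boolean; // whether to add jitter
}

// retryNumber is 1 for the first retry, 2 for the second, and so on.
// `random` is injectable so the jitter can be made deterministic in tests.
function retryDelayMs(
  retryNumber: number,
  opts: BackoffOptions,
  random: () => number = Math.random,
): number {
  const base = opts.minDelayMs * Math.pow(opts.factor, retryNumber - 1);
  // Assumed jitter scheme: scale the delay by a factor in [1, 2).
  const jitter = opts.randomize ? 1 + random() : 1;
  return Math.min(opts.maxDelayMs, Math.round(base * jitter));
}

const backoffOpts: BackoffOptions = {
  minDelayMs: 1000,
  maxDelayMs: 60000,
  factor: 2,
  randomize: false,
};

// Delays for retries 1 through 7: doubling from 1000 ms until the 60000 ms cap applies.
const schedule = [1, 2, 3, 4, 5, 6, 7].map((n) => retryDelayMs(n, backoffOpts));
```

With jitter disabled, the schedule is 1000, 2000, 4000, 8000, 16000, 32000, and then 60000 once the cap takes over. Enabling `randomize` spreads retries from many failed tasks across time so they do not all hit a recovering service at once.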

Developers can also define custom retry logic:

retry: {
  maxAttempts: 10,
  factor: 1.5,
  shouldRetry: (error, attempt) => {
    if (error.code === 'INVALID_INPUT') return false;
    if (error.code === 'RATE_LIMIT' && attempt < 5) return true;
    return attempt < 3;
  },
}

Execution Isolation and Multi-Tenancy

Running user-submitted code safely requires strong isolation. Trigger.dev uses containerized workers with resource limits and network policies.

Each task runs in a separate container with:

CPU and memory quotas
Restricted filesystem access
Network egress controls
Environment variable injection for secrets

Secrets management uses a key-value store encrypted at rest. Workers fetch secrets at runtime using scoped tokens that expire after task completion. This prevents one tenant’s code from accessing another tenant’s credentials.

The platform also enforces concurrency limits per task and per organization. If you configure maxConcurrency: 5, the scheduler queues additional invocations until a slot opens. This prevents runaway costs and resource exhaustion.
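The queueing behavior behind a concurrency limit can be sketched as a small state machine. `ConcurrencyQueue` and its methods are hypothetical names for illustration. A real scheduler would persist this state and coordinate across processes.

```typescript
// Minimal in-memory model of a concurrency-limited queue.
class ConcurrencyQueue {
  private readonly maxConcurrency: number;
  private running = new Set<string>();
  private waiting: string[] = [];

  constructor(maxConcurrency: number) {
    this.maxConcurrency = maxConcurrency;
  }

  // Returns "running" if a slot was free, otherwise queues the run.
  submit(runId: string): "running" | "queued" {
    if (this.running.size < this.maxConcurrency) {
      this.running.add(runId);
      return "running";
    }
    this.waiting.push(runId);
    return "queued";
  }

  // Frees the slot and promotes the oldest queued run, if any.
  // Returns the promoted run ID, or null if nothing was promoted.
  complete(runId: string): string | null {
    if (!this.running.delete(runId)) return null;
    const next = this.waiting.shift();
    if (next === undefined) return null;
    this.running.add(next);
    return next;
  }

  get runningCount(): number {
    return this.running.size;
  }

  get queuedCount(): number {
    return this.waiting.length;
  }
}

const runQueue = new ConcurrencyQueue(2);
// "a" and "b" take the two slots; "c" has to wait.
const statuses = ["a", "b", "c"].map((id) => runQueue.submit(id));
// Completing "a" frees a slot, so "c" is promoted.
const promoted = runQueue.complete("a");
```

The queue is first-in, first-out here. Per-organization limits would layer a second check of the same shape on top of the per-task one.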

Observability and Debugging

Durable execution creates new debugging challenges. A task might fail on the 47th retry after running for six hours. Traditional logs don’t capture enough context.

Trigger.dev’s observability stack:

Execution timeline shows each step, checkpoint, and retry with timestamps
State snapshots display variable values at each checkpoint
Trace propagation links tasks that trigger other tasks
Metrics dashboard tracks success rates, retry counts, and duration percentiles

The timeline view is critical. You can see exactly which step failed, what the input was, how many times it retried, and what changed between attempts. This beats grepping logs for correlation IDs.

TypeScript vs. Go Worker Models

Temporal’s Go-based workers offer more control over execution semantics. You can implement custom activity heartbeats, signal handlers, and workflow versioning strategies. The cost is operational complexity: you run worker pools, manage deployments, and handle version skew.

Trigger.dev’s TypeScript workers trade flexibility for simplicity. The platform manages worker lifecycle, scaling, and deployment. You write tasks as async functions and let the runtime handle orchestration.

The comparison:

Aspect	Temporal (Go)	Trigger.dev (TypeScript)
Worker deployment	Self-managed containers	Fully managed
State persistence	Cassandra or Postgres	Postgres event log
Retry semantics	Configurable per activity	Configurable per task
Language support	Go, Java, TypeScript, Python	TypeScript only
Workflow versioning	Manual version management	Automatic based on code hash
Execution guarantees	Exactly-once with determinism	At-least-once with idempotency

Temporal’s determinism requirement means workflow code must be deterministic, so that replaying the event history always takes the same path. Any non-deterministic operation (random number generation, external API calls) must happen in activities or go through SDK-provided deterministic equivalents. Trigger.dev relaxes this constraint by using idempotency keys and replay detection instead of enforcing determinism.

When Durable Execution Matters for Agents

AI agents need durable execution because they make multiple LLM calls, wait for external tools, and handle unpredictable latencies. A research agent might:

1. Call an LLM to generate search queries (30 seconds)
2. Execute web searches in parallel (variable latency)
3. Call the LLM again to synthesize results (45 seconds)
4. Repeat for multiple iterations (minutes to hours)

If the worker crashes after step 2, you don’t want to pay for the step 1 LLM call again or repeat the searches. Durable execution preserves completed work and resumes from the last checkpoint.

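The following example, adapted from the pattern in Trigger.dev’s docs, shows this:

import { task } from "@trigger.dev/sdk/v3";
import { generateText, type CoreMessage } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
// Assumed to be defined elsewhere: the search, browse, and analyze tool
// definitions, and an executeTool helper that runs a single tool call.

export const researchAgent = task({
  id: "research-agent",
  run: async ({ topic }: { topic: string }) => {
    const messages: CoreMessage[] = [
      { role: "user", content: `Research: ${topic}` },
    ];

    for (let i = 0; i < 10; i++) {
      const { text, toolCalls, steps } = await generateText({
        model: anthropic("claude-opus-4-20250514"),
        system: "You are a research assistant with web access.",
        messages,
        tools: { search, browse, analyze },
        maxSteps: 5,
      });

      // No tool calls means the model has produced its final answer
      if (!toolCalls.length) {
        return { summary: text, stepsUsed: steps.length };
      }

      // Record the assistant turn that requested the tools, then each tool result
      messages.push({
        role: "assistant",
        content: toolCalls.map((call) => ({
          type: "tool-call" as const,
          toolCallId: call.toolCallId,
          toolName: call.toolName,
          args: call.args,
        })),
      });
      for (const call of toolCalls) {
        const result = await executeTool(call);
        messages.push({
          role: "tool",
          content: [{ type: "tool-result", toolCallId: call.toolCallId, toolName: call.toolName, result }],
        });
      }
    }

    // Iteration budget exhausted without a final answer
    return { summary: null, stepsUsed: null };
  },
});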

Each await generateText creates a checkpoint. If the task fails during tool execution, the retry resumes with the existing message history instead of starting over.

Deployment Shape and Scaling

Trigger.dev’s managed platform handles worker scaling automatically. When task volume increases, the system spawns additional containers. When load drops, it scales down.

For self-hosting, the architecture includes:

API server (Node.js) handles task submission and status queries
Scheduler (Node.js) manages task queues and concurrency limits
Workers (containerized Node.js) execute tasks
Postgres stores event logs and task metadata
Redis (optional) for distributed locking and caching

The self-hosted version requires more operational overhead but gives you control over data residency and resource allocation.

Likely Failure Modes

Durable execution systems fail in specific ways:

Poison messages: A task that always fails can block the queue. Trigger.dev mitigates this with dead-letter queues and configurable max attempts.

State bloat: Long-running tasks with many checkpoints accumulate large event logs. The platform prunes old events after task completion, but you can hit storage limits during execution.

Version skew: Deploying new task code while old executions are in-flight can cause deserialization errors. Trigger.dev uses code hashes to route executions to compatible worker versions.

Checkpoint overhead: Excessive checkpointing (awaiting inside tight loops) degrades performance. The runtime batches checkpoint writes, but you still pay serialization costs.

Idempotency violations: If your task has side effects that aren’t idempotent (incrementing counters, sending emails), retries can cause duplicates. You need to implement your own deduplication logic.
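For the idempotency case, the usual deduplication approach is to derive a key from stable business identifiers and check it before performing the side effect. The sketch below is illustrative only: `idempotencyKey`, `sendOnce`, and the in-memory `Set` are hypothetical. A production version would use a durable store with a unique constraint on the key.

```typescript
// Hypothetical dedup store and outbox; both would be durable in a real system.
const sentKeys = new Set<string>();
const outbox: string[] = [];

// Derive the key from stable business identifiers, never from the attempt number or a timestamp.
function idempotencyKey(orderId: string, action: string): string {
  return `${action}:${orderId}`;
}

// Performs the side effect at most once per key; returns whether it actually ran.
function sendOnce(key: string, send: () => void): boolean {
  if (sentKeys.has(key)) return false;
  send();
  sentKeys.add(key); // recorded after success so a failed send can still be retried
  return true;
}

const emailKey = idempotencyKey("42", "order-confirmation-email");
// The first attempt sends the email; the retry finds the key and skips it.
const firstAttempt = sendOnce(emailKey, () => outbox.push("confirmation for order 42"));
const retryAttempt = sendOnce(emailKey, () => outbox.push("confirmation for order 42"));
```

Recording the key after the send still leaves a window where a crash between the two causes a duplicate on retry. Passing the same key to a downstream API that deduplicates on its side closes that gap.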

Technical Verdict

Use Trigger.dev when you need durable execution for TypeScript projects and don’t want to manage Temporal infrastructure. It’s a good fit for:

AI agent workflows with multiple LLM calls and tool invocations
Long-running data pipelines that process batches over hours
Background jobs that must survive deployments and restarts
Teams that want managed infrastructure with minimal ops overhead

Avoid it when:

You need sub-second latency (the checkpoint overhead adds milliseconds per step)
Your workflows require complex branching, parallel execution, or saga patterns (Temporal’s workflow model is more expressive)
You’re already running Temporal and have operational expertise
You need multi-language support (Trigger.dev is TypeScript only)

The V1-to-V2 pivot reveals what developers building agents actually need: not more webhook connectors, but primitives for managing state across long-running, failure-prone operations. Trigger.dev delivers that in a TypeScript-native package, trading Temporal’s flexibility for faster onboarding and lower operational complexity.