Trigger.dev V2: How a TypeScript Workflow Engine Competes with Temporal Without Go or gRPC

Source: trigger.dev
Trigger.dev started as a “developer-first Zapier alternative” in February 2023 (745 HN points). By October, the team pivoted hard: V2 became a “Temporal alternative for TypeScript devs” (172 points). The repositioning matters because it signals a shift from no-code automation to durable execution for agent workflows. The question is whether you can build reliable, long-running orchestration in TypeScript without adopting Temporal’s Go runtime, gRPC protocol, or polyglot infrastructure.

Why the Pivot Happened

The V1 Show HN discussion revealed a mismatch between what the team built and what developers wanted. Users asked for durable task execution, not webhook-triggered automations. They needed workflows that survive crashes, handle retries intelligently, and persist state across multi-day operations. The feedback loop pushed Trigger.dev toward the orchestration layer: competing with Temporal and Inngest rather than Zapier and n8n.

The V2 announcement thread shows the team responding to specific pain points: developers wanted to write orchestration logic in TypeScript without deploying a separate workflow engine cluster. They wanted observability without learning a new query language. They wanted retries and timeouts without manually wiring Redis queues.

The Temporal Problem for TypeScript Teams

Temporal gives you durable execution: workflows survive process crashes, network partitions, and multi-day pauses. But the architecture assumes you run a separate cluster (Go binaries), communicate over gRPC, and maintain workflow/activity boundaries in your application code. For teams already running Node.js services, this means:

  • A new runtime to deploy and monitor (Temporal server cluster)
  • gRPC client libraries and protobuf schemas
  • Separate worker processes that poll for tasks
  • Workflow code that can’t directly call external APIs (activities only)

Trigger.dev’s bet is that most TypeScript teams want durable execution semantics without the operational overhead of a second runtime. The trade-off: you give up some performance and scale ceiling in exchange for a single-language stack.

Trigger.dev’s Execution Model

Trigger.dev runs tasks as isolated functions with built-in retry, timeout, and state persistence. The platform handles durability by checkpointing execution state after each step. If a task crashes or times out, the engine replays from the last checkpoint instead of restarting from scratch.

Core Primitives

  • Tasks: TypeScript functions defined with task(), which takes an id, options such as retry and queue, and an async run handler
  • Steps: Atomic units of work inside a task (API calls, database writes, tool invocations)
  • Checkpoints: Automatic state snapshots after each step completes
  • Retries: Configurable per-task or per-step with exponential backoff
  • Queues: Named FIFO or priority queues with concurrency limits

The runtime doesn’t require a workflow definition language. You write normal TypeScript with async/await. The platform instruments your code to detect step boundaries and persist intermediate results.

// task is assumed to be imported from the Trigger.dev SDK; search, analyze,
// and generateReport are assumed to be tasks defined elsewhere in the project.
export const researchAgent = task({
  id: "research-agent",
  retry: { maxAttempts: 3, backoff: "exponential" },
  run: async ({ topic }: { topic: string }) => {
    // Step 1: Search (automatically checkpointed)
    const searchResults = await search.run({ query: topic });

    // Step 2: Analyze (resumes here if step 1 succeeded but step 2 failed)
    const analysis = await analyze.run({ data: searchResults });

    // Step 3: Generate report (resumes here if steps 1-2 succeeded)
    const report = await generateReport.run({ analysis });

    return { summary: report, sources: searchResults.length };
  }
});

If analyze.run() throws an error, the platform retries from that step without re-running search.run(). The checkpoint holds the value that search.run() returned, so searchResults is restored from storage rather than recomputed.
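To make the retry setting above concrete, here is a minimal sketch of how an exponential backoff schedule can be computed. The RetryPolicy shape, its field names, and nextRetryDelayMs are hypothetical and chosen for illustration; they are not the Trigger.dev SDK's option names.

```typescript
// Hypothetical retry policy shape; field names are illustrative, not the SDK's.
interface RetryPolicy {
  maxAttempts: number; // total attempts, including the first
  baseDelayMs: number; // delay before the second attempt
  factor: number;      // multiplier applied for each additional attempt
  maxDelayMs: number;  // upper bound on any single delay
}

// Delay to wait after `failedAttempt` (1-based) has failed,
// or null when the policy allows no further attempts.
function nextRetryDelayMs(failedAttempt: number, policy: RetryPolicy): number | null {
  if (failedAttempt >= policy.maxAttempts) return null;
  const raw = policy.baseDelayMs * Math.pow(policy.factor, failedAttempt - 1);
  return Math.min(raw, policy.maxDelayMs);
}

const retryPolicy: RetryPolicy = { maxAttempts: 3, baseDelayMs: 1000, factor: 2, maxDelayMs: 30000 };
const retryDelays = [1, 2, 3].map((attempt) => nextRetryDelayMs(attempt, retryPolicy));
console.log(retryDelays); // [ 1000, 2000, null ]
```

With maxAttempts set to 3, the third failure returns null, which is the point where a run would be treated as permanently failed.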

State Persistence and Replay

The Trigger.dev documentation describes a checkpoint-based persistence model. Each step completion writes execution state to durable storage. On replay, the engine loads the last successful checkpoint, skips already-completed steps (returns cached outputs), and resumes execution at the first failed or pending step.
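The replay behavior described above can be sketched with a small in-memory model. This is not Trigger.dev's implementation: CheckpointStore, runStep, and researchRun are hypothetical names, the store is a Map standing in for durable storage, and the steps are synchronous for brevity.

```typescript
// In-memory stand-in for durable checkpoint storage (hypothetical; a real
// engine would persist this to a database).
type CheckpointStore = Map<string, string>;

// Runs `fn` once per step key. On replay, a completed step returns its cached,
// JSON-serialized output instead of executing again.
function runStep<T>(store: CheckpointStore, key: string, fn: () => T): T {
  const cached = store.get(key);
  if (cached !== undefined) return JSON.parse(cached) as T;
  const output = fn();
  store.set(key, JSON.stringify(output)); // checkpoint only after success
  return output;
}

const executions: string[] = [];
let analyzeShouldFail = true;

function researchRun(store: CheckpointStore): { summary: string; sources: number } {
  const results = runStep(store, "search", () => {
    executions.push("search");
    return ["a", "b"];
  });
  const analysis = runStep(store, "analyze", () => {
    executions.push("analyze");
    if (analyzeShouldFail) throw new Error("transient failure");
    return results.join("+");
  });
  return { summary: analysis, sources: results.length };
}

const checkpointStore: CheckpointStore = new Map();
try {
  researchRun(checkpointStore); // first attempt: search succeeds, analyze throws
} catch {
  analyzeShouldFail = false;
}
const outcome = researchRun(checkpointStore); // replay: search comes from the checkpoint
console.log(executions); // [ 'search', 'analyze', 'analyze' ]
console.log(outcome);    // { summary: 'a+b', sources: 2 }
```

The search step executes once even though the run is attempted twice, which is the property that makes checkpointing cheaper than restarting from scratch.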

This differs from Temporal’s event sourcing model. Event sourcing replays workflows from the beginning using a deterministic event log: every decision, timer, and activity result is recorded as an immutable event. The workflow replays all events to reconstruct state. Trigger.dev’s approach is simpler but requires careful serialization of step outputs. If your step returns a non-serializable object (like a database connection), replay breaks.

Serialization Constraints

  • Step inputs and outputs must be JSON-serializable
  • No closures over external state (each step runs in a fresh context)
  • No direct database connections or file handles across steps

These constraints mirror Temporal’s workflow restrictions but apply at the step level instead of the workflow/activity boundary.
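One way to catch serialization problems early is to validate a step's output before it is checkpointed. The helper below is a sketch, not part of Trigger.dev; findNonSerializable is a hypothetical name, and it treats anything that would not survive a JSON round trip unchanged (including Date, Map, and class instances) as unsafe.

```typescript
// Returns the path of the first value that would not survive a JSON round trip,
// or null if the whole payload is safe to checkpoint.
function findNonSerializable(value: unknown, path = "$"): string | null {
  if (value === null) return null;
  switch (typeof value) {
    case "string":
    case "boolean":
      return null;
    case "number":
      return Number.isFinite(value) ? null : path; // NaN and Infinity become null
    case "object":
      break;
    default:
      return path; // undefined, function, symbol, bigint
  }
  if (Array.isArray(value)) {
    for (let i = 0; i < value.length; i++) {
      const bad = findNonSerializable(value[i], `${path}[${i}]`);
      if (bad !== null) return bad;
    }
    return null;
  }
  // Only plain objects round-trip; Date, Map, Set, and class instances do not.
  const proto = Object.getPrototypeOf(value);
  if (proto !== Object.prototype && proto !== null) return path;
  for (const [key, child] of Object.entries(value as Record<string, unknown>)) {
    const bad = findNonSerializable(child, `${path}.${key}`);
    if (bad !== null) return bad;
  }
  return null;
}

console.log(findNonSerializable({ url: "https://example.com", length: 42 })); // null
console.log(findNonSerializable({ fetchedAt: new Date(0) }));                 // $.fetchedAt
console.log(findNonSerializable({ rows: [1, new Map()] }));                   // $.rows[1]
```

Returning a path rather than a boolean makes the failure actionable: the step author can see which field needs to be converted, for example a Date to an ISO string.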

Long-Running Tasks and Timeouts

Trigger.dev supports tasks that run for hours or days. The platform uses a heartbeat mechanism to detect stalled workers. Each task sends periodic heartbeats. If no heartbeat arrives within the configured interval, the task is marked as failed and the retry policy determines whether to restart or mark as permanently failed.
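The stall detection described above can be sketched as follows. HeartbeatMonitor is a hypothetical class written for illustration, not the platform's code; timestamps are passed in explicitly so the logic does not depend on the wall clock.

```typescript
// Tracks the last heartbeat per run and reports runs that have gone silent.
class HeartbeatMonitor {
  private lastSeen = new Map<string, number>();
  private timeoutMs: number;

  constructor(timeoutMs: number) {
    this.timeoutMs = timeoutMs;
  }

  // Record a heartbeat from a running task.
  beat(runId: string, nowMs: number): void {
    this.lastSeen.set(runId, nowMs);
  }

  // Stop tracking a run that finished normally.
  complete(runId: string): void {
    this.lastSeen.delete(runId);
  }

  // Runs whose last heartbeat is older than the timeout.
  stalled(nowMs: number): string[] {
    const result: string[] = [];
    for (const [runId, seenAt] of this.lastSeen) {
      if (nowMs - seenAt > this.timeoutMs) result.push(runId);
    }
    return result;
  }
}

const heartbeatMonitor = new HeartbeatMonitor(30000);
heartbeatMonitor.beat("run_a", 0);
heartbeatMonitor.beat("run_b", 0);
heartbeatMonitor.beat("run_a", 25000); // run_a keeps reporting, run_b goes quiet
console.log(heartbeatMonitor.stalled(40000)); // [ 'run_b' ]
```

Each run ID returned by stalled() would then be handed to the retry policy, which decides between another attempt and permanent failure.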

For truly long-running tasks (multi-day AI training, batch processing), you can structure your code to checkpoint frequently:

export const batchProcessor = task({
  id: "batch-processor",
  run: async ({ items }: { items: string[] }) => {
    for (const item of items) {
      // Each step creates an implicit checkpoint
      await processItem.run({ item });
    }
  }
});

Each call to processItem.run() creates a checkpoint boundary. If the task crashes mid-loop, it resumes at the first item whose step had not completed instead of reprocessing the entire batch.

Concurrency and Queue Management

Trigger.dev uses named queues with configurable concurrency limits. This prevents resource exhaustion when running hundreds of parallel tasks:

export const scrapeWebsite = task({
  id: "scrape-website",
  queue: { name: "scraping", concurrency: 10 },
  run: async ({ url }: { url: string }) => {
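    // Only 10 scraping tasks run concurrently
    const response = await fetch(url);
    // Throw on HTTP errors so the failure is visible to the retry policy
    if (!response.ok) throw new Error(`Request failed: ${response.status} ${url}`);
    const html = await response.text();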
    return { url, length: html.length };
  }
});

Queue behavior:

  • FIFO by default (oldest task runs first)
  • Priority queues available (higher priority tasks jump the queue)
  • Per-queue concurrency limits (global across all workers)
  • Dead-letter queues for permanently failed tasks

This is similar to Temporal’s task queues but without the separate worker pool architecture. Trigger.dev workers pull from all queues and respect global concurrency limits.
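The queue rules listed above (priority first, FIFO within a priority, a hard concurrency cap) can be modeled in a few lines. TaskQueue and QueuedRun are hypothetical names for this sketch, and the in-memory arrays stand in for whatever storage the platform actually uses.

```typescript
interface QueuedRun {
  id: string;
  priority: number;   // higher runs first
  enqueuedAt: number; // FIFO tie-breaker within a priority
}

// Minimal model of one named queue with a global concurrency limit.
class TaskQueue {
  private pending: QueuedRun[] = [];
  private running = new Set<string>();
  private concurrency: number;

  constructor(concurrency: number) {
    this.concurrency = concurrency;
  }

  enqueue(run: QueuedRun): void {
    this.pending.push(run);
  }

  // Hands out the next run, or null when the queue is empty or at its limit.
  dequeue(): QueuedRun | null {
    if (this.running.size >= this.concurrency || this.pending.length === 0) return null;
    this.pending.sort((a, b) => b.priority - a.priority || a.enqueuedAt - b.enqueuedAt);
    const next = this.pending.shift()!;
    this.running.add(next.id);
    return next;
  }

  // Frees a concurrency slot once a run completes or fails.
  finish(id: string): void {
    this.running.delete(id);
  }
}

const scrapingQueue = new TaskQueue(2);
scrapingQueue.enqueue({ id: "a", priority: 0, enqueuedAt: 1 });
scrapingQueue.enqueue({ id: "b", priority: 0, enqueuedAt: 2 });
scrapingQueue.enqueue({ id: "c", priority: 5, enqueuedAt: 3 });

const dispatchOrder = [
  scrapingQueue.dequeue()?.id,
  scrapingQueue.dequeue()?.id,
  scrapingQueue.dequeue()?.id ?? null,
];
console.log(dispatchOrder); // [ 'c', 'a', null ]
scrapingQueue.finish("c");
const afterFinish = scrapingQueue.dequeue()?.id;
console.log(afterFinish); // b
```

The third dequeue returns null because two runs are already in flight; "b" only becomes eligible after a slot is released.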

Observability and Debugging

The platform provides a web UI for inspecting task runs:

  • Real-time logs streamed from running tasks
  • Step-by-step execution timeline with durations
  • Retry history and error stack traces
  • Input/output inspection for each step

For programmatic access, Trigger.dev exposes a REST API and webhooks:

  • Query task status by ID or filter by queue/status
  • Subscribe to task completion events
  • Trigger manual retries or cancellations

The observability model is simpler than Temporal’s because there’s no workflow/activity split. Every step appears in a single timeline instead of nested activity traces.

Deployment and Scaling

Trigger.dev is available as a managed cloud service or can be self-hosted with Docker or Kubernetes. The self-hosted architecture consists of:

  • API server: Handles task submissions and status queries
  • Worker pool: Executes tasks and sends heartbeats
  • Database: Stores task state and checkpoints
  • Optional cache layer: Speeds up queue polling

For self-hosting, you deploy workers that connect to the Trigger.dev API:

# docker-compose.yml
services:
  trigger-api:
    image: triggerdotdev/trigger:latest
    environment:
      DATABASE_URL: postgres://...
    ports:
      - "3000:3000"
  
  trigger-worker:
    image: triggerdotdev/trigger:latest
    command: worker
    environment:
      DATABASE_URL: postgres://...
    deploy:
      replicas: 5

Workers are stateless and scale horizontally. The platform uses locking mechanisms to prevent duplicate task execution when multiple workers poll the same queue.
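The article does not say which locking mechanism is used. A common approach, sketched here as an assumption, is a compare-and-set claim: a worker may move a run from "queued" to "running" only if it is still queued. RunRow and tryClaim are hypothetical names, and the Map stands in for a database table.

```typescript
type RunStatus = "queued" | "running" | "done";

interface RunRow {
  id: string;
  status: RunStatus;
  lockedBy: string | null;
}

// Compare-and-set claim: succeeds only if the row is still queued.
// In a SQL store this would typically be a single conditional UPDATE inside
// a transaction, so two workers cannot both observe "queued".
function tryClaim(table: Map<string, RunRow>, runId: string, workerId: string): boolean {
  const row = table.get(runId);
  if (!row || row.status !== "queued") return false;
  row.status = "running";
  row.lockedBy = workerId;
  return true;
}

const runTable = new Map<string, RunRow>([
  ["run_1", { id: "run_1", status: "queued", lockedBy: null }],
]);
console.log(tryClaim(runTable, "run_1", "worker-a")); // true
console.log(tryClaim(runTable, "run_1", "worker-b")); // false
console.log(runTable.get("run_1")?.lockedBy);         // worker-a
```

The in-memory version is trivially atomic because JavaScript runs it on one thread; the database equivalent relies on the conditional write being atomic.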

Scaling Considerations

Component        | Consideration              | Notes
-----------------|----------------------------|-----------------------------------
Task throughput  | Database write capacity    | Checkpoint writes add latency
Checkpoint size  | Serialization overhead     | Large payloads slow replay
Task duration    | Heartbeat reliability      | Long tasks need stable connections
Queue depth      | Database query performance | Deep queues require indexing

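The database dependency becomes a bottleneck at high scale. Temporal’s event sourcing model can handle millions of concurrent workflows because workflow history is sharded across a horizontally scalable persistence store such as Cassandra (Elasticsearch, where deployed, serves search and visibility rather than the event log). Trigger.dev trades that scale ceiling for simpler deployment.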
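A back-of-the-envelope estimate shows how checkpoint writes bound throughput and add latency. Every number below is a hypothetical planning input, not a measured Trigger.dev figure, and estimateCapacity is an illustrative helper that assumes one checkpoint write per completed step.

```typescript
// All inputs are hypothetical planning numbers, not measured figures.
interface CapacityInputs {
  dbWritesPerSecond: number; // sustained write capacity of the database
  stepsPerTask: number;      // one checkpoint write per completed step
  checkpointWriteMs: number; // latency added by each checkpoint write
}

function estimateCapacity(input: CapacityInputs) {
  return {
    // Each task consumes stepsPerTask writes, so writes bound task throughput.
    maxTasksPerSecond: input.dbWritesPerSecond / input.stepsPerTask,
    // Checkpoint writes happen in sequence within a task, so latency adds up.
    addedLatencyMsPerTask: input.stepsPerTask * input.checkpointWriteMs,
  };
}

const capacityEstimate = estimateCapacity({
  dbWritesPerSecond: 2000,
  stepsPerTask: 5,
  checkpointWriteMs: 4,
});
console.log(capacityEstimate); // { maxTasksPerSecond: 400, addedLatencyMsPerTask: 20 }
```

Under these assumed inputs, doubling the number of steps per task halves the throughput ceiling and doubles the added latency, which is why step granularity is a tuning decision.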

Failure Modes and Edge Cases

Checkpoint Corruption

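If a step performs part of an external write before crashing, no checkpoint is recorded for that step, so the external system can be left out of sync with the last checkpoint and the retry runs the whole step again. Trigger.dev doesn’t detect this automatically. You need to design steps to be idempotent:

// Bad: non-idempotent step (a retry increments a second time)
await updateDatabase.run({ userId, increment: 1 });

// Good: idempotent step (a retry writes the same absolute value again).
// currentValue must come from an earlier, checkpointed step so that it is
// identical on replay; re-reading it inside the retry would not be idempotent.
await updateDatabase.run({ userId, setValue: currentValue + 1 });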
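When an operation is inherently an increment, another common way to make it safe is an idempotency key. The sketch below is an assumption-laden illustration: counters, appliedKeys, and incrementOnce are hypothetical, and the in-memory collections stand in for database tables.

```typescript
// In-memory stand-ins for a counters table and a table of applied operation keys.
const counters = new Map<string, number>();
const appliedKeys = new Set<string>();

// Applies the increment at most once per idempotency key. A real store would
// record the key and update the counter in the same transaction.
function incrementOnce(userId: string, amount: number, idempotencyKey: string): number {
  const current = counters.get(userId) ?? 0;
  if (appliedKeys.has(idempotencyKey)) return current; // replayed step: no-op
  appliedKeys.add(idempotencyKey);
  counters.set(userId, current + amount);
  return current + amount;
}

// Deriving the key from the run ID and step name keeps it stable across retries.
const incrementKey = "run_42:increment-login-count";
incrementOnce("user_1", 1, incrementKey);
incrementOnce("user_1", 1, incrementKey); // retry after a crash, same key
console.log(counters.get("user_1")); // 1
```

A different run, or a different step in the same run, would use a different key and therefore apply its own increment.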

Non-Deterministic Code

If your task uses Date.now() or Math.random() across steps, replay produces different results. Trigger.dev doesn’t enforce determinism like Temporal does. You must manually hoist non-deterministic values to task inputs:

export const scheduleEmail = task({
  id: "schedule-email",
  run: async ({ userId, timestamp }: { userId: string, timestamp: number }) => {
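    // The target time comes from the input, so every replay aims at the same instant.
    // Date.now() is used only to compute the remaining wait, clamped at zero
    // in case the target has already passed.
    const delay = Math.max(0, timestamp - Date.now());
    // sleep is assumed to be a promise-based delay helper defined elsewhere
    await sleep(delay);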
    await sendEmail.run({ userId });
  }
});
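An alternative to passing every non-deterministic value in as input is to record it the first time it is produced and read it back on replay. The sketch below shows the idea with a hypothetical ValueRecorder; it is not a Trigger.dev API, and the plain object it wraps stands in for persisted checkpoint data.

```typescript
// Hypothetical recorder: the first execution stores each value under a key,
// and a replay reads the stored value back instead of recomputing it.
class ValueRecorder {
  private recorded: Record<string, unknown>;

  constructor(recorded: Record<string, unknown> = {}) {
    this.recorded = recorded;
  }

  capture<T>(key: string, produce: () => T): T {
    if (!(key in this.recorded)) this.recorded[key] = produce();
    return this.recorded[key] as T;
  }

  // JSON round trip mimics what would be written to durable storage.
  snapshot(): Record<string, unknown> {
    return JSON.parse(JSON.stringify(this.recorded));
  }
}

function planEmail(recorder: ValueRecorder) {
  const sendAt = recorder.capture("sendAt", () => Date.now() + 60000);
  const variant = recorder.capture("variant", () => (Math.random() < 0.5 ? "A" : "B"));
  return { sendAt, variant };
}

const firstRecorder = new ValueRecorder();
const firstPlan = planEmail(firstRecorder);
// Simulate a crash and replay using the persisted snapshot.
const replayPlan = planEmail(new ValueRecorder(firstRecorder.snapshot()));
console.log(replayPlan.sendAt === firstPlan.sendAt && replayPlan.variant === firstPlan.variant); // true
```

The replay never calls Date.now() or Math.random() again, so both attempts agree on the send time and the variant.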

Worker Crashes During Checkpoint Write

If a worker crashes while writing a checkpoint, the transaction may roll back or leave partial state. The task retries from the previous checkpoint. This keeps the engine’s own state consistent, but the interrupted step runs again, so its side effects are duplicated unless the step is idempotent.

When to Use Trigger.dev vs. Temporal

Scenario                          | Trigger.dev                    | Temporal
----------------------------------|--------------------------------|-----------------------------
TypeScript-only stack             | Strong fit                     | Requires gRPC clients
Small to medium task volume       | Strong fit                     | Overkill
Multi-day workflows               | Works with checkpoints         | Native support
Polyglot teams (Go, Java, Python) | Weak fit                       | Strong fit
Need low-latency execution        | Weak fit (checkpoint overhead) | Strong fit (in-memory state)
Existing Kubernetes cluster       | Self-host easily               | Requires Temporal cluster

Trigger.dev makes sense when you want durable execution without adding a second runtime. It’s a good fit for AI agent workflows, batch processing, and scheduled tasks where TypeScript is already your primary language.

Avoid Trigger.dev if you need:

  • Very low-latency task execution (checkpoint writes add measurable overhead)
  • Extremely high concurrent workflow counts (database becomes a bottleneck)
  • Polyglot execution (Temporal supports Go, Java, Python, PHP, .NET)

Technical Verdict

Trigger.dev delivers durable execution semantics in a TypeScript-native package. The checkpoint-based replay model is simpler than Temporal’s event sourcing but imposes serialization constraints and a lower scale ceiling. For teams running Node.js services who need reliable orchestration without deploying a Go cluster, it’s a pragmatic choice. The trade-off is operational simplicity versus raw throughput and scale.

Use Trigger.dev when your orchestration needs fit within a single-language stack and you value operational simplicity over maximum performance. Avoid it if you need extremely low latency, polyglot execution, or millions of concurrent workflows.