Trigger.dev started as a Zapier alternative (745 HN points in February 2023). By October, the team pivoted to V2: a Temporal alternative for TypeScript (172 points). The shift exposes a fundamental infrastructure question for agent builders: when do you need durable execution instead of event-driven webhooks?
The answer lies in state persistence, retry semantics, and how your code survives process crashes.
Why Event Triggers Are Not Enough
Trigger.dev V1 orchestrated external services through webhook chains: you connected APIs, defined triggers, and ran actions. That works for stateless glue code but breaks when a task outlives an HTTP timeout or has to resume partway through a multi-step sequence.
Agent workflows need:
- State that survives restarts. If an LLM call times out after 90 seconds, you want to resume from the last tool call, not replay the entire conversation.
- Retry semantics beyond HTTP 5xx. Rate limits, transient API failures, and partial completions require backoff strategies tied to execution state, not just network errors.
- Observability across steps. You need to see which tool call failed in step 7 of a 15-step research loop, not just “workflow errored.”
Temporal solves this with event sourcing: every state transition is an immutable event. Workflows replay from event logs. The trade-off is operational complexity (running Temporal clusters) and a learning curve around deterministic execution constraints.
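To make the replay idea concrete, here is a minimal sketch of rebuilding workflow state from an append-only event log. The event names and state shape are hypothetical and chosen for illustration; they are not Temporal's actual history format.

```typescript
// Hypothetical event types for illustration; not Temporal's real history schema.
type WorkflowEvent =
  | { type: "ToolCallScheduled"; step: number; tool: string }
  | { type: "ToolCallCompleted"; step: number; result: string }
  | { type: "ToolCallFailed"; step: number; error: string };

interface WorkflowState {
  completedSteps: number;
  results: string[];
  lastError: string | null;
}

// Pure reducer: the same events in the same order always yield the same state.
function applyEvent(state: WorkflowState, event: WorkflowEvent): WorkflowState {
  switch (event.type) {
    case "ToolCallScheduled":
      return state; // scheduling alone changes nothing tracked here
    case "ToolCallCompleted":
      return {
        completedSteps: state.completedSteps + 1,
        results: [...state.results, event.result],
        lastError: null,
      };
    case "ToolCallFailed":
      return { ...state, lastError: event.error };
  }
}

// Rebuild state by folding over the full history from the first event.
function replay(history: readonly WorkflowEvent[]): WorkflowState {
  const initial: WorkflowState = { completedSteps: 0, results: [], lastError: null };
  return history.reduce(applyEvent, initial);
}
```

Because state is only ever derived from the log, the reducer has to be deterministic, which is where the execution constraints mentioned above come from.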
Trigger.dev V2 targets the same use case but optimizes for TypeScript developer ergonomics.
Durable Execution Primitives in TypeScript
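Trigger.dev's SDK exposes task(), a primitive that wraps long-running functions with automatic retries, state checkpointing, and observability hooks. In current releases it is imported from @trigger.dev/sdk/v3; the V2 SDK that shipped with the October 2023 announcement defined jobs with client.defineJob() and io.runTask() instead.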
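import { task } from "@trigger.dev/sdk/v3";
import { generateText, type CoreMessage } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
// Assumed: a local module (hypothetical path) exporting AI SDK tools,
// each defined with an execute handler so generateText can run them.
import { search, browse, analyze } from "./tools";

export const researchAgent = task({
  id: "research-agent",
  run: async ({ topic }: { topic: string }) => {
    const messages: CoreMessage[] = [
      { role: "user", content: `Research: ${topic}` },
    ];
    // Up to 10 rounds; each round allows up to 5 model/tool steps.
    for (let i = 0; i < 10; i++) {
      const { text, toolCalls, steps, response } = await generateText({
        model: anthropic("claude-opus-4-20250514"),
        system: "You are a research assistant with web access.",
        messages,
        tools: { search, browse, analyze },
        maxSteps: 5,
      });
      // No tool calls in the last step means the model gave its final answer.
      if (toolCalls.length === 0) {
        return { summary: text, stepsUsed: steps.length };
      }
      // generateText already executed the tools; carry its assistant and tool
      // messages forward instead of hand-building tool messages.
      messages.push(...response.messages);
    }
    throw new Error("Research agent hit the round limit without a final answer");
  },
});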
Key plumbing details:
- Automatic checkpointing. After each await, Trigger.dev persists execution state. If the process crashes, the task resumes from the last checkpoint, not the function start.
- Retry configuration. You define retry policies (exponential backoff, max attempts, jitter) at the task level. Failures trigger retries without re-executing completed steps.
- Observability by default. Each task execution generates trace spans. You see tool call latency, retry attempts, and failure reasons in the dashboard without instrumenting your code.
This differs from queue-based approaches (BullMQ, Celery) where you manually manage state serialization and idempotency. It also differs from Temporal, where workflow code must be deterministic: network calls, unseeded randomness, and wall-clock reads have to go through activities or SDK-provided deterministic equivalents.
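The retry policy described in the list above (exponential backoff, max attempts, jitter) comes down to a small delay calculation. The sketch below shows one way to compute it; the option names are hypothetical and are not taken from the Trigger.dev SDK, so check the SDK documentation for the real configuration shape.

```typescript
// Hypothetical option names for illustration only.
interface RetryPolicy {
  maxAttempts: number; // total attempts, including the first one
  baseDelayMs: number; // delay after the first failed attempt
  factor: number;      // multiplier applied for each further attempt
  maxDelayMs: number;  // upper bound on any single delay
  jitter: boolean;     // randomize delays so retries do not synchronize
}

// Delay to wait after `attempt` (1-based) has failed.
// Returns null once the attempt budget is exhausted.
function nextDelayMs(
  policy: RetryPolicy,
  attempt: number,
  random: () => number = Math.random,
): number | null {
  if (attempt >= policy.maxAttempts) return null;
  const exponential = policy.baseDelayMs * Math.pow(policy.factor, attempt - 1);
  const capped = Math.min(exponential, policy.maxDelayMs);
  // "Full jitter": pick uniformly in [0, capped).
  return policy.jitter ? Math.floor(random() * capped) : capped;
}
```

Passing the random source as a parameter keeps the function testable; in production the default Math.random is enough.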
State Persistence: Event Sourcing vs. Snapshot Checkpoints
Temporal rebuilds workflow state by replaying all events from the start. This guarantees correctness but requires strict determinism. You cannot call external APIs directly inside a Temporal workflow; you must use activities (separate functions with their own retry logic).
Trigger.dev uses snapshot checkpoints. After each async boundary, it serializes the current execution context (variables, call stack position, pending promises) and stores it. On resume, it deserializes the snapshot and continues.
| Aspect | Temporal (Event Sourcing) | Trigger.dev (Snapshots) |
|---|---|---|
| State rebuild | Replay all events from start | Deserialize last checkpoint |
| Determinism requirement | Strict (workflow code must replay identically; side effects go in activities) | Relaxed (checkpoints capture state) |
| External API calls | Must use activities | Allowed in task body |
| Replay cost | Grows with workflow history | Constant (one snapshot load) |
| Debugging | Full event history | Checkpoint + logs |
For agent workflows with dozens of LLM calls and tool invocations, snapshot checkpoints reduce replay overhead. You pay with less granular history (you see checkpoints, not every state transition).
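The snapshot side of the comparison can be sketched in a few lines. This is an illustration under stated assumptions, not Trigger.dev's implementation: the store is an in-memory stand-in for a durable database, state is serialized as JSON, and `crashAt` exists only to simulate a process crash.

```typescript
// Hypothetical in-memory stand-in for a durable snapshot store.
interface SnapshotStore {
  save(runId: string, snapshot: string): void;
  load(runId: string): string | undefined;
}

class MemoryStore implements SnapshotStore {
  private data = new Map<string, string>();
  save(runId: string, snapshot: string): void {
    this.data.set(runId, snapshot);
  }
  load(runId: string): string | undefined {
    return this.data.get(runId);
  }
}

interface LoopState {
  nextStep: number; // index of the first step that has not completed
  notes: string[];  // accumulated step outputs
}

// Runs steps in order and snapshots after each one.
// On restart, completed steps are skipped rather than replayed.
function runWithSnapshots(
  runId: string,
  steps: Array<() => string>,
  store: SnapshotStore,
  crashAt?: number, // simulation only: throw before this step index
): LoopState {
  const saved = store.load(runId);
  const state: LoopState = saved
    ? (JSON.parse(saved) as LoopState)
    : { nextStep: 0, notes: [] };
  steps.forEach((step, i) => {
    if (i < state.nextStep) return; // already captured in the snapshot
    if (i === crashAt) throw new Error(`simulated crash before step ${i}`);
    state.notes.push(step());
    state.nextStep = i + 1;
    store.save(runId, JSON.stringify(state));
  });
  return state;
}
```

Resuming costs one snapshot load regardless of how many steps ran before, which is the constant replay cost in the table. The history is correspondingly thinner: only the latest snapshot survives.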
Retry Semantics and Failure Modes
Trigger.dev exposes three retry knobs:
- Task-level retries. If the entire task fails (unhandled exception, timeout), retry the whole function with exponential backoff.
- Step-level retries. Wrap individual operations (API calls, database writes) in retry logic without restarting the task.
- Manual checkpoints. Call checkpoint() to force a state snapshot before risky operations.
Example failure mode: an agent calls a rate-limited API in step 5 of 10. With task-level retries, you re-execute steps 1-4 (wasted LLM tokens). With step-level retries, you retry only the API call. With manual checkpoints, you snapshot before step 5, so retries resume from there.
Temporal handles this with activities: each activity has independent retry policies. The workflow orchestrates activities but does not execute them. Trigger.dev collapses this into a single task abstraction, trading separation of concerns for simpler code.
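The difference between task-level and step-level retries in the failure example can be illustrated with a keyed step cache. The helper below is a hypothetical sketch, not an API from Trigger.dev or Temporal. It is synchronous to stay short; real steps would be async and would back off between attempts.

```typescript
// Hypothetical helper; results of completed steps are remembered by key.
type StepCache = Map<string, unknown>;

// Runs `fn` up to `maxAttempts` times. A cached result short-circuits the call,
// so a later task-level retry does not redo steps that already succeeded.
function runStep<T>(
  cache: StepCache,
  key: string,
  maxAttempts: number,
  fn: () => T,
): T {
  if (cache.has(key)) return cache.get(key) as T;
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const result = fn();
      cache.set(key, result);
      return result;
    } catch (err) {
      lastError = err; // a real implementation would wait before retrying
    }
  }
  throw lastError;
}
```

With this shape, a rate-limited call in step 5 is retried on its own, and if the whole task is retried later, steps 1-4 return their cached results instead of spending LLM tokens again. The cache would need to be durable for that to hold across process restarts.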
Deployment Shape and Observability
Trigger.dev V2 runs as a hosted platform or self-hosted Docker container. The architecture:
- Task executor. Runs your TypeScript code in isolated Node.js processes. Checkpoints state to Postgres or Redis.
- Scheduler. Manages task queues, retry timers, and concurrency limits.
- Dashboard. Visualizes task traces, retry attempts, and failure logs.
For self-hosting, you deploy:
services:
  trigger:
    image: trigger.dev/v2:latest
    environment:
      DATABASE_URL: postgres://...
      REDIS_URL: redis://...
    ports:
      - "3000:3000"
Observability hooks into OpenTelemetry. Each task execution emits spans for:
- Task start/end
- Checkpoint writes
- Retry attempts
- Tool calls (if instrumented)
You can export traces to any OpenTelemetry-compatible backend, such as Datadog, Honeycomb, or Grafana. This differs from Temporal’s built-in UI, which requires running the Temporal Web service.
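To show what a per-tool-call span carries, here is a minimal stand-in recorder. It deliberately does not use the OpenTelemetry API: the `SpanRecord` shape and `withSpan` helper are hypothetical and only mirror the kind of data (name, attributes, status, duration) a real span would hold.

```typescript
// Hypothetical span shape; a stand-in for what a tracing backend would receive.
interface SpanRecord {
  name: string;
  attributes: Record<string, string | number>;
  status: "ok" | "error";
  durationMs: number;
  error?: string;
}

// In-memory sink standing in for an exporter.
const recorded: SpanRecord[] = [];

// Wraps a call, recording its outcome and duration whether it returns or throws.
function withSpan<T>(
  name: string,
  attributes: Record<string, string | number>,
  fn: () => T,
): T {
  const start = Date.now();
  try {
    const result = fn();
    recorded.push({ name, attributes, status: "ok", durationMs: Date.now() - start });
    return result;
  } catch (err) {
    recorded.push({
      name,
      attributes,
      status: "error",
      durationMs: Date.now() - start,
      error: err instanceof Error ? err.message : String(err),
    });
    throw err; // the span records the failure; the caller still sees it
  }
}
```

Tagging each span with the step number is what lets you find the failing tool call in step 7 of a 15-step loop rather than a generic workflow error.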
When to Use Trigger.dev vs. Temporal
Use Trigger.dev when:
- You write TypeScript and want minimal operational overhead.
- Your workflows call external APIs frequently (LLM providers, search APIs, databases).
- You need fast iteration without learning event sourcing constraints.
- You run on a hosted platform or simple Docker setup.
Use Temporal when:
- You need strict correctness guarantees (financial transactions, compliance workflows).
- You run polyglot services (Go workers, Python activities, Java orchestrators).
- You already operate Kubernetes clusters and can absorb Temporal’s operational cost.
- You want full event history for auditing or debugging.
Avoid both when:
- Your tasks complete in under 30 seconds (use a simple queue like BullMQ).
- You do not need retries or state persistence (use HTTP endpoints).
- You want serverless execution without managing infrastructure (use Step Functions or Inngest).
Trade-Offs for Agent Workflows
Agent orchestration stresses durable execution engines in specific ways:
- Non-deterministic LLM outputs. Each retry may produce different tool calls. Snapshot checkpoints handle this naturally; event sourcing requires careful activity design.
- Long-running research loops. Agents may run for minutes or hours. Checkpoint overhead matters less than replay cost.
- Tool call observability. You need to see which tool failed and why. Trace spans per tool call beat replaying event logs.
Trigger.dev’s snapshot model fits these constraints better than Temporal’s strict determinism. The cost is less granular history and weaker guarantees around exactly-once execution (checkpoints can fail mid-write).
Technical Verdict
Trigger.dev V2 is a pragmatic durable execution engine for TypeScript teams building agent workflows. It trades Temporal’s correctness guarantees for simpler developer ergonomics and lower operational overhead.
Use it when you need resumable workflows with automatic retries but do not want to run a Temporal cluster or learn event sourcing constraints. Avoid it when you need strict exactly-once semantics, polyglot workers, or full event history for compliance.
The pivot from event triggers (V1) to durable execution (V2) reflects a broader shift: agent infrastructure needs workflows that survive failures, not just webhook chains. Snapshot checkpoints and TypeScript-native primitives make this accessible without distributed systems expertise.
Source Links
- Trigger.dev V2 Announcement: https://news.ycombinator.com/item?id=37750763
- Trigger.dev Documentation: https://trigger.dev
- Original V1 Show HN (Zapier Alternative): https://news.ycombinator.com/item?id=34610686