Trigger.dev started as a TypeScript-native Zapier alternative (745 points, Feb 2023), then pivoted to durable execution after user feedback revealed what developers actually needed: workflows that survive crashes, retry intelligently, and orchestrate long-running tasks without manual checkpointing. The V2 announcement (172 points, Oct 2023) positioned it explicitly as a Temporal alternative for TypeScript teams.
The shift exposes a fundamental split in workflow orchestration. Event-driven systems like Zapier chain discrete steps with webhook glue. Durable execution systems like Temporal and Trigger.dev persist workflow state across process restarts, handle retries at the framework level, and let you write imperative code that looks synchronous but survives infrastructure failures.
State Persistence Without Checkpointing Code
Trigger.dev persists workflow state by intercepting async operations and logging them to a durable event log. When a task crashes mid-execution, the runtime replays the log on restart and reconstructs in-memory state up to the failure point.
How it works:
- Every `await` boundary in your task becomes a checkpoint
- The runtime logs each async call (HTTP request, database query, tool invocation) with its result
- On replay, logged results are returned immediately without re-executing side effects
- Only new operations after the last checkpoint actually run
This means you write normal TypeScript async/await code. The framework handles durability:
export const processOrder = task({
id: "process-order",
run: async ({ orderId }: { orderId: string }) => {
// Each await is a checkpoint
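const order = await db.orders.findUnique({ where: { id: orderId } });
// A Prisma-style findUnique resolves to null when no row matches
if (!order) throw new Error(`Order ${orderId} not found`);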
// If crash happens here, replay starts from this point
const payment = await stripe.charges.create({
amount: order.total,
currency: "usd",
});
// State persisted across this boundary too
await db.orders.update({
where: { id: orderId },
data: { paymentId: payment.id, status: "paid" },
});
return { orderId, paymentId: payment.id };
},
});
If the process dies after the Stripe charge but before the database update, replay will skip the Stripe call (using the logged result) and proceed directly to the update.
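The skip-on-replay behavior described above can be illustrated with a small self-contained sketch. It is not Trigger.dev's implementation: `StepLog`, `step`, and the JSON string standing in for durable storage are hypothetical names chosen for illustration, and the payment call is simulated.

```typescript
// Hypothetical sketch of log-based replay; not Trigger.dev's actual internals.
type LogEntries = Record<string, unknown>;

class StepLog {
  private entries: LogEntries;

  // `serialized` stands in for whatever durable storage a real runtime would use.
  constructor(serialized = "{}") {
    this.entries = JSON.parse(serialized) as LogEntries;
  }

  // Run `fn` once per key; on replay, return the logged result instead.
  async step<T>(key: string, fn: () => Promise<T>): Promise<T> {
    if (Object.prototype.hasOwnProperty.call(this.entries, key)) {
      return this.entries[key] as T;
    }
    const result = await fn();
    this.entries[key] = result;
    return result;
  }

  serialize(): string {
    return JSON.stringify(this.entries);
  }
}

// Counts real executions of the simulated payment call.
let chargeCalls = 0;

async function processOrderSketch(log: StepLog, crashAfterCharge: boolean): Promise<string> {
  const paymentId = await log.step("charge", async () => {
    chargeCalls += 1;
    return "pay_123";
  });
  if (crashAfterCharge) {
    throw new Error("simulated crash before the database update");
  }
  return log.step("update-order", async () => `order paid with ${paymentId}`);
}

async function demo(): Promise<{ result: string; chargeCalls: number }> {
  const firstLog = new StepLog();
  // First run: the charge succeeds, then the process "dies".
  await processOrderSketch(firstLog, true).catch(() => undefined);
  // Restart: rebuild the log from its serialized form and replay.
  const replayLog = new StepLog(firstLog.serialize());
  const result = await processOrderSketch(replayLog, false);
  return { result, chargeCalls };
}
```

In this sketch the second run reads the charge result from the rebuilt log, so the simulated payment executes only once even though the function body runs twice.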
Retry Primitives vs. Temporal’s Activity Model
Temporal separates workflows (orchestration logic) from activities (side-effecting operations). Activities get explicit retry policies, heartbeats for long-running work, and timeout controls. Workflows are deterministic replay machines.
Trigger.dev collapses this distinction. Everything is a task. Retries happen at the task level with exponential backoff:
export const flakyScrape = task({
id: "scrape-with-retries",
retry: {
maxAttempts: 5,
factor: 2,
minTimeoutInMs: 1000,
maxTimeoutInMs: 60000,
},
run: async ({ url }: { url: string }) => {
const page = await browser.newPage();
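await page.goto(url); // Retries the task if navigation throws
// goto resolves to a response object, so read the HTML separately
// (assumes a Puppeteer- or Playwright-style page)
return await page.content();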
},
});
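To make the retry settings above concrete, here is a rough sketch of the wait schedule they would produce. It assumes that `maxAttempts` counts the first try, that the delay is `minTimeoutInMs * factor^n` capped at `maxTimeoutInMs`, and that no jitter is applied; `backoffDelays` is a hypothetical helper, not part of any SDK.

```typescript
// Hypothetical helper: computes the wait before each retry, without jitter.
interface RetryOptions {
  maxAttempts: number;
  factor: number;
  minTimeoutInMs: number;
  maxTimeoutInMs: number;
}

function backoffDelays(opts: RetryOptions): number[] {
  const delays: number[] = [];
  // Assumption: maxAttempts includes the first try, so there are maxAttempts - 1 retries.
  for (let retry = 0; retry < opts.maxAttempts - 1; retry++) {
    const raw = opts.minTimeoutInMs * Math.pow(opts.factor, retry);
    delays.push(Math.min(raw, opts.maxTimeoutInMs));
  }
  return delays;
}

const delays = backoffDelays({
  maxAttempts: 5,
  factor: 2,
  minTimeoutInMs: 1000,
  maxTimeoutInMs: 60000,
});
console.log(delays); // [1000, 2000, 4000, 8000]
```

Under these assumptions the five attempts are spread over roughly 15 seconds of waiting, and the 60-second cap never comes into play.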
Key differences:
| Feature | Temporal | Trigger.dev |
|---|---|---|
| Retry scope | Per-activity, configurable | Per-task, exponential backoff |
| Heartbeats | Explicit activity heartbeats for progress tracking | No native heartbeat mechanism |
| Timeouts | Separate schedule-to-close, start-to-close, heartbeat | Single task timeout |
| Determinism | Strict: workflows must be deterministic | Relaxed: tasks can have side effects |
| Versioning | Workflow versioning with compatibility checks | Task versioning via id changes |
Temporal’s activity heartbeats let long-running work signal progress and detect worker failures. Trigger.dev expects tasks to complete or fail within their timeout window. For multi-hour operations, you’d split into smaller tasks or poll external state.
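One way to picture the "split into smaller tasks" advice is a batching helper whose output could be handed to separate task runs. This is only a sketch: `toBatches` and the segment names are hypothetical, and how each batch is actually triggered is left out.

```typescript
// Hypothetical helper: split a long job into batches small enough
// that each one can finish inside a single task timeout.
function toBatches<T>(items: readonly T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Ten video segments, processed four at a time.
const segments = Array.from({ length: 10 }, (_, i) => `segment-${i}`);
const batches = toBatches(segments, 4);
console.log(batches.map((b) => b.length)); // [4, 4, 2]
```

Each batch then fails and retries on its own, so a crash late in the job only repeats one batch rather than hours of work.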
The relaxed determinism in Trigger.dev is intentional for agent workflows where tool calls naturally have side effects. Agent orchestration often involves calling external APIs, invoking LLMs, or interacting with databases in ways that cannot be purely deterministic. Trigger.dev optimizes for developer velocity in these scenarios, accepting the trade-off that replay behavior may differ slightly from the original execution if non-deterministic operations are involved. This flexibility accelerates development but requires discipline in financial or compliance contexts where strict determinism prevents subtle replay bugs.
TypeScript Async/Await Mapping to Durable Semantics
When a Promise rejects mid-workflow, Trigger.dev logs the error and triggers retry logic. On replay, operations that succeeded before the failure return their logged results, so they are not executed a second time.
Promise rejection handling:
- Task throws or rejects a Promise
- Runtime logs the failure with stack trace
- Retry policy determines if another attempt happens
- On retry, replay skips successful checkpoints and re-executes from failure point
Concrete example of replay behavior:
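import Stripe from "stripe";
import { task } from "@trigger.dev/sdk/v3"; // v3 SDK import path
// Fail fast if the key is missing instead of passing undefined to the SDK
const stripeKey = process.env.STRIPE_KEY;
if (!stripeKey) throw new Error("STRIPE_KEY is not set");
const stripe = new Stripe(stripeKey);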
export const multiStepTask = task({
id: "multi-step-example",
retry: { maxAttempts: 3 },
run: async ({ userId }: { userId: string }) => {
// Step 1: Fetch user (succeeds)
const user = await db.users.findUnique({ where: { id: userId } });
console.log("Fetched user:", user.email);
// Step 2: Call external API (throws on first attempt)
const charge = await stripe.charges.create({
amount: 1000,
currency: "usd",
source: user.stripeToken,
});
console.log("Charge result:", charge.id);
// Step 3: Update database
await db.users.update({
where: { id: userId },
data: { processed: true },
});
return { success: true };
},
});
What happens on failure and replay:
- First execution: Step 1 succeeds, logs user fetch. Step 2 throws `StripeCardError`. Runtime logs the error and schedules retry.
- Second execution (replay): Step 1 returns cached user from log without hitting database. Console log does NOT print again. Step 2 re-executes the Stripe call. If it succeeds, Step 3 runs fresh.
- Key insight: Side effects before the failure point (console logs, database reads) are not re-executed. Only the failed operation and everything after it run again.
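The attempt-by-attempt behavior in the list above can be traced with a small simulation. This is a sketch under stated assumptions, not the runtime's real mechanism: `runWithRetries`, the `Step` type, and the in-memory cache are hypothetical, and the steps are stand-ins for the database and Stripe calls.

```typescript
// Hypothetical simulation of retry-with-replay; names are illustrative only.
type Step = <R>(key: string, fn: () => Promise<R>) => Promise<R>;

async function runWithRetries<T>(
  maxAttempts: number,
  body: (step: Step, attempt: number) => Promise<T>,
  trace: string[],
): Promise<T> {
  const cache = new Map<string, unknown>(); // survives across attempts
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const step: Step = async <R>(key: string, fn: () => Promise<R>): Promise<R> => {
      if (cache.has(key)) {
        trace.push(`attempt ${attempt}: ${key} (from log)`);
        return cache.get(key) as R;
      }
      const value = await fn(); // a throw here leaves the cache untouched
      cache.set(key, value);
      trace.push(`attempt ${attempt}: ${key} (executed)`);
      return value;
    };
    try {
      return await body(step, attempt);
    } catch (error) {
      lastError = error;
      trace.push(`attempt ${attempt}: failed`);
    }
  }
  throw lastError;
}

async function demo(): Promise<string[]> {
  const trace: string[] = [];
  await runWithRetries(
    3,
    async (step, attempt) => {
      await step("fetch-user", async () => ({ email: "a@example.com" }));
      await step("charge", async () => {
        if (attempt === 1) throw new Error("card declined"); // fails once
        return "ch_1";
      });
      await step("update-user", async () => true);
    },
    trace,
  );
  return trace;
}
```

Running `demo()` yields a trace in which `fetch-user` is executed on attempt 1 and served from the log on attempt 2, while `charge` and `update-user` execute only on attempt 2.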
Non-determinism risks:
Trigger.dev allows side effects in task code (database writes, API calls). This differs from Temporal’s strict determinism requirement. If you call Math.random() or Date.now() in a task, replay will produce different values.
Concrete failure scenario: If you generate a random order ID using Math.random() and insert it into a database, replay will generate a different ID. If the original execution inserted the first ID before crashing, replay inserts a second ID, creating duplicate records. The database now has two orders for the same user action.
Workaround: use the task’s context for deterministic values:
export const scheduledReport = task({
id: "daily-report",
run: async (payload, { ctx }) => {
// ctx.run.startedAt is deterministic across replays
const reportDate = ctx.run.startedAt;
// Don't use Date.now() directly
const data = await fetchData(reportDate);
return generateReport(data);
},
});
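The duplicate-record scenario can also be guarded by deriving the identifier from the task input rather than from `Math.random()`. The sketch below assumes the caller supplies a stable `requestId`; `idempotencyKey`, `insertOrderOnce`, and the `Map` standing in for a table with a unique constraint are all hypothetical.

```typescript
// Hypothetical helper: the same logical action always maps to the same key,
// so a replayed insert can be detected instead of creating a second row.
function idempotencyKey(userId: string, action: string, requestId: string): string {
  return [userId, action, requestId].map(encodeURIComponent).join(":");
}

// Stand-in for a table with a unique constraint on the key column.
const orders = new Map<string, { userId: string }>();

function insertOrderOnce(userId: string, requestId: string): boolean {
  const key = idempotencyKey(userId, "create-order", requestId);
  if (orders.has(key)) {
    return false; // replayed insert becomes a no-op
  }
  orders.set(key, { userId });
  return true;
}

const firstInsert = insertOrderOnce("user-1", "req-42");
const replayedInsert = insertOrderOnce("user-1", "req-42");
console.log(firstInsert, replayedInsert, orders.size); // true false 1
```

With a real database the same idea is usually expressed as a unique index plus an upsert, so the replay path cannot create the second order.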
Deployment and Scaling Shape
Trigger.dev workers:
- Run as long-lived Node.js processes
- Poll the Trigger.dev platform for task assignments
- Auto-scale based on queue depth (managed service) or manual scaling (self-hosted)
- Each worker can handle multiple concurrent tasks up to memory limits
Temporal workers:
- Run as processes in the SDK's language (Go, Java, TypeScript, Python)
- Poll Temporal server for workflow and activity tasks
- Can host workflows and activities in the same worker or split them into separate pools
- Horizontal scaling via worker count, vertical via task slots per worker
Infrastructure comparison:
| Component | Trigger.dev | Temporal |
|---|---|---|
| Orchestration server | Managed SaaS or self-hosted API | Self-hosted Temporal server cluster (requires 3+ nodes for HA) or Temporal Cloud |
| State storage | PostgreSQL (managed) or your DB | Cassandra, PostgreSQL, MySQL |
| Worker runtime | Node.js/Bun | Go, Java, TypeScript, Python |
| Observability | Built-in dashboard | Temporal Web UI + custom metrics |
| Deployment | Docker container or serverless | Kubernetes, VMs, or containers |
| Operational burden | Zero cluster ops (managed) or single-service deployment (self-hosted) | Requires cluster management, monitoring, and database tuning |
Trigger.dev’s managed service handles the orchestration layer. You deploy workers as containers or serverless functions. Temporal requires running the server cluster yourself (or using Temporal Cloud), plus managing worker deployments. For teams with existing Kubernetes infrastructure, Temporal’s container-native design reduces operational overhead. Trigger.dev’s managed service eliminates cluster management but adds vendor lock-in risk.
Versioning Long-Running Workflows
When workflow code changes while instances are still executing, you need a versioning strategy.
Trigger.dev approach:
- Task `id` acts as the version identifier
- Changing the task code without changing `id` applies to new runs only
- Running tasks continue with the code version they started with
- To force migration, create a new task with a new `id` and trigger it from the old one
Temporal approach:
- Workflow type name + version number
- Supports patching: conditional logic based on whether a workflow started before or after a code change
- Worker can run multiple versions simultaneously
- More complex but handles gradual migration
Example of Trigger.dev task evolution:
// V1: original task
export const processUserV1 = task({
id: "process-user-v1",
run: async ({ userId }) => {
const user = await db.users.findUnique({ where: { id: userId } });
await sendEmail(user.email, "Welcome!");
},
});
// V2: added validation step
export const processUserV2 = task({
id: "process-user-v2",
run: async ({ userId }) => {
const user = await db.users.findUnique({ where: { id: userId } });
// New validation logic
if (!user.emailVerified) {
throw new Error("Email not verified");
}
await sendEmail(user.email, "Welcome!");
},
});
Running V1 tasks complete with the old logic. New triggers use V2 once callers reference the new id. There is no automatic migration path.
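The pinning behavior described here can be sketched as a lookup keyed by task id. Everything in the block is hypothetical (the `handlers` registry, `LATEST_TASK_ID`, `startRun`, `resumeRun`); it only illustrates the routing idea, not how the platform stores versions.

```typescript
// Hypothetical registry sketch: in-flight runs stay pinned to the id they
// started with, while new triggers are routed to the newest id.
type Handler = (userId: string) => string;

const handlers: Record<string, Handler> = {
  "process-user-v1": (userId) => `v1 welcomed ${userId}`,
  "process-user-v2": (userId) => `v2 validated and welcomed ${userId}`,
};

const LATEST_TASK_ID = "process-user-v2";

interface Run {
  taskId: string;
  userId: string;
}

// New runs always record the latest id at trigger time.
function startRun(userId: string): Run {
  return { taskId: LATEST_TASK_ID, userId };
}

// Resuming looks up the handler by the id stored on the run, not the latest one.
function resumeRun(run: Run): string {
  const handler = handlers[run.taskId];
  if (!handler) {
    throw new Error(`no handler registered for ${run.taskId}`);
  }
  return handler(run.userId);
}

const inFlight: Run = { taskId: "process-user-v1", userId: "u1" };
console.log(resumeRun(inFlight)); // v1 welcomed u1
console.log(resumeRun(startRun("u2"))); // v2 validated and welcomed u2
```

The sketch also shows the cost of this approach: the V1 handler has to stay deployed until every run that recorded its id has finished.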
Observability and Failure Modes
Trigger.dev provides a dashboard showing:
- Task execution timeline with checkpoint visibility
- Retry attempts and backoff progression
- Logs from each task run
- Queue depth and worker utilization
Dashboard comparison:
| Feature | Trigger.dev Dashboard | Temporal Web UI |
|---|---|---|
| Checkpoint visibility | Shows each await boundary with timestamp | Shows activity boundaries and decisions |
| Retry timeline | Visual timeline with backoff intervals | Activity retry history with attempt details |
| State inspection | JSON view of logged results per checkpoint | Workflow history events with payloads |
| Live logs | Streamed console output from tasks | Requires custom logging integration |
| Queue metrics | Built-in queue depth and worker utilization | Requires Prometheus + custom dashboards |
Trigger.dev’s dashboard is opinionated and batteries-included. Temporal Web UI is powerful but requires more instrumentation for production observability. Use Trigger.dev’s dashboard for quick debugging and real-time monitoring. Use Temporal’s event history for forensic replay analysis and deep inspection of workflow decision paths.
Common failure modes:
- Worker crashes mid-task: Replay from last checkpoint when worker restarts
- Database unavailable during checkpoint: Task retries with backoff
- Non-deterministic code on replay: Different execution path, potential state corruption
- Memory exhaustion from large state: Task fails, requires splitting into smaller tasks
- Version mismatch during replay: Running task continues with original code version
Monitoring strategy:
- Track retry rates per task type
- Alert on tasks exceeding max attempts
- Monitor checkpoint frequency (too many = performance hit, too few = large replay windows)
- Watch for non-determinism by comparing replay logs to original execution
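The first two monitoring bullets can be approximated from exported run records. The sketch below assumes a record shape (`taskId`, `attempts`, `succeeded`) that is purely illustrative; real field names would depend on whatever your dashboard or API exposes.

```typescript
// Hypothetical record shape; adapt to the fields your run export provides.
interface RunRecord {
  taskId: string;
  attempts: number;
  succeeded: boolean;
}

interface TaskStats {
  runs: number;
  retriedRuns: number; // runs that needed more than one attempt
  exhaustedRuns: number; // runs that failed after using every attempt
  retryRate: number; // retriedRuns / runs
}

function summarize(records: readonly RunRecord[], maxAttempts: number): Map<string, TaskStats> {
  const stats = new Map<string, TaskStats>();
  for (const record of records) {
    const entry =
      stats.get(record.taskId) ?? { runs: 0, retriedRuns: 0, exhaustedRuns: 0, retryRate: 0 };
    entry.runs += 1;
    if (record.attempts > 1) entry.retriedRuns += 1;
    if (!record.succeeded && record.attempts >= maxAttempts) entry.exhaustedRuns += 1;
    entry.retryRate = entry.retriedRuns / entry.runs;
    stats.set(record.taskId, entry);
  }
  return stats;
}

const sampleRuns: RunRecord[] = [
  { taskId: "scrape-with-retries", attempts: 1, succeeded: true },
  { taskId: "scrape-with-retries", attempts: 3, succeeded: true },
  { taskId: "scrape-with-retries", attempts: 5, succeeded: false },
  { taskId: "daily-report", attempts: 1, succeeded: true },
];
const summary = summarize(sampleRuns, 5);
```

An alert rule could then fire whenever `exhaustedRuns` is non-zero or `retryRate` crosses a threshold you choose per task type.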
Agent Orchestration Use Case
Durable execution shines for multi-step agent workflows where tool calls can fail, external APIs time out, or human approval gates block progress.
Example: error recovery with fallback tools:
export const dataExtractionAgent = task({
id: "extract-with-fallback",
retry: {
maxAttempts: 3,
factor: 2,
},
run: async ({ documentUrl }: { documentUrl: string }) => {
let extractedData = null;
// Try primary extraction service
try {
extractedData = await primaryExtractor.extract(documentUrl);
} catch (error) {
// Checkpoint before fallback
logger.warn("Primary extractor failed, trying fallback", { error });
// Fallback to OCR + LLM pipeline
const ocrText = await ocrService.process(documentUrl);
extractedData = await llm.generateText({
model: "gpt-4",
prompt: `Extract structured data from: ${ocrText}`,
});
}
// Human approval gate for low-confidence extractions
if (extractedData.confidence < 0.8) {
const approval = await waitForApproval({
data: extractedData,
timeout: "24h",
});
if (!approval.approved) {
throw new Error("Human rejected extraction");
}
extractedData = approval.correctedData;
}
await db.documents.update({
where: { url: documentUrl },
data: { extractedData, status: "processed" },
});
return extractedData;
},
});
If the OCR service times out, the task retries from that checkpoint. The primary extractor failure is logged but not re-attempted. The human approval gate can pause execution for hours without losing state.
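The `waitForApproval` call in the example is not defined in this article, so the following is only a sketch of the timeout semantics it implies. This version is in-memory and would not survive a restart by itself; a durable runtime would persist the pending state instead. The `Approval` type and the function signature are assumptions.

```typescript
// Hypothetical in-memory approval gate with a deadline.
interface Approval {
  approved: boolean;
}

function waitForApproval(decision: Promise<Approval>, timeoutMs: number): Promise<Approval> {
  return new Promise<Approval>((resolve, reject) => {
    // Reject if no decision arrives before the deadline.
    const timer = setTimeout(() => {
      reject(new Error(`no decision within ${timeoutMs} ms`));
    }, timeoutMs);
    decision.then(
      (value) => {
        clearTimeout(timer); // decision arrived in time
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}

async function demoApproval(): Promise<{ approved: boolean; timedOut: boolean }> {
  const granted = await waitForApproval(Promise.resolve({ approved: true }), 50);
  let timedOut = false;
  try {
    // A reviewer who never answers: the promise never settles.
    await waitForApproval(new Promise<Approval>(() => {}), 10);
  } catch {
    timedOut = true;
  }
  return { approved: granted.approved, timedOut };
}
```

The design point is that the gate has exactly two exits, a decision or a deadline, so the calling task always either continues or fails into its retry policy.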
Technical Verdict
Use Trigger.dev when:
- Your team is TypeScript-first and wants to avoid Go or Java
- Task execution time is under 4 hours with per-task state under 50MB
- Retry requirements stay below 10 attempts per task run
- You need durable execution without operating a Temporal cluster
- Agent orchestration requires flexible tool calling with side effects
- Managed service is acceptable (or you can self-host the simpler architecture)
- You want built-in observability without custom Prometheus instrumentation
- Workflow latency tolerance is above 100ms per decision point
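The numeric thresholds in this list can be written down as a quick self-check. The field names and the `unmetCriteria` helper are hypothetical; the limits themselves are copied from the bullets above and carry no more authority than the list does.

```typescript
// Thresholds copied from the list above; field names are hypothetical.
interface WorkloadProfile {
  maxTaskHours: number;
  maxStateMb: number;
  maxRetryAttempts: number;
  latencyToleranceMs: number;
}

// Returns the numeric criteria a workload fails to meet (empty means all are met).
function unmetCriteria(profile: WorkloadProfile): string[] {
  const unmet: string[] = [];
  if (profile.maxTaskHours >= 4) unmet.push("task execution time is not under 4 hours");
  if (profile.maxStateMb >= 50) unmet.push("per-task state is not under 50MB");
  if (profile.maxRetryAttempts >= 10) unmet.push("retry requirement is not below 10 attempts");
  if (profile.latencyToleranceMs <= 100) unmet.push("latency tolerance is not above 100ms");
  return unmet;
}

const agentWorkload: WorkloadProfile = {
  maxTaskHours: 1,
  maxStateMb: 10,
  maxRetryAttempts: 5,
  latencyToleranceMs: 500,
};
console.log(unmetCriteria(agentWorkload)); // []
```

The qualitative bullets (team language, appetite for a managed service, side-effect-heavy tool calls) still need a human judgment call and are deliberately left out of the check.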
Avoid Trigger.dev when:
- You need strict determinism guarantees for financial transactions or compliance workflows
- Workflows exceed 7-day duration or require complex versioning during execution
- You require activity heartbeats for long-running operations (multi-hour video encoding, distributed map-reduce jobs)
- Your team already operates Temporal and has Go expertise
- You need saga patterns or advanced compensation logic for distributed transactions
- Per