Background jobs
Replaces v1's external HTTP cron (cron-job.org) + fire-and-forget post-save hooks.
graphile-worker (boring, proven, Postgres-native, battle-tested at GitHub). Hatchet as newer alternative.
Reasoning: hybrid was wrong shape (two delivery semantics, two dashboards); BullMQ has recurring Redis OOM + stalled-job production footguns; pg-boss is single-maintainer (bus factor 1).
Drops Redis from queue requirement. Sections below describe the new pick. See audit findings.
New library pick: graphile-worker
| Option | Verdict |
|---|---|
| graphile-worker v0.16.6 | Pick. Postgres-native (uses LISTEN/NOTIFY). No Redis. Battle-tested at GitHub. Outbox + cron + retries in one tool. Free open source, optional Pro tier ($100/mo) for live migration. |
| Hatchet | Strong alt. Newer, Postgres-native, durable workflows. Processing >1B tasks/month per their stats. YC-backed. |
| Trigger.dev v4 | Most mature self-host path. $10/mo starting tier. |
| Inngest | Best DX for serverless. Free floor squeezed to $75/mo. |
| BullMQ | Skip. Redis OOM, stalled job recovery, eviction storms documented in production. |
| pg-boss | Skip. Single-maintainer, bare-bones dashboard. |
| River | Skip. Go-only. No Node binding. |
graphile-worker pattern
pnpm --filter @spacehub/workers add graphile-worker
// apps/workers/src/index.ts
import { run } from "graphile-worker";
await run({
connectionString: process.env.DATABASE_URL,
concurrency: 8,
taskList: {
"ebarimt.push": async (payload, helpers) => ebarimtPush(payload),
"sms.send": async (payload, helpers) => smsSend(payload),
"push.send": async (payload, helpers) => fcmSend(payload),
"pdf.render": async (payload, helpers) => renderInvoicePdf(payload),
"bills.monthly": async () => runMonthlyBilling(),
"statement.import": async (payload) => importStatement(payload),
"reconcile": async (payload) => reconcileBatch(payload),
},
crontab: `
0 9 10 * * bills.monthly
0 10 * * * contract.expiry
0 * * * * statement.poll
`,
});
Outbox — same domain transaction (the magic)
graphile-worker's add_job is a Postgres function. Call it from inside your domain transaction → job enqueue commits atomically with the business write. No two-phase commit, no Redis race, no outbox table needed:
await tx.execute(sql`
select graphile_worker.add_job('ebarimt.push', ${JSON.stringify({ invoiceId })})
`);
// tx commits → worker drains → eBarimt POST
If you prefer keeping the explicit outbox table for audit/observability, do it — but the job table itself doubles as one with built-in retry/backoff/archive.
What runs in the background
| Job | Trigger | Queue | Cadence |
|---|---|---|---|
| eBarimt receipt push | Sale committed | pg-boss (outbox) | Immediate, retry exp backoff |
| SMS dispatch | App requests | pg-boss (outbox) | Immediate |
| Push notification | Various | pg-boss (outbox) | Immediate |
| Bill generation (monthly) | Cron | BullMQ scheduler | Day-of-month per property |
| Contract expiry SMS | Cron | BullMQ scheduler | Daily 10:00 Asia/Ulaanbaatar |
| Bank statement poll/import | Cron | BullMQ scheduler | Hourly |
| Statement reconciliation | After import | BullMQ (chained) | On demand |
| PDF render (invoice/statement) | User or batch | BullMQ (concurrency 4-8) | On demand |
| FCM token cleanup | Cron | BullMQ scheduler | Weekly |
| Idempotency-key TTL prune | Cron | BullMQ scheduler | Hourly |
The split — and why
- pg-boss for everything (Postgres-native, job lives in same tx as domain write — true outbox semantics, no Redis).
- BullMQ for everything (richer scheduling, priorities, dashboards, Flow Producer, more mature).
Outbox jobs → pg-boss
The whole point of the outbox pattern is the job INSERT happens inside the same Postgres transaction as the business write. With BullMQ, enqueue is a separate Redis write — succeed/fail independently → ghost jobs or lost jobs at the boundary.
pg-boss gives us:
- INSERT inside the domain transaction (no two-phase commit).
- LISTEN/NOTIFY for low-latency drain (workers don't poll).
- Built-in retries, exp backoff, archive table, singletons.
Cron + general queues → BullMQ 5
For everything that doesn't need txn-atomicity (PDF batch, scheduled SMS, hourly bank poll), BullMQ wins on:
- Job Schedulers (replaces deprecated repeatable jobs in 5.16+) —
queue.upsertJobScheduler('monthly-billing', { pattern: '0 0 10 * *' }, { name: 'run' }). - Concurrency knob — 8-16 for IO, 2-4 for CPU.
- Priorities — user-triggered > cron-triggered.
- Flow Producer — parent → child fanout (e.g. "generate package PDFs" parent, one child per invoice).
- Dashboard via
@bull-board/honomounted at/admin/queues.
Install
pnpm --filter @spacehub/workers add pg-boss bullmq @bull-board/api @bull-board/hono ioredis
Outbox schema
// packages/db/src/schema/outbox.ts
export const outbox = pgTable("outbox", {
id: bigserial().primaryKey(),
aggregateType: text("aggregate_type").notNull(), // 'sale' | 'invoice' | 'payment'
aggregateId: uuid("aggregate_id").notNull(),
eventType: text("event_type").notNull(), // 'ebarimt.push' | 'sms.send' | 'push.send'
payload: jsonb().notNull(),
status: text({ enum: ["pending","sent","dead"] }).notNull().default("pending"),
attempts: integer().notNull().default(0),
nextAttemptAt: timestamp("next_attempt_at", { withTimezone: true }).notNull().defaultNow(),
lastError: text("last_error"),
createdAt: timestamp({ withTimezone: true }).notNull().defaultNow(),
sentAt: timestamp("sent_at", { withTimezone: true }),
}, (t) => [
index("outbox_due").on(t.status, t.nextAttemptAt).where(sql`status = 'pending'`),
]);
Pattern in a route handler:
app.post("/v2/sales", withRls, async (c) => {
const tx = c.get("tx");
const sale = await tx.insert(sales).values(...).returning();
// Outbox row goes in the SAME tx as the sale
await tx.insert(outbox).values({
aggregateType: "sale", aggregateId: sale.id,
eventType: "ebarimt.push", payload: { saleId: sale.id },
});
return c.json(sale);
// Tx commits → outbox row visible → pg-boss worker drains → eBarimt POST
});
Outbox worker (pg-boss)
// apps/workers/src/outbox.ts
import PgBoss from "pg-boss";
import { ebarimtPush, sendSms, sendPush } from "@spacehub/integrations";
const boss = new PgBoss({ connectionString: process.env.DATABASE_URL });
await boss.start();
await boss.work("outbox-drain", { batchSize: 50, pollingIntervalSeconds: 2 }, async (jobs) => {
// jobs handed from pg-boss are decoupled from the domain outbox table —
// we use boss as the *worker scheduler*, the table is the source of truth.
// Alternative: skip pg-boss and use Postgres LISTEN/NOTIFY directly.
});
(Simpler alternative: skip pg-boss entirely for the outbox, use a tiny worker that LISTEN outbox_new + polls every 5s as fallback. Pure Postgres, no extra dep. Pick this if pg-boss adds friction.)
BullMQ patterns
Scheduler (cron-style)
// apps/workers/src/schedulers.ts
import { Queue, Worker } from "bullmq";
import { connection } from "./redis";
const billingQ = new Queue("billing", { connection });
// Monthly billing: 10th of each month at 09:00 Asia/Ulaanbaatar
await billingQ.upsertJobScheduler("monthly-billing", {
pattern: "0 9 10 * *",
tz: "Asia/Ulaanbaatar",
}, { name: "run-monthly-bills" });
new Worker("billing", async (job) => {
if (job.name === "run-monthly-bills") {
await generateBillsForAllActiveContracts();
}
}, { connection, concurrency: 4 });
Priority + retries
await queue.add("send-receipt", { saleId }, {
priority: 1, // 1 = highest
attempts: 5,
backoff: { type: "exponential", delay: 5_000 }, // 5s, 10s, 20s, 40s, 80s
});
Flow Producer (parent + children)
import { FlowProducer } from "bullmq";
const flow = new FlowProducer({ connection });
await flow.add({
name: "generate-package", queueName: "pdf",
data: { packageId },
children: invoices.map(inv => ({
name: "render-invoice", queueName: "pdf", data: { invoiceId: inv.id }
})),
});
// Parent only runs after all children complete
Dashboard
// apps/api/src/lib/queue-dashboard.ts
import { Hono } from "hono";
import { createBullBoard } from "@bull-board/api";
import { BullMQAdapter } from "@bull-board/api/bullMQAdapter";
import { HonoAdapter } from "@bull-board/hono";
const adapter = new HonoAdapter();
createBullBoard({
queues: [
new BullMQAdapter(billingQ),
new BullMQAdapter(pdfQ),
/* ... */
],
serverAdapter: adapter,
});
adapter.setBasePath("/admin/queues");
// Mount behind admin auth
app.route("/admin/queues", adapter.registerPlugin());
Replacing v1's external HTTP cron
v1 has cron-job.org pinging /api/Automation/* endpoints. v2 moves cron in-process:
- BullMQ JobScheduler in
apps/workersemits all schedules. - Single external probe:
cron-job.orghitsGET /healthz/schedulerevery 15 min; alerts if the scheduler stopped firing (last-run timestamp older than expected gap).
Process layout
apps/workers/
├── src/
│ ├── index.ts # boot order: redis → pg-boss → workers → schedulers
│ ├── outbox.ts # pg-boss / LISTEN-NOTIFY drain
│ ├── billing.ts # monthly bill run
│ ├── statements.ts # bank statement import + reconciliation
│ ├── pdf.ts # render queue (react-pdf)
│ ├── sms.ts # 131344 dispatch
│ ├── push.ts # firebase-admin dispatch
│ ├── ebarimt.ts # eBarimt 3.0 push
│ └── schedulers.ts # upserts all JobSchedulers on boot
├── Dockerfile
└── package.json
Deploy as a separate container next to the API. Scale horizontally — multiple worker instances are safe (BullMQ + pg-boss handle distribution).
Open questions
- pg-boss vs hand-rolled LISTEN/NOTIFY for the outbox? pg-boss adds a dep but gives you retry/backoff/archive for free. Hand-rolled is ~80 lines but means you own the retry logic. Default: pg-boss; switch to hand-roll if it adds friction.
- One worker process or per-queue process? One process with multiple Workers is simpler. Split only if a queue starves others (e.g. PDF render saturates CPU). Use
concurrency+ queue priorities first. - Dashboard auth: behind admin RLS-checked middleware, or basic-auth env? Recommend admin middleware (reuses auth).