Observability
Logs + errors + traces, all correlated by request-id. Free tiers cover us; upgrade only when scale forces it.
- For logs, switch from Grafana Loki to Axiom — Axiom free tier is 500 GB ingest/mo with 30-day retention vs Loki's 50 GB. Loki has cardinality footguns at scale.
- Keep Sentry + OTel; route logs to Axiom, traces/metrics to Grafana Cloud Tempo/Prometheus (still free tier).
- OTel Node SDK overhead with BatchSpanProcessor: ~0.5-2ms/req, +2-5% CPU, +10-30MB RAM. Use Batch, not Simple.
Recommended stack
| Concern | Pick | Cost (early) |
|---|---|---|
| Logging | pino v9 → JSON stdout → Axiom | Free (500 GB/mo) |
| Errors | Sentry (@sentry/nextjs + @sentry/hono) | Free (5k errors/mo) |
| Tracing | OpenTelemetry + @hono/otel → Grafana Cloud Tempo | Free (50 GB traces) |
| Metrics | OTel auto-instrumentations + Prometheus → Grafana Cloud | Free (10k series) |
| Uptime | BetterStack or Cronitor | Free |
| Feature flags + product analytics | PostHog Cloud | Free (<1M events/mo) |
Logging — pino
pnpm --filter @spacehub/api add pino pino-pretty pino-http
pnpm --filter @spacehub/workers add pino
// apps/api/src/lib/logger.ts
import pino from "pino";
export const logger = pino({
  level: process.env.LOG_LEVEL ?? "info",
  transport: process.env.NODE_ENV === "development"
    ? { target: "pino-pretty", options: { colorize: true } }
    : undefined, // JSON in prod
  base: { service: "api", env: process.env.NODE_ENV },
});
Per-request child logger with request-id:
// apps/api/src/middleware/request-logger.ts
import { createMiddleware } from "hono/factory";
import { AsyncLocalStorage } from "node:async_hooks";
import type { Logger } from "pino"; // type-only import; the original referenced pino.Logger without importing it
import { logger } from "../lib/logger";
export const reqLogStore = new AsyncLocalStorage<Logger>();
// Expects the request-id middleware to have set "requestId" earlier in the chain.
export const requestLogger = createMiddleware(async (c, next) => {
  const reqId = c.get("requestId");
  const child = logger.child({ reqId, method: c.req.method, path: c.req.path });
  const start = Date.now();
  // Everything awaited inside next() sees `child` via reqLogStore.getStore().
  await reqLogStore.run(child, next);
  child.info({ ms: Date.now() - start, status: c.res.status }, "request");
});
// anywhere in handler stack: const log = reqLogStore.getStore() ?? logger;
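To illustrate how the store is picked up further down the call stack, here is a minimal sketch that runs without Hono or pino. `MiniLogger`, `getLog`, `loadBill`, and `handleRequest` are hypothetical names used only for illustration; the stand-in logger returns the JSON line it would write so the behavior is easy to inspect.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Stand-in for pino.Logger so the sketch has no dependencies (assumption).
type MiniLogger = {
  bindings: Record<string, unknown>;
  info: (msg: string) => string;
};

const makeLogger = (bindings: Record<string, unknown>): MiniLogger => ({
  bindings,
  // Returns the JSON line instead of writing it to stdout.
  info: (msg) => JSON.stringify({ ...bindings, msg }),
});

const rootLogger = makeLogger({ service: "api" });
const store = new AsyncLocalStorage<MiniLogger>();

// Hypothetical helper: request-scoped logger inside a request, root logger otherwise.
const getLog = (): MiniLogger => store.getStore() ?? rootLogger;

// Hypothetical service function, one await away from the middleware.
async function loadBill(id: string): Promise<string> {
  await new Promise((resolve) => setTimeout(resolve, 1)); // context survives the await
  return getLog().info(`loaded bill ${id}`);
}

// Simulates what the middleware does for a single request.
function handleRequest(reqId: string, billId: string): Promise<string> {
  return store.run(makeLogger({ service: "api", reqId }), () => loadBill(billId));
}
```

Concurrent calls to `handleRequest` each keep their own `reqId`, and code running outside any request falls back to the root logger, which is the property the middleware above relies on.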
Error tracking — Sentry
@sentry/nextjs v9+ supports Next 15 App Router + Turbopack + onRequestError. @sentry/hono beta in 2026 replaces deprecated community @hono/sentry.
pnpm --filter @spacehub/api add @sentry/node @sentry/hono
pnpm --filter @spacehub/web add @sentry/nextjs
// apps/api/src/lib/sentry.ts
import * as Sentry from "@sentry/node";
import { sentry } from "@sentry/hono";
Sentry.init({ dsn: process.env.SENTRY_DSN, tracesSampleRate: 0.1, profilesSampleRate: 0.1 });
app.use("*", sentry({ dsn: process.env.SENTRY_DSN! }));
Self-host fallback: GlitchTip (Sentry-protocol compatible) on Coolify if you want everything on one VPS.
Tracing — OpenTelemetry
pnpm --filter @spacehub/api add @opentelemetry/api @opentelemetry/sdk-node @opentelemetry/exporter-trace-otlp-http @opentelemetry/auto-instrumentations-node @hono/otel
// apps/api/src/lib/telemetry.ts (must import BEFORE Hono)
import { NodeSDK } from "@opentelemetry/sdk-node";
import { getNodeAutoInstrumentations } from "@opentelemetry/auto-instrumentations-node";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http";
new NodeSDK({
  // No explicit url: the exporter reads OTEL_EXPORTER_OTLP_ENDPOINT itself and appends /v1/traces.
  // A url passed in code is used verbatim, so a base endpoint there would miss the signal path.
  traceExporter: new OTLPTraceExporter(),
  instrumentations: [getNodeAutoInstrumentations({
    "@opentelemetry/instrumentation-fs": { enabled: false }, // noisy
  })],
}).start();
// In Hono:
import { otel } from "@hono/otel";
app.use("*", otel());
Auto-instrumentations cover: http, pg (Drizzle queries via the driver), ioredis, undici (fetch). Drizzle has no first-party OTel integration — wrap db.execute in a thin span helper for query-level naming if needed.
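A thin span helper of the kind mentioned above could look like the sketch below. It is written against a small structural interface that mirrors the subset of the `@opentelemetry/api` `Tracer` and `Span` methods it uses, so it runs standalone; that a real tracer from `trace.getTracer(...)` satisfies this shape is an assumption, and `withDbSpan`, `TracerLike`, and `SpanLike` are hypothetical names.

```typescript
// Minimal structural types mirroring the subset of @opentelemetry/api used here
// (assumption: a real Tracer from trace.getTracer(...) satisfies this shape).
interface SpanLike {
  setAttribute(key: string, value: string | number | boolean): unknown;
  recordException(error: Error): unknown;
  setStatus(status: { code: number; message?: string }): unknown;
  end(): void;
}
interface TracerLike {
  startActiveSpan<T>(name: string, fn: (span: SpanLike) => T): T;
}

const STATUS_ERROR = 2; // numeric value of SpanStatusCode.ERROR in @opentelemetry/api

// Hypothetical helper: runs `query` inside a named span and always ends the span.
async function withDbSpan<T>(
  tracer: TracerLike,
  name: string,
  query: () => Promise<T>,
): Promise<T> {
  return tracer.startActiveSpan(name, async (span) => {
    span.setAttribute("db.system", "postgresql");
    try {
      return await query();
    } catch (err) {
      // Record the failure on the span, then rethrow so callers still see it.
      const error = err instanceof Error ? err : new Error(String(err));
      span.recordException(error);
      span.setStatus({ code: STATUS_ERROR, message: error.message });
      throw err;
    } finally {
      span.end();
    }
  });
}
```

In the API this might be called as `withDbSpan(tracer, "bills.listByOwner", () => db.execute(query))`, where the span name and the query are placeholders. The driver-level `pg` span would still appear as a child, with the wrapper supplying the readable name.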
Metrics
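OTel metrics are exported over OTLP, which Grafana Cloud accepts natively. The NodeSDK setup in the tracing section configures only a trace exporter, so metrics will typically also need a metric reader (for example a PeriodicExportingMetricReader with an OTLP metric exporter) before anything is sent. Custom metrics via @opentelemetry/api: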
import { metrics } from "@opentelemetry/api";
const meter = metrics.getMeter("spacehub");
const billsGenerated = meter.createCounter("bills_generated_total");
// Each distinct ownerId/period pair becomes its own series; watch the 10k-series free cap.
billsGenerated.add(1, { ownerId, period });
Uptime + heartbeats
- BetterStack monitors /v2/health from multiple regions (incl. Tokyo, for relevance to Mongolia).
- Scheduler heartbeat: BullMQ JobScheduler emits a ping every 5 min; /healthz/scheduler returns the last-emit timestamp. BetterStack alerts if the gap exceeds 10 min.
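The staleness check behind the scheduler endpoint can be sketched as a pure function. The response shape (`SchedulerHealth`) and the function name are assumptions for illustration, not taken from the scaffold; only the 10-minute threshold comes from the text above.

```typescript
// Hypothetical shape of the /healthz/scheduler payload (assumption).
interface SchedulerHealth {
  ok: boolean;
  lastEmitAt: string | null; // ISO timestamp of the last heartbeat, null if none yet
  gapMs: number | null; // time since the last heartbeat, null if none yet
}

const MAX_GAP_MS = 10 * 60 * 1000; // alert threshold from the text: 10 min

// Pure function so it can be unit-tested without BullMQ or a running server.
function schedulerHealth(
  lastEmitMs: number | null,
  nowMs: number,
  maxGapMs: number = MAX_GAP_MS,
): SchedulerHealth {
  if (lastEmitMs === null) {
    return { ok: false, lastEmitAt: null, gapMs: null };
  }
  const gapMs = nowMs - lastEmitMs;
  return {
    ok: gapMs <= maxGapMs,
    lastEmitAt: new Date(lastEmitMs).toISOString(),
    gapMs,
  };
}
```

One option is for the route handler to return HTTP 503 when `ok` is false, so an uptime monitor can alert on the status code alone rather than parsing the timestamp.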
Feature flags + analytics — PostHog
One tool for flags + product analytics + (optional) session replay. Self-host on Coolify if event volume crosses 1M/mo free cap.
pnpm --filter @spacehub/web add posthog-js
pnpm --filter @spacehub/api add posthog-node
Build steps
- Phase 0: pino + request-id middleware (already in scaffold).
- Sentry init in API + web (Phase 1).
- OTel SDK + auto-instrumentations + Grafana Cloud free tier (Phase 1).
- BetterStack health monitors + scheduler heartbeat (Phase 1).
- PostHog (Phase 5 or whenever first owner-facing feature flag is needed).
Open questions
- Trace sample rate: 10% default (cost-friendly). Bump to 100% for the first month to baseline.
- PII scrubbing in logs: redact email, phone, TIN by default. Use pino's redact option.
- Log retention: 30 days on the Axiom free tier is enough for ops; longer = paid.
- Audit log (separate from operational logs): durable Postgres table for compliance — auth events, money mutations, role changes. Recommended yes.
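For the PII-scrubbing question, a possible shape for pino's `redact` option is sketched below. The field names (`email`, `phone`, `tin`) and the censor string are assumptions about the log payload; the object is built standalone so it can be checked before being passed to `pino({ ..., redact: redactOptions })`.

```typescript
// Field names are assumptions; adjust to the actual log payload shape.
const PII_KEYS = ["email", "phone", "tin"] as const;

// Cover top-level keys and keys one level down (for example user.email).
// To my understanding a wildcard segment matches a single level, so deeper
// nesting would need explicit paths such as "req.body.user.email".
const redactOptions = {
  paths: PII_KEYS.flatMap((key) => [key, `*.${key}`]),
  censor: "[REDACTED]",
};

// Passed to pino as: pino({ level, base, redact: redactOptions })
```

Redaction by path only catches structured fields; PII interpolated into the message string itself is not covered, so handlers should log identifiers as fields rather than inside the text.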