2025-12-09 · 6 min read

A deliberately small job queue

A database table can be enough when the workload is modest and ownership is clear.

A side project needed delayed jobs, retries, and at-least-once delivery. It did not need another service. PostgreSQL was already the durable center of the application, so the queue became a table.

Claim work atomically

-- Pick the oldest due job; SKIP LOCKED passes over rows another worker has locked.
select id from jobs
where run_at <= now() and state = 'ready'
order by run_at, id
limit 1
for update skip locked;

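The worker claims and marks a row in one short transaction, commits, and only then performs the slow work. Because SKIP LOCKED makes concurrent workers pass over rows that another transaction has locked, two workers never claim the same job and none waits on another. The row lock ends at commit, so a lease timestamp set during the claim records ownership: if a worker crashes, the lease expires and another worker can recover the abandoned job.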
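A minimal sketch of how the claim and the lease could fit into one statement, assuming PostgreSQL 9.6 or later. The leased_until column, the 'running' state name, and the five-minute lease are assumptions made for illustration; the real schema is not shown in this post.

```sql
-- Assumed column: when the current claim expires (null if never claimed).
alter table jobs
    add column if not exists leased_until timestamptz;

-- Claim one job and mark it in a single statement. A job is claimable when it
-- is ready and due, or when it is marked running but its lease has expired
-- (the worker that held it presumably crashed).
with next_job as (
    select id
    from jobs
    where (state = 'ready' and run_at <= now())
       or (state = 'running' and leased_until < now())
    order by run_at, id
    limit 1
    for update skip locked
)
update jobs
set state        = 'running',
    leased_until = now() + interval '5 minutes'
from next_job
where jobs.id = next_job.id
returning jobs.id;
```

An empty result means nothing is due. The lease should comfortably exceed the longest expected job; if a job outlives its lease, a second worker may pick it up, which is the at-least-once behavior described above and the reason handlers need to be idempotent.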

Make retries visible

Each failure stores a compact error, increments an attempt counter, and schedules the next run with bounded exponential delay. Permanent failures move to a final state instead of disappearing into logs.
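The failure path described above might look like the following sketch. The attempts, last_error, and leased_until columns, the 'failed' state name, the limit of 10 attempts, the 30-second base delay, and the one-hour cap are all illustrative assumptions, as is a text-typed state column.

```sql
-- Assumed bookkeeping columns for retries.
alter table jobs
    add column if not exists attempts     integer not null default 0,
    add column if not exists last_error   text,
    add column if not exists leased_until timestamptz;

-- $1 = job id, $2 = error message. In SET expressions, "attempts" refers to
-- the value before this update.
prepare record_failure(bigint, text) as
update jobs
set attempts     = attempts + 1,
    last_error   = left($2, 500),          -- keep the stored error compact
    leased_until = null,
    state        = case when attempts + 1 >= 10 then 'failed' else 'ready' end,
    run_at       = case when attempts + 1 >= 10 then run_at
                        else now() + least(interval '1 hour',
                                           interval '30 seconds' * power(2, attempts))
                   end
where id = $1;

-- Example call with a hypothetical job id.
execute record_failure(42, 'connection timed out');
```

With these numbers the delay doubles from 30 seconds up to the one-hour cap, and the tenth failure moves the job to 'failed', where a simple query on state and last_error can find it.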

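This design is not a universal queue. High fan-out, strict ordering, or very high throughput deserves specialized infrastructure. For a few jobs per second, keeping queue state beside the business data means a job is enqueued in the same transaction as the change that triggered it, so both commit or neither does. That removes an entire class of dual-write problems.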
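As a sketch of what enqueueing beside the business transaction could look like: the orders table, the kind and payload columns, and the 'send_receipt' job name below are hypothetical and stand in for whatever the application actually stores.

```sql
-- Hypothetical business table and job columns, for illustration only.
create table if not exists orders (
    id     bigserial primary key,
    status text not null
);
alter table jobs
    add column if not exists kind    text,
    add column if not exists payload jsonb;

begin;

-- The business change and the job that follows from it share one transaction:
-- if the update matches no row, or the transaction rolls back, no job exists.
with paid as (
    update orders set status = 'paid' where id = 42 returning id
)
insert into jobs (kind, payload, run_at, state)
select 'send_receipt', jsonb_build_object('order_id', id), now(), 'ready'
from paid;

commit;
```

With a separate broker, the same step would need an outbox table or similar machinery to keep the database and the queue from disagreeing after a crash between the two writes.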