Lokesh Kumawat.

May 28, 2026

Message queues without the foot-guns

Messaging · Distributed Systems

Adding a message queue feels like an unambiguous win: producers stop waiting on slow consumers, traffic spikes get absorbed, and services stop knowing about each other. All true. But queues quietly hand you a new set of failure modes, and they tend to show up in production rather than in the demo.

At-least-once means "expect repeats"

Most brokers (Azure Service Bus, SQS, RabbitMQ), in the usual setup where the consumer acknowledges a message after processing it, guarantee that a message is delivered at least once, not exactly once. A consumer can process a message, crash before acknowledging it, and the broker will redeliver it.

The implication is simple and non-negotiable: your consumers must be idempotent. Processing the same message twice should have the same effect as processing it once. (If you read my note on idempotent APIs, the same idempotency-key thinking applies here: dedupe on a message id.)
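Here is a minimal sketch of deduping on a message id. The class name, the `handle` method, and the in-memory set are all assumptions made for illustration; a real consumer would keep the processed ids in a durable store shared by every consumer instance.

```python
from typing import Callable


class IdempotentConsumer:
    """Runs a handler at most once per message id (hypothetical helper)."""

    def __init__(self, handler: Callable[[dict], None]) -> None:
        self._handler = handler
        # Assumption: an in-memory set stands in for a durable dedupe store.
        self._seen: set[str] = set()

    def handle(self, message_id: str, body: dict) -> bool:
        """Return True if the handler ran, False if the message was a repeat."""
        if message_id in self._seen:
            return False
        self._handler(body)
        # Record the id only after the handler succeeds, so a failure
        # inside the handler still leads to a retry on redelivery.
        self._seen.add(message_id)
        return True


balances = {"42": 0}


def apply_deposit(body: dict) -> None:
    balances[body["account"]] += body["amount"]


consumer = IdempotentConsumer(apply_deposit)
first = consumer.handle("msg-1", {"account": "42", "amount": 100})
second = consumer.handle("msg-1", {"account": "42", "amount": 100})  # redelivery
```

The second call is skipped, so the balance is credited once. In a real system the side effect and the "seen" record should be committed together (for example in one database transaction), otherwise a crash between the two reopens the window for a duplicate.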

Poison messages need an exit

One malformed message that always throws will be redelivered forever, blocking the queue and burning your error budget. Every consumer needs a dead-letter path:

  • Cap the delivery attempts (e.g. 5).
  • After the cap, move the message to a dead-letter queue instead of nacking it again.
  • Alert on the dead-letter queue, not on individual failures.

This turns "the consumer is in a crash loop" into "there are 3 messages to look at when I'm awake."
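The dead-letter path above can be sketched with an in-memory queue. Everything here is an assumption for illustration: the `drain` function, the message shape with an `attempts` counter, and the plain list standing in for a dead-letter queue. Many brokers can enforce the cap through configuration, in which case you would not write this loop yourself.

```python
from collections import deque
from typing import Callable

MAX_ATTEMPTS = 5  # the delivery cap from the list above


def drain(queue: deque, dead_letters: list, handler: Callable[[dict], None]) -> None:
    """Process every message, dead-lettering any that fail MAX_ATTEMPTS times."""
    while queue:
        message = queue.popleft()
        message["attempts"] += 1
        try:
            handler(message["body"])
        except Exception as exc:
            if message["attempts"] >= MAX_ATTEMPTS:
                # Give up: park the message with its last error for inspection.
                dead_letters.append({**message, "error": repr(exc)})
            else:
                queue.append(message)  # simulate the broker redelivering it


def handler(body: dict) -> None:
    if "amount" not in body:
        raise ValueError("missing amount")


queue = deque([
    {"id": "a", "body": {"amount": 10}, "attempts": 0},
    {"id": "b", "body": {}, "attempts": 0},  # poison: always fails
])
dead_letters: list = []
drain(queue, dead_letters, handler)
```

After the run the main queue is empty and the poison message sits in `dead_letters` with its error attached. The alert would watch the size of that list rather than each individual exception.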

Order is a privilege, not a guarantee

Parallel consumers can process messages out of order, because a later message may finish before an earlier one. If order matters (say, updates to one account), you need a partition key so related messages land on the same consumer and are handled one at a time:

session/partition key = accountId
→ all events for account 42 stay in order,
  events for different accounts still run in parallel

If order does not matter, don't pay for it: ordering serializes the work for each key, so you will throttle your own throughput for nothing.
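To make the partition-key idea concrete, here is a small sketch that routes events to a fixed number of lanes by hashing the account id. The `partition_for` function and the lane count are assumptions for illustration; brokers that support sessions or partitions do this routing for you once you set the key.

```python
import hashlib


def partition_for(key: str, partitions: int) -> int:
    """Map a key to a stable partition index (hypothetical helper)."""
    # A cryptographic digest is stable across processes, unlike Python's
    # built-in hash(), which is randomized per process for strings.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


events = [
    ("42", "opened"),
    ("7", "opened"),
    ("42", "deposit"),
    ("7", "closed"),
    ("42", "withdraw"),
]

# Each lane stands in for one consumer that handles its messages in order.
lanes: dict[int, list] = {i: [] for i in range(4)}
for account_id, event in events:
    lanes[partition_for(account_id, 4)].append((account_id, event))

account_42_events = [
    event for account, event in lanes[partition_for("42", 4)] if account == "42"
]
```

Every event for account 42 lands in the same lane in its original order, while other accounts are free to land elsewhere. Two accounts can still share a lane, which is fine: the guarantee is per key, not one lane per key.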

Make retries back off

A consumer that retries instantly hammers whatever just failed. Exponential backoff with jitter spreads the load and gives the downstream a chance to recover:

wait = min(cap, base * 2^attempt) + random_jitter
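The formula translates almost directly into Python. The function name, the default values (0.5 s base, 30 s cap, up to 1 s of jitter), and the optional `rng` parameter are assumptions chosen for illustration, not recommendations for any particular broker.

```python
import random


def backoff_delay(
    attempt: int,
    base: float = 0.5,
    cap: float = 30.0,
    jitter: float = 1.0,
    rng=None,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    rng = rng or random
    # Exponential growth, clamped so the wait never exceeds the cap.
    exponential = min(cap, base * (2 ** attempt))
    # Random jitter keeps many consumers from retrying in lockstep.
    return exponential + rng.uniform(0, jitter)


# With jitter switched off, the exponential part is easy to see.
delays_without_jitter = [backoff_delay(a, jitter=0.0) for a in range(4)]
# → [0.5, 1.0, 2.0, 4.0]
```

Passing a seeded `random.Random` instance as `rng` makes the delays reproducible in tests, while production code can rely on the default module-level generator.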

A short checklist

  • Consumers are idempotent (dedupe on message id).
  • Dead-letter queue configured, with an alert.
  • Backoff + jitter on retries.
  • Partition key chosen deliberately — for ordering, or not at all.

Queues are still a win. They just reward the teams that treat redelivery, poison messages, and ordering as design inputs rather than 2am surprises.