Master Queues

Learning and learning

Queues are a fundamental concept in system design. This article will explain their significance.

Suppose new users are trying to register on your platform. The pseudocode for the basic registration flow is:

  • Insert user in DB

  • Send a Welcome Email

  • Generate Token

  • Send Response

This is called a sync process because all the steps happen synchronously, one after another: first we insert the user in the DB, then send the Welcome Email, then generate the token, and only then send the response to the user.
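The four steps above can be sketched in Python as follows. This is only an illustration: `users_db`, `sent_emails`, `send_welcome_email`, and `register_sync` are hypothetical names, and the in-memory dict and list stand in for a real database and a real email provider.

```python
import secrets

users_db = {}      # stand-in for a real database table
sent_emails = []   # stand-in for the external email service


def send_welcome_email(email):
    # In a real system this is a slow network call to an email provider.
    sent_emails.append(email)


def register_sync(email):
    users_db[email] = {"email": email}        # 1. insert user in DB
    send_welcome_email(email)                 # 2. blocks until the email is sent
    token = secrets.token_hex(16)             # 3. generate token
    return {"status": "ok", "token": token}   # 4. send response


response = register_sync("a@example.com")
```

The point to notice is step 2: the response cannot be returned until the email call finishes, so a slow or failing email service slows or fails the whole registration.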

This works fine for a small load, such as 10 users registering per minute.

But if 10,000 users try to register in the same minute, the application can slow down badly or start failing, because every request waits on the email step, which depends on an external service. The email worker hands the data to an external mail server (Gmail in this example) to send the Welcome Email, and that is the bottleneck: providers rate-limit senders. If the limit is, say, around 20 emails per minute, you will hit it almost immediately.

To prevent this, we can use an async process.

In an async process, we use a queue to store the email data and hand it to the email workers one by one.

The pseudocode for this process:

  • Insert user in DB

  • MessageQueue.enqueue(Send Welcome Email)

  • Generate Token

  • Send Response

Here we do not send the email while handling the registration request. We just enqueue the email data and tell the email workers to process only 20 emails per minute and send them to the Gmail servers.

That's why, when you register on a new website, the Welcome Email often arrives a few seconds or a minute later: the email workers are still processing the emails queued before yours.
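A minimal sketch of the async flow, assuming an in-memory `deque` as a stand-in for a real message queue. The names `register_async` and `run_email_worker_once`, and the idea of calling the worker once per minute, are assumptions made for illustration.

```python
from collections import deque
import secrets

email_queue = deque()   # stand-in for a real message queue
users_db = {}
sent_emails = []


def register_async(email):
    users_db[email] = {"email": email}
    # Enqueue only; the request does not wait for the email to be sent.
    email_queue.append({"type": "welcome_email", "to": email})
    return {"status": "ok", "token": secrets.token_hex(16)}


def run_email_worker_once(limit_per_minute=20):
    """Send at most `limit_per_minute` queued emails; meant to run once per minute."""
    sent = 0
    while email_queue and sent < limit_per_minute:
        job = email_queue.popleft()
        sent_emails.append(job["to"])   # the real call to the email provider goes here
        sent += 1
    return sent


for i in range(50):
    register_async(f"user{i}@example.com")

first_batch = run_email_worker_once()
```

All 50 registrations return immediately, while the worker sends only the first 20 emails in this run and leaves the remaining 30 in the queue for later runs.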

This whole design is called HLD (High-Level Design). We have discussed how queues are used to handle emails, but we still don't know how these queues work internally.

How do queues work internally?

In queue operations, we have a producer, which enqueues data into the queue, and a consumer, which takes data from the queue.
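The producer and consumer roles can be illustrated with Python's standard `queue.Queue` and two threads. This is a single-process sketch, not a distributed queue; the `None` sentinel used to stop the consumer is a convention chosen for this example.

```python
import queue
import threading

q = queue.Queue()
received = []


def producer():
    for i in range(5):
        q.put(f"message-{i}")   # enqueue
    q.put(None)                 # sentinel: no more messages


def consumer():
    while True:
        item = q.get()          # blocks until a message is available
        if item is None:
            break
        received.append(item)


threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With a single consumer, the messages arrive in the same order the producer enqueued them.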

To get data from the queue, there are two methods:

  • Push Mechanism

  • Pull Mechanism

Push Based Mechanism

In the push mechanism, the queue automatically delivers the data to consumers.

For the push mechanism, we can use RabbitMQ, which provides push-based delivery for your platform.

To use RabbitMQ, the producer has to tell RabbitMQ that it will publish data to a queue.

For the workers to get data from the queue, they first need to register themselves with RabbitMQ.

RabbitMQ then keeps track of which workers are registered, and when data arrives, the queue pushes it to the registered workers.

There is also a "Heartbeat" mechanism: every registered worker sends heartbeats to RabbitMQ, and if a worker fails to send one for longer than a certain time (like 1 min), RabbitMQ declares that worker dead and stops sending data to it.
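To make registration, pushing, and heartbeats concrete, here is a toy in-memory broker. It is not RabbitMQ's API: the `PushBroker` class and its methods are hypothetical, time is passed in explicitly as `now` so the behaviour is easy to follow, and the round-robin choice among live workers is a simplification.

```python
import itertools


class PushBroker:
    def __init__(self, heartbeat_timeout=60.0):
        self.heartbeat_timeout = heartbeat_timeout
        self.workers = {}          # worker name -> callback
        self.last_heartbeat = {}   # worker name -> time of last heartbeat
        self._counter = itertools.count()

    def register(self, name, callback, now):
        self.workers[name] = callback
        self.last_heartbeat[name] = now

    def heartbeat(self, name, now):
        self.last_heartbeat[name] = now

    def _alive(self, now):
        # A worker is alive if its last heartbeat is recent enough.
        return [n for n in self.workers
                if now - self.last_heartbeat[n] <= self.heartbeat_timeout]

    def publish(self, message, now):
        alive = self._alive(now)
        if not alive:
            raise RuntimeError("no live workers")
        name = alive[next(self._counter) % len(alive)]   # round-robin
        self.workers[name](message)                      # push to the worker
        return name


got = {"a": [], "b": []}
broker = PushBroker(heartbeat_timeout=60)
broker.register("a", got["a"].append, now=0)
broker.register("b", got["b"].append, now=0)
broker.publish("m1", now=10)
broker.publish("m2", now=20)
broker.heartbeat("a", now=50)    # only worker "a" keeps sending heartbeats
broker.publish("m3", now=100)    # "b" has been silent for 100s, so it is skipped
```

A real broker would buffer messages when no consumer is available instead of raising an error; the sketch raises only to keep the example short.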

Benefits of using RabbitMQ

  • Decoupling of Services: Producers and Consumers don't need to know each other or be running at the same time. Your auth service can publish a "user registered" event and move on; the email service will consume it independently.

  • Load Leveling / Traffic Spikes: RabbitMQ acts as a buffer. If 10,000 users register at once, messages queue up and consumers process them at a manageable pace — instead of overwhelming your downstream services.

  • Reliability and Message Durability: Messages can be persisted to disk so they survive broker restarts. Combined with acknowledgements (acks), you get at-least-once delivery: a message is only removed from the queue after the consumer confirms it was processed successfully, so a message may be redelivered but is not silently lost.

    • Durable queues — survive broker restart

    • Persistent messages — written to disk

    • Acknowledgements — no message lost if consumer crashes mid-processing

  • Scalability: Easily scale consumers horizontally. Add more worker instances to process messages faster — RabbitMQ distributes messages across them automatically via round-robin dispatching.

  • Multiple Messaging patterns: RabbitMQ supports several patterns in one tool:

    • Work Queues — distribute tasks among workers

    • Pub/Sub — broadcast events to multiple subscribers

    • RPC (Request/Reply) — synchronous-style calls over async messaging

    • Dead Letter Queues — handle failed messages gracefully

  • Dead Letter Queues: Messages that fail repeatedly don't just disappear — they get routed to a DLQ where you can inspect, debug, and replay them. Critical for production reliability.
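The acknowledgement and dead-letter ideas from the list above can be sketched with a small in-memory queue. `AckQueue` and `max_attempts` are hypothetical names for this illustration; in RabbitMQ itself, dead-lettering and retry limits are configured on the broker rather than written like this.

```python
from collections import deque


class AckQueue:
    def __init__(self, max_attempts=3):
        self.ready = deque()
        self.dead_letters = []
        self.max_attempts = max_attempts

    def publish(self, body):
        self.ready.append({"body": body, "attempts": 0})

    def deliver(self, handler):
        """Deliver one message; it is dropped from the queue only if handler succeeds."""
        if not self.ready:
            return False
        msg = self.ready.popleft()
        msg["attempts"] += 1
        try:
            handler(msg["body"])                  # returning normally acts as the ack
        except Exception:
            if msg["attempts"] >= self.max_attempts:
                self.dead_letters.append(msg)     # park it for inspection and replay
            else:
                self.ready.append(msg)            # requeue for another attempt
        return True


processed = []


def handler(body):
    if body == "bad":
        raise ValueError("cannot process")
    processed.append(body)


q = AckQueue(max_attempts=3)
q.publish("good")
q.publish("bad")
while q.deliver(handler):
    pass
```

After the loop, the "good" message has been processed once, and the "bad" message sits in `dead_letters` after three failed attempts instead of being lost or retried forever.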

Pull Based Mechanism

In the pull-based mechanism, a worker is responsible for pulling the message data from the message queue.

The main advantage of the pull-based mechanism:

  • You have full control

Cons:

  • You have full control

You decide how the queue and the workers are managed when pulling message data. But you also have to build all of it yourself: for example, you have to write your own code to prevent duplicate processing and to do rate limiting.

To prevent duplicate processing, we can use Redis. When a worker takes a message, say data_1, it first acquires a lock for it in Redis, which tells the other workers that data_1 is taken. When another worker tries to take the same message, it checks Redis first to see whether the message is already taken. If it is, the worker moves on to a message that has not been taken yet.
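A sketch of this locking idea follows. To stay runnable without a Redis server, it uses `FakeRedis`, a tiny in-memory stand-in written for this example that mimics only the `set(key, value, nx=True, ex=seconds)` call (set the key only if it does not exist, with an expiry), which the redis-py client also accepts. The helper names `try_claim` and `pick_message`, the `lock:` key prefix, and the 300-second TTL are assumptions.

```python
import time


class FakeRedis:
    """Tiny in-memory stand-in that supports only set(..., nx=, ex=)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> (value, expires_at or None)
        self._clock = clock

    def set(self, key, value, nx=False, ex=None):
        now = self._clock()
        current = self._data.get(key)
        if current is not None and current[1] is not None and current[1] <= now:
            current = None                # the old lock has expired
        if nx and current is not None:
            return None                   # another worker already holds the key
        expires_at = now + ex if ex is not None else None
        self._data[key] = (value, expires_at)
        return True


def try_claim(redis_client, message_id, worker_id, ttl_seconds=300):
    # Check and lock happen in one call, so two workers cannot both succeed.
    return bool(redis_client.set(f"lock:{message_id}", worker_id,
                                 nx=True, ex=ttl_seconds))


def pick_message(redis_client, message_ids, worker_id):
    """Return the first message this worker manages to claim, or None."""
    for message_id in message_ids:
        if try_claim(redis_client, message_id, worker_id):
            return message_id
    return None


r = FakeRedis()
pending = ["data_1", "data_2"]
first = pick_message(r, pending, "worker-A")
second = pick_message(r, pending, "worker-B")
third = pick_message(r, pending, "worker-C")
```

The expiry matters: if a worker crashes while holding a lock, the lock eventually disappears and another worker can claim the message.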

If a worker fails to process the email, or if Gmail or another external email server fails to send it, you have to write code to handle that message: either enqueue it back onto the queue, or put it in a separate queue that stores all such failed emails and, when the main queue is free, move them back to the main queue.
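The second option, a separate queue for failed emails, might look like the sketch below. The queues are plain in-memory deques, and `send_email` is a hypothetical function that fails for one address to simulate a provider error.

```python
from collections import deque

main_queue = deque(["a@example.com", "fail@example.com", "b@example.com"])
retry_queue = deque()
sent = []


def send_email(address):
    if address.startswith("fail"):
        raise ConnectionError("provider rejected the request")
    sent.append(address)


def drain_main_queue():
    while main_queue:
        address = main_queue.popleft()
        try:
            send_email(address)
        except ConnectionError:
            retry_queue.append(address)   # park the failed message


def requeue_retries():
    """Move parked messages back, but only once the main queue is empty."""
    if main_queue:
        return 0
    moved = len(retry_queue)
    main_queue.extend(retry_queue)
    retry_queue.clear()
    return moved


drain_main_queue()
moved = requeue_retries()
```

In practice you would also count attempts per message, so that an email that can never be sent does not cycle between the two queues forever.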

To pull messages from the queue, you have to poll: ask the queue again and again whether it has data. This creates another problem, because with a managed queue service you typically pay for every API call. To save money, you have to implement a backoff mechanism.

What is Backoff mechanism?

A backoff mechanism is a strategy where a system waits for an increasing amount of time before retrying a failed operation — instead of retrying immediately and repeatedly, which can make things worse.

"Something failed — wait a bit, then try again. If it fails again — wait longer."

Applied to polling: if a worker has been asking the queue for data for 20 minutes and nothing has arrived, it slows down and asks only every 5 minutes. If over the next 20-30 minutes it still gets only 1-2 messages, it slows down further and asks every 10 minutes.
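A minimal sketch of backoff applied to polling. The doubling factor, the 600-second cap, and the reset to the base interval as soon as a message arrives are assumptions for this example, not rules from any particular queue service. The `receive` and `sleep` functions are passed in so the logic can be run without a real queue or real waiting.

```python
def next_poll_interval(current, got_messages, base=1.0, factor=2.0, cap=600.0):
    if got_messages:
        return base                       # queue is active: poll quickly again
    return min(current * factor, cap)     # queue is idle: wait longer, up to cap


def poll(receive, sleep, max_polls, base=1.0, factor=2.0, cap=600.0):
    interval = base
    waits = []      # the wait chosen after each poll, in seconds
    handled = []
    for _ in range(max_polls):
        messages = receive()
        handled.extend(messages)
        interval = next_poll_interval(interval, bool(messages), base, factor, cap)
        waits.append(interval)
        sleep(interval)
    return handled, waits


# Simulated queue: empty three times, then one message, then empty again.
responses = iter([[], [], [], ["m1"], []])
handled, waits = poll(lambda: next(responses), sleep=lambda s: None, max_polls=5)
```

Each empty poll doubles the wait, so an idle queue costs fewer and fewer API calls, while a single received message brings the worker back to fast polling.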

For Pull based mechanism, we can use

  • BullMQ

  • SQS

This is all I know currently, and I hope you now understand why queues are one of the most important concepts in system design. Thanks for reading.