Before looking at how load balancers work, we need to understand why we need them in the first place.

Why do we need Load Balancers?

Suppose you have a platform where 10,000 users make requests every day, and you have only one server that receives all 10,000 requests, processes them, and sends the responses.

Because a single server is handling all 10,000 requests, it starts to slow down and responses take longer. To solve this, you decide to horizontally scale the server.

In horizontal scaling we spin up new servers to balance the load.

You add a second server.

Now your 10,000 requests are divided between the two servers. If the split is even, each server handles 5,000 requests, which makes responses much faster. But a new problem arises.

How will clients know which server to send a request to in order to get a faster response? The two servers have different IP addresses, and your clients don't know which one to use. The load balancer is introduced to solve this problem.

What is a Load Balancer?

A load balancer is a component that sits between clients and servers and forwards client requests to healthy servers, which prevents any single server from being overloaded with requests.

Formal Definition: A load balancer is a hardware device, software application, or cloud service that distributes incoming network traffic across multiple servers to ensure no single server becomes overloaded, thereby maximizing availability, scalability, and performance.

To use a load balancer, we first need to register with it the IP addresses of all the servers running on the backend. Now the question arises: how does the load balancer know which server is "healthy" and which is not?

Server Health Check

Load balancers usually have an internal mechanism that periodically checks whether each server is alive (e.g., by sending a heartbeat request such as HTTP GET /health).

If the server is healthy and alive, it sends a response and the load balancer knows that the server is healthy.
If a server does not respond within a certain threshold (for example, a timeout or a number of consecutive failed checks), the load balancer marks it as unhealthy and stops sending it requests.
When the server recovers, the load balancer can automatically reintroduce it into the rotation.
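
The health-check behavior described above can be sketched in a few lines of Python. This is only an illustration, not how any particular load balancer implements it: the HealthChecker class, the probe callback (standing in for a real HTTP GET /health request), and the failure_threshold value are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class HealthChecker:
    """Tracks backend health from periodic probe results (illustrative only)."""

    # Hypothetical callback: returns True if the server answered its health probe.
    probe: Callable[[str], bool]
    # Assumed threshold: consecutive failed probes before a server is removed.
    failure_threshold: int = 3
    failures: Dict[str, int] = field(default_factory=dict)

    def check(self, servers: List[str]) -> List[str]:
        """Probe every server once and return those still considered healthy."""
        healthy = []
        for server in servers:
            if self.probe(server):
                # A successful probe resets the counter, so a recovered
                # server rejoins the rotation automatically.
                self.failures[server] = 0
            else:
                self.failures[server] = self.failures.get(server, 0) + 1
            if self.failures[server] < self.failure_threshold:
                healthy.append(server)
        return healthy
```

In a real deployment, check would run on a timer, and the routing algorithm would only choose among the servers it returns.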

How do Load Balancers work?

Only the load balancer's IP address is exposed to the public, so it receives all the incoming requests.

To decide which server gets which request, the load balancer uses a routing algorithm.

Routing Algorithms / Scheduling Algorithms

1. Round Robin

Requests are distributed sequentially to each server in a loop.
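
A minimal Python sketch of round robin, assuming a fixed list of servers; the RoundRobinBalancer class and the server names are made up for the example.

```python
from itertools import cycle
from typing import Iterable


class RoundRobinBalancer:
    """Hands out servers one after another, wrapping around at the end."""

    def __init__(self, servers: Iterable[str]) -> None:
        # cycle() repeats the sequence endlessly: a, b, c, a, b, c, ...
        self._servers = cycle(list(servers))

    def next_server(self) -> str:
        return next(self._servers)


lb = RoundRobinBalancer(["server-a", "server-b", "server-c"])
order = [lb.next_server() for _ in range(5)]
# order is ["server-a", "server-b", "server-c", "server-a", "server-b"]
```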

2. Weighted Round Robin

Each server is assigned a weight (priority). Servers with higher weights receive proportionally more requests.
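
One simple way to sketch this in Python is to repeat each server in the rotation as many times as its weight. The function name and the weights below are assumptions for the example, and this naive expansion sends bursts to the heavier server; real load balancers typically interleave the picks more smoothly.

```python
from itertools import cycle
from typing import Dict, Iterator


def weighted_round_robin(weights: Dict[str, int]) -> Iterator[str]:
    """Yield servers forever, each appearing `weight` times per full cycle."""
    # Expand {"big": 3, "small": 1} into ["big", "big", "big", "small"].
    expanded = [server for server, weight in weights.items() for _ in range(weight)]
    return cycle(expanded)


picks = weighted_round_robin({"big": 3, "small": 1})
first_eight = [next(picks) for _ in range(8)]
# Over two full cycles, "big" is chosen 6 times and "small" 2 times.
```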

3. Least Connections

The request goes to the server with the fewest active connections.
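
A minimal sketch of least connections in Python, assuming the load balancer keeps a counter of open connections per server; the class and method names are hypothetical.

```python
from typing import Dict, Iterable


class LeastConnectionsBalancer:
    """Picks the server that currently has the fewest active connections."""

    def __init__(self, servers: Iterable[str]) -> None:
        self.active: Dict[str, int] = {server: 0 for server in servers}

    def acquire(self) -> str:
        # min() returns the first server with the lowest count on a tie.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server: str) -> None:
        # Call this when the connection to `server` is closed.
        self.active[server] -= 1
```

Unlike round robin, this accounts for requests that stay open for different lengths of time: a server stuck with slow requests stops receiving new ones until it catches up.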

4. IP Hash

The load balancer uses a hash of the client's IP to always route them to the same server (useful for sticky sessions).

For example, when client-1 sends a request to register, the load balancer hashes its IP address and forwards the request to server 1; client-2's request hashes to server 2. When client-1 later sends a login request, the load balancer again routes it to server 1, as long as the client's IP address and the set of servers stay the same.
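
A minimal Python sketch of IP hashing. The pick_server function and the addresses are made up for the example; SHA-256 is used here only because it gives the same result on every run, whereas Python's built-in hash() for strings is randomized per process.

```python
import hashlib
from typing import List


def pick_server(client_ip: str, servers: List[str]) -> str:
    """Map a client IP to a server; the same IP always maps to the same server."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    # Turn the first 8 bytes of the digest into an integer, then into an index.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["server-1", "server-2"]
first = pick_server("203.0.113.7", servers)
second = pick_server("203.0.113.7", servers)
# first == second: repeat requests from this IP land on the same server.
```

Note that with this modulo approach, adding or removing a server changes the mapping for many clients. Consistent hashing is a common way to reduce that reshuffling.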

5. Random

Select a server randomly (sometimes used for quick prototypes or specialized cases).
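
Random selection is short enough to sketch in one function. The pick_random name is hypothetical, and the optional rng parameter is only there so the choice can be made repeatable in tests.

```python
import random
from typing import List


def pick_random(servers: List[str], rng: random.Random = random.Random()) -> str:
    """Return one server chosen uniformly at random."""
    return rng.choice(servers)


chosen = pick_random(["server-1", "server-2", "server-3"])
```

Over many requests, a uniform random choice spreads load roughly evenly without the load balancer having to keep any state.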

These are the common algorithms load balancers use to distribute requests.

Types of Load Balancers

Load balancers can be categorized in a few ways:

Hardware vs Software

Hardware Load Balancer:

Specialized physical devices often used in data centers (e.g., F5, Citrix ADC). They tend to be very powerful but can be expensive and less flexible.

Software Load Balancer:

Runs on standard servers or virtual machines (e.g., Nginx, HAProxy, Envoy). These are often open source or lower-cost solutions, highly configurable, and simpler to integrate with cloud providers.

Layer 4 vs Layer 7

Layer 4 (Transport Layer):

Distributes traffic based on network information like IP address and port. It doesn't inspect application-layer data (HTTP headers, cookies, etc.).

Layer 7 (Application Layer):

Can make distribution decisions based on HTTP headers, cookies, URL path, etc. This is useful for advanced routing and application-aware features.
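
The difference between the two layers can be sketched as two lookup functions in Python: the Layer 4 decision only sees the destination port, while the Layer 7 decision can look at the request path. All pool addresses, ports, and path prefixes below are made up for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical backend pools, for illustration only.
DEFAULT_POOL: List[str] = ["10.0.0.1", "10.0.0.2"]
L4_POOLS: Dict[int, List[str]] = {443: ["10.0.0.1", "10.0.0.2"], 5432: ["10.0.9.1"]}
L7_RULES: List[Tuple[str, List[str]]] = [
    ("/api/", ["10.0.1.1", "10.0.1.2"]),
    ("/static/", ["10.0.2.1"]),
]


def choose_pool_l4(dst_port: int) -> List[str]:
    """Layer 4: decide from transport-level information only (here, the port)."""
    return L4_POOLS.get(dst_port, DEFAULT_POOL)


def choose_pool_l7(path: str) -> List[str]:
    """Layer 7: decide from application data (here, the URL path prefix)."""
    for prefix, pool in L7_RULES:
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL
```

Either function only narrows the choice down to a pool; one of the routing algorithms above would then pick a single server within it.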

Benefits of Load Balancer

There are three main benefits of using load balancers:

High Availability: Load balancers protect the application from downtime, ensuring an uninterrupted user experience in the event of server failure. They also make it possible to do server maintenance without having to take critical applications offline.
Performance: Load balancers improve application response time by spreading requests across servers, which reduces latency.
Scalability: Load balancers enable the system to handle sudden spikes in traffic and to scale resources up or down as needed. They also make it easy to scale server infrastructure without downtime.

Example Use Cases

E-Commerce Website: During peak sale events (like Black Friday), a sudden traffic spike occurs. A load balancer helps distribute that traffic evenly and ensures the system can scale horizontally by adding more servers.
Video Streaming Platform: When thousands of users watch a popular live event, a load balancer spreads the viewers across streaming servers so that no single server becomes a bottleneck.
API Gateway: Modern microservices often place a load balancer or reverse proxy in front of internal services to route API calls based on route paths or hostnames.

Conclusion

A load balancer is often at the heart of a modern distributed system. It acts as a traffic cop, ensuring that no single server is overwhelmed by requests and that your application remains available, even when individual servers fail.

By using load balancers effectively, you can:

Scale out your infrastructure seamlessly.
Enhance availability and fault tolerance.
Improve performance by offloading tasks like SSL termination and caching.
Maintain your systems more easily, taking servers offline without impacting the entire application.
