Elastic Load Balancers in AWS: What You Need To Know
There's an AWS load balancer to suit every use case. Find out what makes them different and how to choose the right elastic load balancer for your workload.
Want to get up and running fast in the cloud? We provide cloud and DevOps consulting to startups and small to medium-sized enterprises. Schedule a no-obligation call today.
Introduction to Elastic Load Balancing
Elastic load balancing is a critical part of AWS infrastructure design. It automatically distributes incoming requests across the registered targets in one or more target groups, and it can route client requests to different target groups based on the type of request. Elastic load balancing scales automatically with incoming traffic and integrates with Auto Scaling groups, so new instances are registered as targets as your fleet grows. Connection draining (also called deregistration delay) stops new connections from being sent to a target that is deregistering or unhealthy while allowing in-flight requests to complete, which lets you take a target out of service without disrupting existing connections. Elastic load balancing supports a range of features for a variety of use cases. AWS offers three types of load balancer: application (ALB), network (NLB), and gateway (GWLB). You may also hear about classic load balancers, which we'll discuss at the very end of this article.
Application Load Balancers
What are application load balancers? This type of load balancer is best suited to HTTP and HTTPS traffic and can be used to load balance requests to web applications, including those hosted on Amazon EC2 instances, containers, and IP addresses. ALBs operate at the application layer (layer 7) of the Open Systems Interconnection (OSI) model, so they can route requests to different target groups using advanced rules such as path-based and host-based routing. When an application load balancer receives a request, it evaluates its listener rules in priority order, applies the first rule that matches, and forwards the request to that rule's target group. Application load balancers are designed to handle complex application traffic and provide features such as SSL offloading, content-based routing, and health checks.
Application Load Balancer Features
An application load balancer listens for incoming traffic on a given port and protocol. HTTP listeners typically listen on port 80, while HTTPS listeners typically listen on port 443. When creating an application load balancer, you must specify at least one listener, and you can optionally configure more, depending on your needs. For example, you might configure two listeners – one for HTTP traffic on port 80 and one for HTTPS traffic on port 443 – with listener rules that route requests to different target groups based on the protocol, or that redirect HTTP requests to HTTPS.
A target group is a collection of targets (such as EC2 instances, IP addresses, containers, and even Lambda functions) that you can load balance. When creating an application load balancer, you must specify at least one target group. You can optionally configure additional groups, depending on your needs. For example, you might configure two groups – one for each application – so that traffic is routed to different targets based on the request's host header, path, or query parameters. You can also configure a rule to return a fixed HTTP response without routing to any target.
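To make rule evaluation concrete, here is a minimal Python sketch of how an ALB chooses a target group: listener rules are evaluated in priority order, the first match wins, and the listener's default action is the fallback. This is purely illustrative – the rule conditions and target group names are made up, and this is not the AWS API.

```python
# Hypothetical sketch of ALB listener-rule evaluation, not the AWS API.
# Rules are evaluated in priority order (lowest number first); the first
# matching rule decides the target group; otherwise the default applies.

def choose_target_group(request, rules, default_group):
    """Return the target group chosen for this request."""
    for rule in sorted(rules, key=lambda r: r["priority"]):
        if rule["condition"](request):
            return rule["target_group"]
    return default_group

# Hypothetical rules: path-based routing for an API, host-based for admin.
rules = [
    {"priority": 10,
     "condition": lambda req: req["path"].startswith("/api"),
     "target_group": "api-servers"},
    {"priority": 20,
     "condition": lambda req: req["host"] == "admin.example.com",
     "target_group": "admin-servers"},
]

print(choose_target_group(
    {"path": "/api/users", "host": "www.example.com"}, rules, "web-servers"))
# -> api-servers
```

A request that matches no rule falls through to the default target group, just as an unmatched request on a real ALB falls through to the listener's default action.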
Once you have created a target group, you need to register targets with it. This is typically done by specifying the instance ID, or the IP address of the EC2 instance or on-premises server. You can register targets using the AWS Management Console, the AWS Command Line Interface (CLI), or an AWS SDK.
A health check determines whether a target is healthy and therefore able to receive traffic from the application load balancer. When you create a target group, you must specify a health check, defined by a protocol (HTTP/HTTPS), port, and path. The health check passes or fails depending on the response code received from the target, so the load balancer routes requests only to healthy targets.
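As a sketch of how response-code matching works, the snippet below (illustrative only, not AWS code) accepts a success-codes matcher in the formats a target group allows, such as "200", "200,302", or "200-299", and decides whether a given response counts as healthy:

```python
def health_check_passes(status_code, matcher="200"):
    """Return True if the HTTP status code counts as healthy.

    `matcher` mimics a target group's success-codes setting, which can be
    a single code ("200"), a list ("200,302"), or a range ("200-299").
    """
    for part in matcher.split(","):
        if "-" in part:
            low, high = part.split("-")
            if int(low) <= status_code <= int(high):
                return True
        elif status_code == int(part):
            return True
    return False

print(health_check_passes(302, "200,302"))  # True
print(health_check_passes(503, "200-299"))  # False
```

On a real load balancer, a target also has to pass or fail several consecutive checks (the healthy/unhealthy thresholds) before its status actually changes.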
The routing algorithm used by an application load balancer determines how traffic is routed to targets in a target group. The default is round-robin, which cycles through the targets in a target group so that each receives an equal share of requests. Alternatively, you can configure the target group to use the least-outstanding-requests algorithm, which routes each request to the target with the fewest requests in flight.
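The difference between the two algorithms is easy to see in a toy simulation (the target names and in-flight counts below are made up for illustration):

```python
from itertools import cycle

# Toy comparison of the two ALB routing algorithms.

class RoundRobin:
    """Cycle through targets so each gets an equal share of requests."""
    def __init__(self, targets):
        self._targets = cycle(targets)

    def pick(self):
        return next(self._targets)

def least_outstanding(in_flight):
    """Pick the target with the fewest outstanding (in-flight) requests."""
    return min(in_flight, key=in_flight.get)

rr = RoundRobin(["t1", "t2", "t3"])
print([rr.pick() for _ in range(4)])                   # ['t1', 't2', 't3', 't1']
print(least_outstanding({"t1": 8, "t2": 2, "t3": 5}))  # t2
```

Round-robin ignores how busy each target is, which is why least-outstanding-requests tends to behave better when request durations vary widely.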
Cross-zone load balancing
By default, an application load balancer distributes traffic evenly across registered targets in all enabled Availability Zones in a Region. This helps keep your application highly available and able to withstand the loss of an AZ.
You can optionally configure your load balancer to generate access logs, which provide detailed information about the requests made to it. This information can be useful for troubleshooting and performance analysis. Access logging is disabled by default; when enabled, the load balancer delivers logs to an S3 bucket you specify.
Web Application Firewall
AWS WAF can be used with your application load balancer, enabling you to protect your web applications from common web exploits that could affect application availability, compromise security, or consume excessive resources.
Application load balancers support HTTPS listeners, which allow you to terminate SSL/TLS encryption at the load balancer and offload the CPU-intensive work of decryption from your targets. This is often referred to as TLS offloading or SSL offloading. AWS Certificate Manager (ACM) provides and manages the certificates for this integration.
ALBs support HTTP/2, which allows for multiplexing and header compression. This can improve performance by reducing the amount of data that needs to be transferred between the load balancer and the target.
ALBs support WebSockets, which allows for full-duplex communication between the load balancer and the target. This can be useful for applications that require real-time communication, such as chat applications.
Application load balancers can have sticky sessions enabled to ensure that requests from the same client are always routed to the same target. Sticky sessions prevent session problems in applications that keep session state on the target itself rather than in an external store such as DynamoDB or ElastiCache.
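The mechanism behind duration-based stickiness can be sketched as follows. Note this is a simplification: for clarity the cookie below stores the target name directly, whereas a real ALB's AWSALB cookie holds an opaque, encrypted value (and expires after a configurable duration, which this sketch omits).

```python
def route_with_stickiness(cookies, targets):
    """Route a request, honoring a stickiness cookie when possible.

    Simplified sketch: the cookie stores the target name directly.
    Returns (chosen_target, cookies_to_send_back).
    """
    sticky = cookies.get("AWSALB")
    if sticky in targets:
        return sticky, cookies          # same client -> same target
    chosen = targets[0]                 # stand-in for the routing algorithm
    return chosen, {**cookies, "AWSALB": chosen}

# First request: no cookie, so a target is chosen and the cookie is set.
target, cookies = route_with_stickiness({}, ["t1", "t2"])
print(target)  # t1
# A follow-up request carrying the cookie goes back to the same target.
print(route_with_stickiness(cookies, ["t1", "t2"])[0])  # t1
```

If the sticky target has been deregistered or is unhealthy, the request falls back to the normal routing algorithm and a fresh cookie is issued, which is also how a real ALB recovers.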
Application load balancers can be created with one of two schemes – internet-facing or internal. An internet-facing ALB's DNS name resolves to addresses reachable by clients outside your VPC, while an internal ALB's DNS name resolves to private IP addresses and is reachable only from within your network.
ALBs support security groups, which allow you to control traffic to the load balancer based on the source or destination of the traffic.
Network Load Balancers
Network load balancers are best suited to load balancing TCP and UDP traffic where low latency and extreme performance are required. They operate at the connection level (layer 4), routing each connection to a target based on IP address and port number. NLBs can handle millions of requests per second while maintaining extremely low latencies. Note that NLBs may not be available in every Availability Zone. Unlike ALBs, a network load balancer does not support security groups, so it accepts all traffic that reaches its listeners.
Network Load Balancer Features
Network load balancers support the TCP, TLS, UDP, and TCP_UDP protocols, and they forward requests over TCP and TCP_UDP. WebSockets can be used with NLB listeners.
You can register EC2 instances, ECS containers, IP addresses, and even ALBs as targets with a network load balancer. When you register a target, you specify the port number the load balancer should route traffic to. With NLBs, if you add a target to a target group by instance ID, the source IP of a client request is the client's IP. If you add a target by IP address, however, the source IP depends on the protocol: for TCP and TLS listeners the source appears to be the load balancer, while for UDP it remains the client IP. When an NLB is used with a VPC endpoint or AWS Global Accelerator, the source IP is the NLB node's private IP.
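The source-IP behavior above can be condensed into a small lookup. This is just a restatement of the rules described in this section, encoded in Python for reference – it is not an AWS API:

```python
def nlb_source_ip_seen_by_target(target_type, protocol,
                                 via_endpoint_or_accelerator=False):
    """Summarize whose IP the target sees as the request's source.

    target_type: "instance" (registered by instance ID) or "ip".
    protocol:    "TCP", "TLS", or "UDP".
    The VPC endpoint / Global Accelerator case overrides everything else.
    """
    if via_endpoint_or_accelerator:
        return "NLB node private IP"
    if target_type == "instance":
        return "client IP"
    # Registered by IP address: TCP/TLS show the load balancer, UDP the client.
    return "load balancer IP" if protocol in ("TCP", "TLS") else "client IP"

print(nlb_source_ip_seen_by_target("instance", "TCP"))  # client IP
print(nlb_source_ip_seen_by_target("ip", "TLS"))        # load balancer IP
print(nlb_source_ip_seen_by_target("ip", "UDP"))        # client IP
```

This matters in practice when your application logs or allow-lists client addresses: with IP targets over TCP/TLS, you lose the client IP unless you use another mechanism (such as proxy protocol) to carry it.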
Like ALBs, NLBs support health checks, which are configured on the target group. NLB health checks can use the TCP, HTTP, or HTTPS protocols, though they offer fewer tuning options than ALB health checks.
Cross-zone load balancing
An NLB creates a load balancer node in each subnet (Availability Zone) you select. Cross-zone load balancing is disabled by default for NLBs, so each node distributes traffic only to targets in its own zone; when you enable it, each node distributes traffic evenly across registered targets in all enabled zones.
The idle timeout for TCP connections is 350 seconds, and for UDP flows it is 120 seconds. If no data is received from a client within that time, the load balancer closes the connection.
You can choose whether to use IPv4 only or dual-stack (IPv4 and IPv6) addressing with an NLB. IPv4-only NLBs support TCP, TLS, UDP, and TCP_UDP listeners, while dual-stack NLBs support only TCP and TLS listeners. NLBs have one IP address per enabled Availability Zone.
Network Load Balancers have DNS names automatically attached to them at creation, but you can also attach a custom domain name.
Gateway Load Balancers
Gateway load balancers are more of a networking tool, designed to balance traffic with low latency to third-party virtual appliances – such as firewalls and monitoring devices offered through the AWS Marketplace. They are used when all incoming traffic must be inspected before it reaches the destination application. GWLBs operate at layer 3 of the OSI model and communicate with target appliances on port 6081 using the GENEVE encapsulation protocol. Unlike other kinds of load balancer, GWLBs do not generate access logs, so you must enable logging within the target appliance. You can, however, still use CloudWatch metrics and VPC Flow Logs to monitor traffic to your GWLB.
Gateway Load Balancer Features
GWLBs listen for all packets across all ports via GWLB endpoints in your VPC. The GWLB endpoint routes the traffic, similarly to PrivateLink, to a gateway load balancer in another VPC, though the whole architecture can also be set up within a single VPC if desired.
You can register EC2 instances and IPs for on-premises resources as targets with a GWLB.
Though all application communication between the GWLB and targets is on port 6081 using the GENEVE protocol, you can create health checks on other ports and protocols as you can for other elastic load balancing types.
The idle timeout for TCP connections to a gateway load balancer is 350 seconds and 120 seconds for non-TCP connections.
GWLBs do not support IPv6 addresses, so you must choose IPv4 when creating one.
GWLBs are reached through gateway load balancer endpoints, which you reference in the route tables of the consumer VPC.
Multi-AZ load balancing
You can configure a GWLB to route traffic to targets across Availability Zones for high availability.
What is a Classic Load Balancer?
We've left classic load balancers for last because they are the oldest, now-deprecated form of load balancer in AWS. They were primarily designed for load balancing web traffic and operate at both layer 4 and layer 7 of the OSI model, balancing TCP, SSL/TLS, HTTP, and HTTPS traffic; they do not support UDP or other protocols. Classic load balancers also lack the advanced routing features of ALBs, such as path-based and host-based routing, offering only basic cookie-based session stickiness. AWS retired EC2-Classic on August 15, 2022, and classic load balancers are no longer recommended for new workloads; AWS advises migrating to ALBs or NLBs.
In this article, we have covered the basics of elastic load balancing in AWS, the different types of load balancers and their features, and a word about the now-retired classic load balancer. Hopefully this has given you a better understanding of how these load balancers work and a feature comparison to help you choose the right one for your use case. If you have any questions or comments, feel free to reach out to us for assistance.