High Availability Architecture on AWS
A comprehensive guide to building a fault-tolerant, auto-scaling web application architecture on AWS using Terraform.
Project Overview
Designing for failure is a core principle of cloud computing. This project demonstrates how to build a highly available (HA) and fault-tolerant web application architecture on AWS using Infrastructure as Code (IaC) with Terraform.
The goal is to create an infrastructure that can withstand the failure of a single Availability Zone (AZ) without service interruption and automatically scale based on traffic demand.
Architecture
The architecture consists of the following components distributed across two Availability Zones:
- VPC: A custom Virtual Private Cloud with public and private subnets.
- Application Load Balancer (ALB): Distributes incoming traffic to EC2 instances.
- Auto Scaling Group (ASG): Automatically adjusts the number of EC2 instances based on CPU utilization.
- EC2 Instances: Web servers running Apache (httpd), placed in private subnets so they are not directly reachable from the internet.
- NAT Gateways: Allow private instances to access the internet for updates without being exposed.
- Security Groups: Firewall rules that expose only the ALB to the internet and let the web servers accept HTTP only from the ALB.
Infrastructure as Code (Terraform)
Below are the key Terraform configurations used to provision this infrastructure.
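The snippets in this guide assume an AWS provider is already configured. A minimal sketch of that setup follows; the region is inferred from the us-east-1a and us-east-1b zones used in the VPC module, and both version constraints are assumptions rather than values taken from the original project.

```hcl
terraform {
  required_version = ">= 1.0" # assumed minimum; adjust to your tooling

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0" # assumed constraint, not from the original project
    }
  }
}

provider "aws" {
  # Inferred from the us-east-1a / us-east-1b zones used by the VPC module
  region = "us-east-1"
}
```

Credentials are expected to come from the environment or a shared AWS profile rather than from the configuration itself.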
1. VPC and Networking
We start by defining the VPC and subnets with the community terraform-aws-modules/vpc/aws module, which also creates the route tables, internet gateway, and NAT gateways.
module "vpc" {
  source = "terraform-aws-modules/vpc/aws"

  name = "ha-vpc"
  cidr = "10.0.0.0/16"

  azs             = ["us-east-1a", "us-east-1b"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24"]

  enable_nat_gateway = true
  single_nat_gateway = false # For high availability, we need one per AZ
  enable_vpn_gateway = false

  tags = {
    Terraform   = "true"
    Environment = "prod"
  }
}
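Hard-coded zone names tie the configuration to a single region. As an alternative sketch, the zones could be looked up at plan time. The data source label `available` and the local value `azs` are hypothetical names introduced here for illustration.

```hcl
# Look up the zones currently available in the provider's region
data "aws_availability_zones" "available" {
  state = "available"
}

locals {
  # Take the first two zones to match the two-AZ design of this guide
  azs = slice(data.aws_availability_zones.available.names, 0, 2)
}
```

The module's `azs` argument would then be set to `local.azs`. The module also accepts `one_nat_gateway_per_az = true`, which states the one-gateway-per-zone intent explicitly instead of relying on the subnet count.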
2. Security Groups
Two security groups control traffic flow: the ALB accepts HTTP from the internet, and the web servers accept HTTP only from the ALB's security group.
resource "aws_security_group" "alb_sg" {
  name        = "alb-sg"
  description = "Allow HTTP inbound traffic"
  vpc_id      = module.vpc.vpc_id

  ingress {
    description = "HTTP from Internet"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_security_group" "web_sg" {
  name        = "web-server-sg"
  description = "Allow HTTP from ALB"
  vpc_id      = module.vpc.vpc_id

  ingress {
    description     = "HTTP from ALB"
    from_port       = 80
    to_port         = 80
    protocol        = "tcp"
    security_groups = [aws_security_group.alb_sg.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
3. Application Load Balancer
The ALB is the entry point for the application. It sits in the public subnets and forwards HTTP requests to the web target group.
resource "aws_lb" "main" {
  name               = "ha-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb_sg.id]
  subnets            = module.vpc.public_subnets
}

resource "aws_lb_listener" "front_end" {
  load_balancer_arn = aws_lb.main.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.web.arn
  }
}

resource "aws_lb_target_group" "web" {
  name     = "web-target-group"
  port     = 80
  protocol = "HTTP"
  vpc_id   = module.vpc.vpc_id

  health_check {
    path              = "/"
    healthy_threshold = 2
  }
}
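Because the ALB is the only public entry point, it helps to expose its DNS name once the stack is created. A small sketch is below; the output name `alb_dns_name` is a hypothetical choice, not part of the original configuration.

```hcl
# Expose the load balancer's public hostname after apply
output "alb_dns_name" {
  description = "Public DNS name of the load balancer"
  value       = aws_lb.main.dns_name
}
```

After an apply, `terraform output alb_dns_name` prints the hostname to open in a browser or pass to curl.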
4. Auto Scaling Group
Finally, we configure the Auto Scaling Group to manage our EC2 instances.
resource "aws_launch_template" "web" {
  name_prefix = "web-server-"
  # Amazon Linux 2. AMI IDs are region-specific, so verify this ID exists in your region.
  image_id      = "ami-0c55b159cbfafe1f0"
  instance_type = "t3.micro"

  network_interfaces {
    associate_public_ip_address = false
    security_groups             = [aws_security_group.web_sg.id]
  }

  # Install and start Apache, then publish a page showing the instance hostname
  user_data = base64encode(<<-EOF
    #!/bin/bash
    yum update -y
    yum install -y httpd
    systemctl start httpd
    systemctl enable httpd
    echo "<h1>Hello from $(hostname -f)</h1>" > /var/www/html/index.html
    EOF
  )
}
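A hard-coded AMI ID is valid in only one region and goes stale as new images are published. One possible alternative is to look the image up at plan time. In this sketch the data source label `amazon_linux_2` is hypothetical, and the name pattern is an assumption about how Amazon Linux 2 images are commonly named, so it should be checked against the images visible in your account.

```hcl
# Find the most recent Amazon Linux 2 image published by Amazon
data "aws_ami" "amazon_linux_2" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"] # assumed naming pattern
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}
```

The launch template would then use `image_id = data.aws_ami.amazon_linux_2.id` in place of the literal ID. The trade-off is that a newly published image changes the launch template on the next plan.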
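resource "aws_autoscaling_group" "web" {
  desired_capacity    = 2
  max_size            = 4
  min_size            = 2
  vpc_zone_identifier = module.vpc.private_subnets
  target_group_arns   = [aws_lb_target_group.web.arn]

  # Replace instances that fail the ALB health check, not only EC2 status checks
  health_check_type         = "ELB"
  health_check_grace_period = 300 # seconds allowed for boot and httpd install

  launch_template {
    id      = aws_launch_template.web.id
    version = "$Latest"
  }
}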
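The architecture overview describes scaling on CPU utilization, while the group definition only sets size bounds. A target tracking policy is one way to connect the two. This is a sketch: the resource label `cpu_target`, the policy name, and the 50 percent target are assumptions, not values from the original project.

```hcl
# Keep average CPU across the group near the target by adding or removing instances
resource "aws_autoscaling_policy" "cpu_target" {
  name                   = "cpu-target-tracking"
  autoscaling_group_name = aws_autoscaling_group.web.name
  policy_type            = "TargetTrackingScaling"

  target_tracking_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ASGAverageCPUUtilization"
    }

    target_value = 50.0 # assumed target; tune for the workload
  }
}
```

Once a policy adjusts the group, the live desired capacity can differ from the value in the configuration. A common way to stop Terraform from resetting it on the next apply is a `lifecycle` block with `ignore_changes = [desired_capacity]` on the Auto Scaling Group.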
Deployment
To deploy this infrastructure:
- Initialize Terraform:
  terraform init
- Review the plan:
  terraform plan
- Apply the changes:
  terraform apply
Conclusion
This architecture provides a solid starting point for production workloads. Defining it in Terraform keeps the infrastructure reproducible, version-controlled, and easy to manage. Spreading instances across two Availability Zones behind the ALB lets the application keep serving traffic if one AZ fails, and the Auto Scaling Group keeps at least two instances running and can scale out to four under load.