AWS × Caddy: One-Shot Multi-Site Reverse Proxy + Load-Balancing Gateway (Terraform + ECS Fargate)

0) Directory Structure

aws-caddy-gateway/
├─ caddy/
│  ├─ Caddyfile            # multi-site + reverse proxy + load balancing
│  └─ Dockerfile           # custom image with the Caddyfile baked in
├─ terraform/
│  ├─ main.tf              # VPC, ALB, ACM, Route53, ECS, EFS, IAM
│  ├─ variables.tf
│  ├─ outputs.tf
│  └─ versions.tf
└─ scripts/
   └─ ecr_push.sh          # build and push the Caddy image to ECR

1) Caddy (Multi-Site + Load Balancing)

caddy/Caddyfile

{
  # Caddy sits behind the ALB as the inner reverse proxy; outer TLS is
  # terminated by ACM, so the sites below are served over plain HTTP.
  # If you want Caddy to issue its own certificates, see the NLB option at the end.
  log {
    level INFO
  }
}

# ALB health checks arrive with the target IP as the Host header,
# so answer /health on a catch-all HTTP site
:80 {
  respond /health 200
}

# Main site (static): example.com
http://example.com {
  encode gzip zstd
  root * /srv/www
  file_server
  respond /health 200
}

# API (load-balanced across replicas): api.example.com
http://api.example.com {
  encode gzip
  reverse_proxy {
    to app1.internal:5000
    to app2.internal:5000
    lb_policy least_conn
    health_uri /health
    health_interval 5s
    health_timeout 2s
    fail_duration 30s
  }
}

# Admin console: admin.example.com
http://admin.example.com {
  encode gzip
  reverse_proxy admin.internal:7000
  header {
    X-Powered-By "Caddy on ECS"
  }
}
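The `lb_policy least_conn` plus passive `fail_duration` behavior configured for api.example.com can be sketched in a few lines of Python (a simplified model with hypothetical helper names; Caddy's real implementation also runs the active `/health` probes configured above):

```python
import time

class Upstream:
    """One reverse_proxy backend, e.g. app1.internal:5000."""
    def __init__(self, addr):
        self.addr = addr
        self.active = 0          # in-flight requests
        self.failed_until = 0.0  # passive health: skip until this time

    def available(self, now):
        return now >= self.failed_until

def pick_least_conn(upstreams, now=None):
    """Choose the available upstream with the fewest in-flight requests."""
    now = time.monotonic() if now is None else now
    candidates = [u for u in upstreams if u.available(now)]
    return min(candidates, key=lambda u: u.active) if candidates else None

def mark_failed(upstream, fail_duration=30.0, now=None):
    """A request failed: take the upstream out of rotation for fail_duration."""
    now = time.monotonic() if now is None else now
    upstream.failed_until = now + fail_duration
```

With two upstreams where app1 has three requests in flight, `pick_least_conn` returns app2; after `mark_failed(app2)`, traffic falls back to app1 for the next 30 seconds, matching `fail_duration 30s`.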

Notes

  • app1.internal / app2.internal / admin.internal are internal hostnames from **ECS Service Discovery (Cloud Map)**, which the Terraform below enables.

  • If you don't have backends yet, point reverse_proxy at a placeholder container or a test port for now.

  • If you also want Caddy to serve static files, mount your build output into the image at /srv/www (the Dockerfile below has this prepared).

caddy/Dockerfile

FROM caddy:2.8
# Optional: bake the static site into the image (/srv/www)
# COPY ./site/ /srv/www/
COPY ./Caddyfile /etc/caddy/Caddyfile

2) Push the Caddy Image to ECR

scripts/ecr_push.sh

#!/usr/bin/env bash
set -euo pipefail

AWS_REGION=${AWS_REGION:-"ap-southeast-1"}
REPO_NAME=${REPO_NAME:-"caddy-gateway"}
IMAGE_TAG=${IMAGE_TAG:-"v1"}

ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
REPO_URL="${ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/${REPO_NAME}"

aws ecr describe-repositories --repository-names "${REPO_NAME}" --region ${AWS_REGION} >/dev/null 2>&1 || \
  aws ecr create-repository --repository-name "${REPO_NAME}" --region ${AWS_REGION} >/dev/null

aws ecr get-login-password --region ${AWS_REGION} | docker login --username AWS --password-stdin "${ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com"

docker build -t "${REPO_NAME}:${IMAGE_TAG}" ./caddy
docker tag "${REPO_NAME}:${IMAGE_TAG}" "${REPO_URL}:${IMAGE_TAG}"
docker push "${REPO_URL}:${IMAGE_TAG}"

echo "Pushed: ${REPO_URL}:${IMAGE_TAG}"

Run it:

chmod +x scripts/ecr_push.sh
AWS_REGION=ap-southeast-1 REPO_NAME=caddy-gateway IMAGE_TAG=v1 ./scripts/ecr_push.sh

Note the REPO_URL:IMAGE_TAG it prints — you'll pass it to Terraform shortly.
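For reference, the URI the script prints always follows ECR's fixed naming scheme; a small helper (hypothetical function name, mirroring the script's REPO_URL assembly):

```python
def ecr_image_uri(account_id: str, region: str, repo: str, tag: str) -> str:
    """Compose <account>.dkr.ecr.<region>.amazonaws.com/<repo>:<tag>."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repo}:{tag}"

print(ecr_image_uri("123456789012", "ap-southeast-1", "caddy-gateway", "v1"))
# 123456789012.dkr.ecr.ap-southeast-1.amazonaws.com/caddy-gateway:v1
```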


3) Terraform Infrastructure

Goals:

  • VPC (2 public subnets)

  • ALB (HTTP→HTTPS, TLS terminated with ACM)

  • ACM certificate (automatic DNS validation via Route53)

  • ECS Fargate cluster + service (running Caddy)

  • Service Discovery (Cloud Map) for the backends Caddy proxies to

  • EFS (optional persistence for Caddy, e.g. a future /data mount)

terraform/versions.tf

terraform {
  required_version = ">= 1.6.0"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.60"
    }
  }
}

terraform/variables.tf

variable "aws_region" {
  type    = string
  default = "ap-southeast-1"
}

variable "domain" {
  type        = string
  description = "Root domain, e.g. example.com"
}

variable "subdomains" {
  type    = list(string)
  default = ["api", "admin"]
}

variable "hosted_zone_id" {
  type        = string
  description = "Route53 hosted zone ID for the domain"
}

variable "caddy_image" {
  type        = string
  description = "ECR image, e.g. 123456789012.dkr.ecr.ap-southeast-1.amazonaws.com/caddy-gateway:v1"
}

variable "caddy_cpu" {
  type    = number
  default = 512
}

variable "caddy_memory" {
  type    = number
  default = 1024
}

terraform/main.tf (condensed, runnable)

provider "aws" {
  region = var.aws_region
}

# ── VPC (public subnets) ───────────────────────────────
resource "aws_vpc" "main" {
  cidr_block           = "10.20.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true
  tags = { Name = "caddy-vpc" }
}

resource "aws_internet_gateway" "igw" {
  vpc_id = aws_vpc.main.id
}

resource "aws_subnet" "public_a" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.20.1.0/24"
  availability_zone       = data.aws_availability_zones.available.names[0]
  map_public_ip_on_launch = true
  tags = { Name = "public-a" }
}

resource "aws_subnet" "public_b" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.20.2.0/24"
  availability_zone       = data.aws_availability_zones.available.names[1]
  map_public_ip_on_launch = true
  tags = { Name = "public-b" }
}

data "aws_availability_zones" "available" {}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.igw.id
  }
}

resource "aws_route_table_association" "a" {
  subnet_id      = aws_subnet.public_a.id
  route_table_id = aws_route_table.public.id
}

resource "aws_route_table_association" "b" {
  subnet_id      = aws_subnet.public_b.id
  route_table_id = aws_route_table.public.id
}

# ── Security groups ────────────────────────────────────
resource "aws_security_group" "alb_sg" {
  name        = "alb-sg"
  description = "ALB ingress"
  vpc_id      = aws_vpc.main.id

  ingress {
    protocol    = "tcp"
    from_port   = 80
    to_port     = 80
    cidr_blocks = ["0.0.0.0/0"]
  }
  ingress {
    protocol    = "tcp"
    from_port   = 443
    to_port     = 443
    cidr_blocks = ["0.0.0.0/0"]
  }
  egress {
    protocol    = "-1"
    from_port   = 0
    to_port     = 0
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_security_group" "ecs_sg" {
  name        = "ecs-sg"
  description = "ECS tasks"
  vpc_id      = aws_vpc.main.id

  ingress {
    protocol        = "tcp"
    from_port       = 80
    to_port         = 80
    security_groups = [aws_security_group.alb_sg.id]
  }
  # NFS from the tasks to the EFS mount targets (both use this SG)
  ingress {
    protocol  = "tcp"
    from_port = 2049
    to_port   = 2049
    self      = true
  }
  egress {
    protocol    = "-1"
    from_port   = 0
    to_port     = 0
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# ── ACM certificate (DNS validation) ───────────────────
resource "aws_acm_certificate" "cert" {
  domain_name       = var.domain
  validation_method = "DNS"
  subject_alternative_names = [for s in var.subdomains : "${s}.${var.domain}"]
}

resource "aws_route53_record" "cert_validation" {
  for_each = {
    for dvo in aws_acm_certificate.cert.domain_validation_options :
    dvo.domain_name => {
      name  = dvo.resource_record_name
      type  = dvo.resource_record_type
      value = dvo.resource_record_value
    }
  }
  zone_id = var.hosted_zone_id
  name    = each.value.name
  type    = each.value.type
  ttl     = 60
  records = [each.value.value]
}

resource "aws_acm_certificate_validation" "cert" {
  certificate_arn         = aws_acm_certificate.cert.arn
  validation_record_fqdns = [for r in aws_route53_record.cert_validation : r.fqdn]
}

# ── ALB + listeners + target group ─────────────────────
resource "aws_lb" "alb" {
  name               = "caddy-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb_sg.id]
  subnets            = [aws_subnet.public_a.id, aws_subnet.public_b.id]
}

resource "aws_lb_target_group" "tg_http" {
  name        = "caddy-http"
  port        = 80
  protocol    = "HTTP"
  vpc_id      = aws_vpc.main.id
  target_type = "ip"
  health_check {
    path                = "/health"
    matcher             = "200"
    interval            = 15
    timeout             = 5
    healthy_threshold   = 2
    unhealthy_threshold = 3
  }
}

resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.alb.arn
  port              = 80
  protocol          = "HTTP"
  default_action {
    type = "redirect"
    redirect {
      port        = "443"
      protocol    = "HTTPS"
      status_code = "HTTP_301"
    }
  }
}

resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.alb.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn   = aws_acm_certificate_validation.cert.certificate_arn
  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.tg_http.arn
  }
}

# ── ECS (Fargate) + Service Discovery (Cloud Map) ──────
resource "aws_ecs_cluster" "this" { name = "caddy-cluster" }

resource "aws_service_discovery_private_dns_namespace" "ns" {
  name = "internal"
  vpc  = aws_vpc.main.id
}

# Task execution role
resource "aws_iam_role" "task_exec" {
  name = "ecsTaskExecutionRole-caddy"
  assume_role_policy = data.aws_iam_policy_document.task_assume.json
}
data "aws_iam_policy_document" "task_assume" {
  statement {
    actions = ["sts:AssumeRole"]
    principals {
      type        = "Service"
      identifiers = ["ecs-tasks.amazonaws.com"]
    }
  }
}
resource "aws_iam_role_policy_attachment" "exec_attach" {
  role       = aws_iam_role.task_exec.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}

# EFS (optional /data and /config mounts)
resource "aws_efs_file_system" "efs" {
  creation_token  = "caddy-efs"
  throughput_mode = "bursting"
}

resource "aws_efs_mount_target" "a" {
  file_system_id  = aws_efs_file_system.efs.id
  subnet_id       = aws_subnet.public_a.id
  security_groups = [aws_security_group.ecs_sg.id]
}

resource "aws_efs_mount_target" "b" {
  file_system_id  = aws_efs_file_system.efs.id
  subnet_id       = aws_subnet.public_b.id
  security_groups = [aws_security_group.ecs_sg.id]
}

# Log group referenced by the task definition below (must exist, or the task fails to start)
resource "aws_cloudwatch_log_group" "caddy" {
  name              = "/ecs/caddy"
  retention_in_days = 14
}

# Task definition (Caddy)
resource "aws_ecs_task_definition" "caddy" {
  family                   = "caddy-gateway"
  network_mode             = "awsvpc"
  cpu                      = var.caddy_cpu
  memory                   = var.caddy_memory
  requires_compatibilities = ["FARGATE"]
  execution_role_arn       = aws_iam_role.task_exec.arn

  volume {
    name = "caddy-data"
    efs_volume_configuration {
      file_system_id = aws_efs_file_system.efs.id
      transit_encryption = "ENABLED"
      root_directory = "/"
    }
  }

  container_definitions = jsonencode([
    {
      "name"      : "caddy",
      "image"     : var.caddy_image,
      "essential" : true,
      "portMappings": [{ "containerPort": 80, "hostPort": 80, "protocol": "tcp" }],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group"         : "/ecs/caddy",
          "awslogs-region"        : var.aws_region,
          "awslogs-stream-prefix" : "caddy"
        }
      },
      "mountPoints": [
        { "sourceVolume": "caddy-data", "containerPath": "/data", "readOnly": false }
      ]
    }
  ])
}

# ECS service (attached to the ALB)
resource "aws_ecs_service" "caddy" {
  name            = "caddy-svc"
  cluster         = aws_ecs_cluster.this.id
  task_definition = aws_ecs_task_definition.caddy.arn
  desired_count   = 2
  launch_type     = "FARGATE"

  network_configuration {
    subnets         = [aws_subnet.public_a.id, aws_subnet.public_b.id]
    security_groups = [aws_security_group.ecs_sg.id]
    assign_public_ip = true
  }

  service_registries {
    registry_arn = aws_service_discovery_service.caddy.arn
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.tg_http.arn
    container_name   = "caddy"
    container_port   = 80
  }

  depends_on = [aws_lb_listener.https]
}

# Service discovery entry for Caddy itself (optional)
resource "aws_service_discovery_service" "caddy" {
  name = "caddy"
  dns_config {
    namespace_id = aws_service_discovery_private_dns_namespace.ns.id
    dns_records {
      ttl  = 10
      type = "A"
    }
    routing_policy = "MULTIVALUE"
  }
  health_check_custom_config { failure_threshold = 1 }
}

# A/AAAA alias records for the root domain and subdomains
resource "aws_route53_record" "root_a" {
  zone_id = var.hosted_zone_id
  name    = var.domain
  type    = "A"
  alias {
    name                   = aws_lb.alb.dns_name
    zone_id                = aws_lb.alb.zone_id
    evaluate_target_health = true
  }
}
resource "aws_route53_record" "root_aaaa" {
  zone_id = var.hosted_zone_id
  name    = var.domain
  type    = "AAAA"
  alias {
    name                   = aws_lb.alb.dns_name
    zone_id                = aws_lb.alb.zone_id
    evaluate_target_health = true
  }
}
resource "aws_route53_record" "subs" {
  for_each = toset(var.subdomains)
  zone_id  = var.hosted_zone_id
  name     = "${each.value}.${var.domain}"
  type     = "A"
  alias {
    name                   = aws_lb.alb.dns_name
    zone_id                = aws_lb.alb.zone_id
    evaluate_target_health = true
  }
}
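The CIDR layout in main.tf can be sanity-checked offline before `terraform apply`; a quick sketch with Python's standard ipaddress module (values copied from the resources above):

```python
import ipaddress

vpc = ipaddress.ip_network("10.20.0.0/16")   # aws_vpc.main
subnets = [
    ipaddress.ip_network("10.20.1.0/24"),    # public_a
    ipaddress.ip_network("10.20.2.0/24"),    # public_b
]

# Every subnet must live inside the VPC, and subnets must not overlap.
assert all(s.subnet_of(vpc) for s in subnets)
assert not subnets[0].overlaps(subnets[1])
print("CIDR layout OK")
```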

terraform/outputs.tf

output "alb_dns"       { value = aws_lb.alb.dns_name }
output "https_urls"    { value = concat([format("https://%s", var.domain)], [for s in var.subdomains : format("https://%s.%s", s, var.domain)]) }
output "log_group"     { value = "/ecs/caddy" }
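The `https_urls` output is plain string assembly; its Python equivalent, handy for feeding a smoke test (the function name is my own):

```python
def https_urls(domain: str, subdomains: list[str]) -> list[str]:
    """Mirror of the Terraform output: the root URL plus one per subdomain."""
    return [f"https://{domain}"] + [f"https://{s}.{domain}" for s in subdomains]

print(https_urls("example.com", ["api", "admin"]))
```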

4) Run It End to End

  1. Push the image to ECR

    • Run scripts/ecr_push.sh above; it yields a caddy_image like:
      123456789012.dkr.ecr.ap-southeast-1.amazonaws.com/caddy-gateway:v1
  2. Configure the Terraform variables (e.g. in terraform.tfvars):

    aws_region = "ap-southeast-1"
    domain = "example.com"
    subdomains = ["api", "admin"]
    hosted_zone_id = "Z0123456ABCDEFG" # your Route53 hosted zone ID
    caddy_image = "123456789012.dkr.ecr.ap-southeast-1.amazonaws.com/caddy-gateway:v1"

  3. Deploy

    cd terraform
    terraform init
    terraform apply -auto-approve

  4. Wait for the outputs

    • https_urls: open https://example.com / https://api.example.com / https://admin.example.com directly.

    • ACM finishes DNS validation within the first few minutes; traffic then flows ALB → Caddy (HTTP :80) → your upstreams.
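Part of that wait is simple health-check arithmetic: the ALB only marks a target healthy after `healthy_threshold` consecutive passing checks, one per `interval`. A back-of-the-envelope sketch using the target-group values above:

```python
def seconds_until_healthy(interval_s: int, healthy_threshold: int) -> int:
    """Minimum time before the ALB marks a fresh target healthy."""
    return interval_s * healthy_threshold

def seconds_until_unhealthy(interval_s: int, unhealthy_threshold: int) -> int:
    """Minimum time before a dead target is taken out of rotation."""
    return interval_s * unhealthy_threshold

print(seconds_until_healthy(15, 2))    # 30 (plus task startup and ACM validation)
print(seconds_until_unhealthy(15, 3))  # 45
```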

You now have:

  • Multiple sites (root domain + subdomains)

  • TLS terminated at the ALB (certificates auto-renewed by ACM)

  • Caddy reverse proxying with inner load balancing (least_conn + health checks)

  • Two Fargate replicas, highly available across AZs

  • The CloudWatch log group /ecs/caddy


5) How Do I Plug In My Backends?

  • Create a separate ECS service (Fargate) for each backend (API, admin console) and enable Service Discovery (Cloud Map).

  • Assuming the registered names are app1.internal / app2.internal / admin.internal, the Caddyfile already proxies to those internal hostnames.

  • Alternatively, put the backends behind a private ALB/NLB and have Caddy proxy to its private DNS name.


6) FAQ

  • I want Caddy to manage TLS/certificates itself

    See "Option B (NLB + Caddy TLS)" below. For corporate use, the current template (ALB + ACM) is usually the better choice: certificates stay visible in the console and compliance is easier.

  • Where do static sites live?

    Two options:

    1) Bake them into the Caddy image at /srv/www

    2) Use S3 + CloudFront instead, and let Caddy handle only the API proxying.

  • How do I do canary releases / scale out?

    Adjust the ECS service's desired_count or attach an Auto Scaling policy; ALB + Fargate roll out updates without dropping traffic.


7) Optional: Option B (NLB Passthrough + Caddy-Managed Certificates)

If Caddy must manage certificates itself (e.g. wildcard certs via acme_dns route53), change two things:

  • Replace the ALB with an NLB (TCP passthrough of 80/443 to Caddy)

  • Add this to the top of the Caddyfile (note: the stock caddy image does not ship the route53 DNS module, so build a custom binary, e.g. with xcaddy):

    {
      acme_dns route53 {
        access_key_id     <YOUR_KEY>
        secret_access_key <YOUR_SECRET>
        region            ap-southeast-1
      }
      email admin@example.com
    }
  • Route53 still points at the NLB; Caddy completes wildcard issuance via DNS-01 and terminates TLS itself.

Note: an NLB offers no layer-7 routing or WAF, so observability and rules move into the Caddy layer; the HTTP→HTTPS redirect is also Caddy's job.
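For the curious, the DNS-01 challenge that acme_dns route53 automates is specified in RFC 8555 §8.4: publish a TXT record at _acme-challenge.&lt;name&gt; whose value is the base64url-encoded SHA-256 of the key authorization. A minimal sketch (the key authorization below is a made-up placeholder):

```python
import base64
import hashlib

def dns01_txt_value(key_authorization: str) -> str:
    """base64url(SHA-256(key_authorization)) without padding, per RFC 8555."""
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")

record_name = "_acme-challenge.example.com"
txt_value = dns01_txt_value("fake-token.fake-account-thumbprint")
print(record_name, txt_value)
```

This is the record the route53 module creates via the Route53 API, waits to propagate, and then cleans up after validation.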


8) Security and Tuning

  • WAF: with an ALB in front, attaching AWS WAF takes effect immediately.

  • Least privilege: keep only the necessary policies on the ECS task execution role; if you use DNS-01, additionally grant Route53 write permissions.

  • HTTP/3: Caddy supports it natively. With NLB passthrough, open UDP/443; with an ALB, QUIC is more commonly added via a CloudFront layer in front.

  • Logs: CloudWatch Logs → metric filters and alarms, or ship them to OpenSearch/Loki.

  • Cost: for development, 1 vCPU / 2 GB with two replicas is plenty; for production, size to your QPS and bandwidth.


✅ Summary

Recommended template (ALB + ACM + Caddy on Fargate)

Stable, low-maintenance, and compliance-friendly. It gives you multi-site hosting, load balancing, auto scaling, and structured logs; a handful of commands brings up a modern edge gateway on AWS.

Switch to the NLB + DNS-01 option only when Caddy truly has to manage its own certificates.
