0) Directory structure
aws-caddy-gateway/
├─ caddy/
│  ├─ Caddyfile        # multi-site + reverse proxy + load balancing
│  └─ Dockerfile       # custom image with the Caddyfile baked in
├─ terraform/
│  ├─ main.tf          # VPC, ALB, ACM, Route53, ECS, EFS, IAM
│  ├─ variables.tf
│  ├─ outputs.tf
│  └─ versions.tf
└─ scripts/
   └─ ecr_push.sh      # build and push the Caddy image to ECR
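1) Caddy (multi-site + load balancing)
caddy/Caddyfile
{
    # Caddy sits behind the ALB as the inner reverse proxy; TLS is terminated
    # at the ALB with an ACM certificate.
    # To have Caddy issue certificates itself, see the NLB option at the end.
    log {
        level INFO
    }
}

# The ALB forwards plain HTTP to port 80, so every site address carries the
# http:// prefix. A bare hostname would turn on automatic HTTPS, and Caddy
# would redirect the forwarded requests back to HTTPS in a loop.

# Catch-all for ALB health checks, which arrive with the task IP as Host
:80 {
    respond /health 200
}

# Main site (static): example.com
http://example.com {
    encode gzip zstd
    root * /srv/www
    file_server
    respond /health 200
}

# API (load-balanced across replicas): api.example.com
http://api.example.com {
    encode gzip
    reverse_proxy {
        to app1.internal:5000
        to app2.internal:5000
        lb_policy least_conn
        health_uri /health
        health_interval 5s
        health_timeout 2s
        fail_duration 30s
    }
}

# Admin console: admin.example.com
http://admin.example.com {
    encode gzip
    reverse_proxy admin.internal:7000
    header {
        X-Powered-By "Caddy on ECS"
    }
}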
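Notes
app1.internal, app2.internal, and admin.internal are internal hostnames from ECS Service Discovery (Cloud Map); the Terraform below creates the private DNS namespace named internal, and each backend service registers its own name in it. If you have no backend yet, point reverse_proxy at a placeholder container or a test port for now. To have Caddy serve static files as well, copy the build output into the image at /srv/www (the Dockerfile below is prepared for this).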
caddy/Dockerfile
FROM caddy:2.8
# Optional: bake the static site into the image (/srv/www)
# COPY ./site/ /srv/www/
COPY ./Caddyfile /etc/caddy/Caddyfile
2) Push the Caddy image to ECR
scripts/ecr_push.sh
#!/usr/bin/env bash
# Build the Caddy image and push it to ECR. Run from the repository root.
set -euo pipefail

AWS_REGION="${AWS_REGION:-ap-southeast-1}"
REPO_NAME="${REPO_NAME:-caddy-gateway}"
IMAGE_TAG="${IMAGE_TAG:-v1}"
ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"
REGISTRY="${ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com"
REPO_URL="${REGISTRY}/${REPO_NAME}"

# Create the repository if it does not exist yet
aws ecr describe-repositories --repository-names "${REPO_NAME}" --region "${AWS_REGION}" >/dev/null 2>&1 || \
  aws ecr create-repository --repository-name "${REPO_NAME}" --region "${AWS_REGION}" >/dev/null

# Log Docker in to the registry
aws ecr get-login-password --region "${AWS_REGION}" | docker login --username AWS --password-stdin "${REGISTRY}"

# Build, tag, and push.
# On an ARM build host, add --platform linux/amd64 (Fargate tasks default to x86_64).
docker build -t "${REPO_NAME}:${IMAGE_TAG}" ./caddy
docker tag "${REPO_NAME}:${IMAGE_TAG}" "${REPO_URL}:${IMAGE_TAG}"
docker push "${REPO_URL}:${IMAGE_TAG}"
echo "Pushed: ${REPO_URL}:${IMAGE_TAG}"
Run it:
chmod +x scripts/ecr_push.sh
AWS_REGION=ap-southeast-1 REPO_NAME=caddy-gateway IMAGE_TAG=v1 ./scripts/ecr_push.sh
Note the REPO_URL:IMAGE_TAG printed at the end; Terraform needs it as the caddy_image variable.
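3) Terraform infrastructure
What gets created:
- VPC (2 public subnets)
- ALB (HTTP → HTTPS redirect, TLS terminated with an ACM certificate)
- ACM certificate (validated automatically through Route53 DNS records)
- ECS Fargate cluster + service (running Caddy)
- Service Discovery (Cloud Map) so Caddy can reach the backends by name
- EFS (optional persistence for Caddy, mounted at /data)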
terraform/versions.tf
terraform {
  required_version = ">= 1.6.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.60"
    }
  }
}
terraform/variables.tf
variable "aws_region" {
  type    = string
  default = "ap-southeast-1"
}

variable "domain" {
  type        = string
  description = "Root domain, e.g. example.com"
}

variable "subdomains" {
  type    = list(string)
  default = ["api", "admin"]
}

variable "hosted_zone_id" {
  type        = string
  description = "Route53 hosted zone ID for the domain"
}

variable "caddy_image" {
  type        = string
  description = "ECR image, e.g. 123456789012.dkr.ecr.ap-southeast-1.amazonaws.com/caddy-gateway:v1"
}

variable "caddy_cpu" {
  type    = number
  default = 512
}

variable "caddy_memory" {
  type    = number
  default = 1024
}
terraform/main.tf (trimmed-down, runnable version)
provider "aws" {
  region = var.aws_region
}

# ── VPC (public subnets) ───────────────────────────────
data "aws_availability_zones" "available" {
  state = "available"
}

resource "aws_vpc" "main" {
  cidr_block           = "10.20.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true
  tags                 = { Name = "caddy-vpc" }
}

resource "aws_internet_gateway" "igw" {
  vpc_id = aws_vpc.main.id
}

resource "aws_subnet" "public_a" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.20.1.0/24"
  availability_zone       = data.aws_availability_zones.available.names[0]
  map_public_ip_on_launch = true
  tags                    = { Name = "public-a" }
}

resource "aws_subnet" "public_b" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.20.2.0/24"
  availability_zone       = data.aws_availability_zones.available.names[1]
  map_public_ip_on_launch = true
  tags                    = { Name = "public-b" }
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.igw.id
  }
}

resource "aws_route_table_association" "a" {
  subnet_id      = aws_subnet.public_a.id
  route_table_id = aws_route_table.public.id
}

resource "aws_route_table_association" "b" {
  subnet_id      = aws_subnet.public_b.id
  route_table_id = aws_route_table.public.id
}
# ── Security groups ────────────────────────────────────
resource "aws_security_group" "alb_sg" {
  name        = "alb-sg"
  description = "ALB ingress"
  vpc_id      = aws_vpc.main.id

  ingress {
    protocol    = "tcp"
    from_port   = 80
    to_port     = 80
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    protocol    = "tcp"
    from_port   = 443
    to_port     = 443
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    protocol    = "-1"
    from_port   = 0
    to_port     = 0
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_security_group" "ecs_sg" {
  name        = "ecs-sg"
  description = "ECS tasks"
  vpc_id      = aws_vpc.main.id

  # HTTP from the ALB only
  ingress {
    protocol        = "tcp"
    from_port       = 80
    to_port         = 80
    security_groups = [aws_security_group.alb_sg.id]
  }

  # NFS between members of this group: the EFS mount targets reuse this
  # security group, so the tasks need port 2049 to reach them
  ingress {
    protocol  = "tcp"
    from_port = 2049
    to_port   = 2049
    self      = true
  }

  egress {
    protocol    = "-1"
    from_port   = 0
    to_port     = 0
    cidr_blocks = ["0.0.0.0/0"]
  }
}
# ── ACM certificate (DNS validation) ───────────────────
resource "aws_acm_certificate" "cert" {
  domain_name               = var.domain
  validation_method         = "DNS"
  subject_alternative_names = [for s in var.subdomains : "${s}.${var.domain}"]
}

# One validation record per name on the certificate
resource "aws_route53_record" "cert_validation" {
  for_each = {
    for dvo in aws_acm_certificate.cert.domain_validation_options :
    dvo.domain_name => {
      name  = dvo.resource_record_name
      type  = dvo.resource_record_type
      value = dvo.resource_record_value
    }
  }

  zone_id = var.hosted_zone_id
  name    = each.value.name
  type    = each.value.type
  ttl     = 60
  records = [each.value.value]
}

# Blocks until ACM reports the certificate as issued
resource "aws_acm_certificate_validation" "cert" {
  certificate_arn         = aws_acm_certificate.cert.arn
  validation_record_fqdns = [for r in aws_route53_record.cert_validation : r.fqdn]
}
# ── ALB + listeners + target group ─────────────────────
resource "aws_lb" "alb" {
  name               = "caddy-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb_sg.id]
  subnets            = [aws_subnet.public_a.id, aws_subnet.public_b.id]
}

resource "aws_lb_target_group" "tg_http" {
  name        = "caddy-http"
  port        = 80
  protocol    = "HTTP"
  vpc_id      = aws_vpc.main.id
  target_type = "ip"

  health_check {
    path                = "/health"
    matcher             = "200"
    interval            = 15
    timeout             = 5
    healthy_threshold   = 2
    unhealthy_threshold = 3
  }
}

# Port 80 only redirects to HTTPS
resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.alb.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type = "redirect"

    redirect {
      port        = "443"
      protocol    = "HTTPS"
      status_code = "HTTP_301"
    }
  }
}

# Port 443 terminates TLS and forwards plain HTTP to Caddy
resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.alb.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn   = aws_acm_certificate_validation.cert.certificate_arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.tg_http.arn
  }
}
# ── ECS (Fargate) + service discovery (Cloud Map) ──────
resource "aws_ecs_cluster" "this" {
  name = "caddy-cluster"
}

# Private DNS namespace: services registered here resolve as <name>.internal
resource "aws_service_discovery_private_dns_namespace" "ns" {
  name = "internal"
  vpc  = aws_vpc.main.id
}

# Task execution role (image pull and log delivery)
data "aws_iam_policy_document" "task_assume" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["ecs-tasks.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "task_exec" {
  name               = "ecsTaskExecutionRole-caddy"
  assume_role_policy = data.aws_iam_policy_document.task_assume.json
}

resource "aws_iam_role_policy_attachment" "exec_attach" {
  role       = aws_iam_role.task_exec.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}
# EFS (optional; mounted at /data below, /config can be added the same way)
resource "aws_efs_file_system" "efs" {
  creation_token  = "caddy-efs"
  throughput_mode = "bursting"
}

resource "aws_efs_mount_target" "a" {
  file_system_id  = aws_efs_file_system.efs.id
  subnet_id       = aws_subnet.public_a.id
  security_groups = [aws_security_group.ecs_sg.id]
}

resource "aws_efs_mount_target" "b" {
  file_system_id  = aws_efs_file_system.efs.id
  subnet_id       = aws_subnet.public_b.id
  security_groups = [aws_security_group.ecs_sg.id]
}
# Log group for the awslogs driver. It has to exist before the tasks start,
# because the execution role policy above cannot create log groups.
# The retention period is an example value.
resource "aws_cloudwatch_log_group" "caddy" {
  name              = "/ecs/caddy"
  retention_in_days = 30
}

# Task definition (Caddy)
resource "aws_ecs_task_definition" "caddy" {
  family                   = "caddy-gateway"
  network_mode             = "awsvpc"
  cpu                      = var.caddy_cpu
  memory                   = var.caddy_memory
  requires_compatibilities = ["FARGATE"]
  execution_role_arn       = aws_iam_role.task_exec.arn

  volume {
    name = "caddy-data"

    efs_volume_configuration {
      file_system_id     = aws_efs_file_system.efs.id
      transit_encryption = "ENABLED"
      root_directory     = "/"
    }
  }

  container_definitions = jsonencode([
    {
      "name" : "caddy",
      "image" : var.caddy_image,
      "essential" : true,
      "portMappings" : [{ "containerPort" : 80, "hostPort" : 80, "protocol" : "tcp" }],
      "logConfiguration" : {
        "logDriver" : "awslogs",
        "options" : {
          "awslogs-group" : aws_cloudwatch_log_group.caddy.name,
          "awslogs-region" : var.aws_region,
          "awslogs-stream-prefix" : "caddy"
        }
      },
      "mountPoints" : [
        { "sourceVolume" : "caddy-data", "containerPath" : "/data", "readOnly" : false }
      ]
    }
  ])
}
# ECS service (attached to the ALB)
resource "aws_ecs_service" "caddy" {
  name            = "caddy-svc"
  cluster         = aws_ecs_cluster.this.id
  task_definition = aws_ecs_task_definition.caddy.arn
  desired_count   = 2
  launch_type     = "FARGATE"

  network_configuration {
    subnets         = [aws_subnet.public_a.id, aws_subnet.public_b.id]
    security_groups = [aws_security_group.ecs_sg.id]
    # Public IP so tasks in the public subnets can pull from ECR without a NAT gateway
    assign_public_ip = true
  }

  service_registries {
    registry_arn = aws_service_discovery_service.caddy.arn
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.tg_http.arn
    container_name   = "caddy"
    container_port   = 80
  }

  # Wait for the listener and the EFS mount targets so the first tasks can start cleanly
  depends_on = [
    aws_lb_listener.https,
    aws_efs_mount_target.a,
    aws_efs_mount_target.b,
  ]
}
# Service discovery entry for Caddy itself (optional for the gateway)
resource "aws_service_discovery_service" "caddy" {
  name = "caddy"

  dns_config {
    namespace_id   = aws_service_discovery_private_dns_namespace.ns.id
    routing_policy = "MULTIVALUE"

    dns_records {
      ttl  = 10
      type = "A"
    }
  }

  health_check_custom_config {
    failure_threshold = 1
  }
}
# Alias records pointing the root domain and the subdomains at the ALB
resource "aws_route53_record" "root_a" {
  zone_id = var.hosted_zone_id
  name    = var.domain
  type    = "A"

  alias {
    name                   = aws_lb.alb.dns_name
    zone_id                = aws_lb.alb.zone_id
    evaluate_target_health = true
  }
}

# The AAAA alias only returns addresses once the ALB is dual-stack;
# the ALB above uses the default IPv4-only address type.
resource "aws_route53_record" "root_aaaa" {
  zone_id = var.hosted_zone_id
  name    = var.domain
  type    = "AAAA"

  alias {
    name                   = aws_lb.alb.dns_name
    zone_id                = aws_lb.alb.zone_id
    evaluate_target_health = true
  }
}

resource "aws_route53_record" "subs" {
  for_each = toset(var.subdomains)

  zone_id = var.hosted_zone_id
  name    = "${each.value}.${var.domain}"
  type    = "A"

  alias {
    name                   = aws_lb.alb.dns_name
    zone_id                = aws_lb.alb.zone_id
    evaluate_target_health = true
  }
}
terraform/outputs.tf
output "alb_dns" { value = aws_lb.alb.dns_name }
output "https_urls" { value = concat([format("https://%s", var.domain)], [for s in var.subdomains : format("https://%s.%s", s, var.domain)]) }
output "log_group" { value = "/ecs/caddy" }
4) Bring it all up: deployment steps
- Push the image to ECR
  Run scripts/ecr_push.sh from above first. It gives you a caddy_image value of the form:
  123456789012.dkr.ecr.ap-southeast-1.amazonaws.com/caddy-gateway:v1
- Set the Terraform variables (a terraform.tfvars file works)
  aws_region     = "ap-southeast-1"
  domain         = "example.com"
  subdomains     = ["api", "admin"]
  hosted_zone_id = "Z0123456ABCDEFG" # your Route53 hosted zone ID
  caddy_image    = "123456789012.dkr.ecr.ap-southeast-1.amazonaws.com/caddy-gateway:v1"
- Deploy
  cd terraform
  terraform init
  terraform apply -auto-approve
- Wait for the outputs
  - https_urls: open https://example.com, https://api.example.com, and https://admin.example.com directly.
  - On the first run, ACM needs a few minutes to complete DNS validation and issue the certificate. The request path is ALB → Caddy (HTTP, port 80) → upstream.
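You now have:
- Multiple sites (root domain + subdomains)
- TLS terminated at the ALB (ACM renews the certificate automatically)
- Caddy reverse proxying with inner load balancing (least_conn + health checks)
- Two Fargate tasks spread across two AZs for availability
- The CloudWatch log group /ecs/caddy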
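5) How do I connect my backend services?
- Create a separate ECS service (Fargate) for the API and for the admin console, and enable Service Discovery (Cloud Map) on each.
- If the services register as app1.internal, app2.internal, and admin.internal, the Caddyfile already proxies to those internal names.
- Alternatively, put the backends behind a private ALB/NLB and have Caddy proxy to its private DNS name.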
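
To make the first two points concrete, here is a minimal sketch of one backend registered as app1.internal. It is not part of the original template: the app1_image variable, the backend_sg security group, the task size, and port 5000 (taken from the Caddyfile upstream) are assumptions to adapt.

```hcl
# Hypothetical backend "app1"; every name below is illustrative.
variable "app1_image" {
  type        = string
  description = "Container image for the app1 backend (assumed to listen on 5000)"
}

# Only the Caddy tasks may reach the backend port
resource "aws_security_group" "backend_sg" {
  name        = "backend-sg"
  description = "Backend tasks, reachable from the Caddy tasks only"
  vpc_id      = aws_vpc.main.id

  ingress {
    protocol        = "tcp"
    from_port       = 5000
    to_port         = 5000
    security_groups = [aws_security_group.ecs_sg.id]
  }

  egress {
    protocol    = "-1"
    from_port   = 0
    to_port     = 0
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Registers the service as app1.internal in the existing namespace
resource "aws_service_discovery_service" "app1" {
  name = "app1"

  dns_config {
    namespace_id   = aws_service_discovery_private_dns_namespace.ns.id
    routing_policy = "MULTIVALUE"

    dns_records {
      ttl  = 10
      type = "A"
    }
  }
}

resource "aws_ecs_task_definition" "app1" {
  family                   = "app1"
  network_mode             = "awsvpc"
  cpu                      = 256
  memory                   = 512
  requires_compatibilities = ["FARGATE"]
  execution_role_arn       = aws_iam_role.task_exec.arn

  container_definitions = jsonencode([
    {
      name         = "app1"
      image        = var.app1_image
      essential    = true
      portMappings = [{ containerPort = 5000, protocol = "tcp" }]
    }
  ])
}

resource "aws_ecs_service" "app1" {
  name            = "app1-svc"
  cluster         = aws_ecs_cluster.this.id
  task_definition = aws_ecs_task_definition.app1.arn
  desired_count   = 1
  launch_type     = "FARGATE"

  network_configuration {
    subnets         = [aws_subnet.public_a.id, aws_subnet.public_b.id]
    security_groups = [aws_security_group.backend_sg.id]
    # Needed to pull the image from the public subnets without a NAT gateway
    assign_public_ip = true
  }

  service_registries {
    registry_arn = aws_service_discovery_service.app1.arn
  }
}
```

Repeating the same three resources under the names app2 and admin (with port 7000 for admin) would cover the remaining upstreams in the Caddyfile. Because the records are multivalue A records, one service with several tasks could also stand in for app1 and app2 together.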
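6) Frequently asked questions (FAQ)
- I want Caddy to manage TLS and certificates itself
  See "Option B (NLB + Caddy TLS)" below. In enterprise settings the current template (ALB + ACM) is usually the better choice: certificates are visible in the ACM console and easier to audit for compliance.
- Where do static sites go?
  Two options: 1) bake them into the Caddy image under /srv/www; 2) move them to S3 + CloudFront and let Caddy handle only the API reverse proxying.
- How do I roll out gradually or scale out?
  Change the ECS service's desired_count, or attach an Application Auto Scaling policy. ECS replaces tasks with a rolling deployment, and the ALB drains connections from old tasks before they stop.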
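
As a sketch of the auto scaling route, the following target-tracking policy keeps the Caddy service near a CPU target. The capacity bounds, the 60% target, and the cooldowns are placeholder values, not recommendations from the template.

```hcl
# Register the Caddy service as a scalable target (bounds are assumptions)
resource "aws_appautoscaling_target" "caddy" {
  service_namespace  = "ecs"
  scalable_dimension = "ecs:service:DesiredCount"
  resource_id        = "service/${aws_ecs_cluster.this.name}/${aws_ecs_service.caddy.name}"
  min_capacity       = 2
  max_capacity       = 6
}

# Add or remove tasks to hold average CPU around the target value
resource "aws_appautoscaling_policy" "caddy_cpu" {
  name               = "caddy-cpu-target"
  policy_type        = "TargetTrackingScaling"
  service_namespace  = aws_appautoscaling_target.caddy.service_namespace
  scalable_dimension = aws_appautoscaling_target.caddy.scalable_dimension
  resource_id        = aws_appautoscaling_target.caddy.resource_id

  target_tracking_scaling_policy_configuration {
    target_value       = 60
    scale_in_cooldown  = 120
    scale_out_cooldown = 60

    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageCPUUtilization"
    }
  }
}
```

Once a policy like this owns the task count, it is worth adding lifecycle { ignore_changes = [desired_count] } to aws_ecs_service.caddy, so a later terraform apply does not reset the count to 2.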
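7) Optional: Option B (NLB pass-through + Caddy-managed certificate issuance and renewal)
If Caddy has to manage the certificates itself (for example, wildcard certificates through acme_dns route53), change the setup as follows:
- Replace the ALB with an NLB that forwards TCP 80/443 straight to Caddy.
- Add this to the top of the Caddyfile, and use plain hostnames as site addresses (example.com, not http://example.com) so that automatic HTTPS applies:
  {
      acme_dns route53 {
          access_key_id <YOUR_KEY>
          secret_access_key <YOUR_SECRET>
          region ap-southeast-1
      }
      email admin@example.com
  }
  The stock caddy:2.8 image does not include DNS provider modules, so this needs a custom build that adds the Route53 DNS module (for example with xcaddy).
- Route53 still points at the load balancer, now the NLB. Caddy obtains the wildcard certificate through the DNS-01 challenge, and TLS terminates at Caddy.
Note: an NLB has no layer-7 routing and no WAF, so observability and request rules have to live in Caddy. The HTTP → HTTPS redirect is also done by Caddy.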
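
The load balancer side of Option B could look roughly like this. It is a sketch only; the resource names are made up, and it replaces aws_lb.alb, its listeners, and tg_http rather than sitting next to them.

```hcl
# Network Load Balancer in the same public subnets (names are illustrative)
resource "aws_lb" "nlb" {
  name               = "caddy-nlb"
  internal           = false
  load_balancer_type = "network"
  subnets            = [aws_subnet.public_a.id, aws_subnet.public_b.id]
}

# One TCP target group per port that Caddy serves
resource "aws_lb_target_group" "tcp" {
  for_each = { http = 80, https = 443 }

  name        = "caddy-tcp-${each.key}"
  port        = each.value
  protocol    = "TCP"
  vpc_id      = aws_vpc.main.id
  target_type = "ip"
}

# Pass TCP through unchanged; TLS is terminated by Caddy
resource "aws_lb_listener" "tcp" {
  for_each = aws_lb_target_group.tcp

  load_balancer_arn = aws_lb.nlb.arn
  port              = each.value.port
  protocol          = "TCP"

  default_action {
    type             = "forward"
    target_group_arn = each.value.arn
  }
}
```

The rest of the stack has to follow: the task definition needs a port mapping for 443, aws_ecs_service.caddy needs one load_balancer block per target group, the task security group must admit client traffic on 80 and 443 (this NLB has no security group of its own), and the Route53 alias records should reference aws_lb.nlb.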
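8) Security and tuning recommendations
- WAF: with the ALB in front, AWS WAF can be attached to it directly.
- Least privilege: keep only the policies the ECS task execution role really needs. If you use DNS-01, grant the Route53 write permissions on a separate task role (task_role_arn), because the Caddy process runs with the task role, not the execution role.
- HTTP/3: Caddy supports it natively. With NLB pass-through, also open UDP 443. An ALB does not currently serve HTTP/3, so QUIC is more commonly added by putting CloudFront in front.
- Logs: in CloudWatch Logs, set up metric filters and alarms, or ship the logs to OpenSearch or Loki.
- Cost: for development, 1 vCPU / 2 GB with two tasks is plenty; for production, size by QPS and bandwidth.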
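
For the WAF point, a minimal sketch of a regional web ACL attached to the ALB follows. The ACL name, the metric names, and the choice of the AWS managed common rule set are assumptions; pick the rule groups that match your traffic.

```hcl
# Regional web ACL that allows by default and applies one managed rule group
resource "aws_wafv2_web_acl" "alb" {
  name  = "caddy-alb-acl"
  scope = "REGIONAL"

  default_action {
    allow {}
  }

  rule {
    name     = "aws-common"
    priority = 1

    # Keep the actions defined inside the managed rule group
    override_action {
      none {}
    }

    statement {
      managed_rule_group_statement {
        name        = "AWSManagedRulesCommonRuleSet"
        vendor_name = "AWS"
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "aws-common"
      sampled_requests_enabled   = true
    }
  }

  visibility_config {
    cloudwatch_metrics_enabled = true
    metric_name                = "caddy-alb-acl"
    sampled_requests_enabled   = true
  }
}

# Attach the ACL to the ALB from main.tf
resource "aws_wafv2_web_acl_association" "alb" {
  resource_arn = aws_lb.alb.arn
  web_acl_arn  = aws_wafv2_web_acl.alb.arn
}
```

Switching override_action to count {} first lets you watch what the rule group would block before it starts rejecting requests.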
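✅ Summary
Recommended template (ALB + ACM + Caddy on Fargate):
Stable, low-maintenance, and compliance-friendly, with multi-site hosting, load balancing, auto scaling, and structured logs. A few commands bring up a modern ingress gateway on AWS.
Switch to the NLB + DNS-01 option only when Caddy really has to manage the certificates itself.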
