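Canary (Gray) Release: A Step-by-Step Operations Guide (Hands-On Edition)
The following is a detailed procedure based on a real production environment. It covers the full path from code commit to full rollout, using Kubernetes + Istio as the example:
I. Preparation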
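- Environment isolation
bash
# Create a dedicated namespace
kubectl create ns gray-release
kubectl label ns gray-release env=canary
# Enable automatic sidecar injection so Istio routing and metrics apply to Pods in this namespace
kubectl label ns gray-release istio-injection=enabled
# Install Istio (if not already installed)
# Note: the demo profile is meant for evaluation; choose a production-appropriate profile for real clusters
istioctl install --set profile=demo -y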
- Version management conventions
bash
# Docker image tag rules
# Stable: v1.2.3-stable
# Canary: v2.0.0-canary-<commit_hash>
# Example build commands (build and push must use the same tag)
TAG="v2.0.0-canary-$(git rev-parse --short HEAD)"
docker build -t "registry.example.com/app:${TAG}" .
docker push "registry.example.com/app:${TAG}"
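The tag rule above can be wrapped in a small helper so that every script derives the canary tag the same way. This is a minimal sketch: the function name `canary_tag` and its validation rules are illustrative assumptions, not part of any existing tooling.

```shell
# canary_tag VERSION COMMIT_HASH
# Prints a tag of the form <version>-canary-<commit_hash>, e.g. v2.0.0-canary-abc123.
# Hypothetical helper for illustration only.
canary_tag() {
  version="$1"
  commit="$2"
  # Loose check: the version should look like v<major>.<minor>.<patch>.
  case "$version" in
    v[0-9]*.[0-9]*.[0-9]*) ;;
    *) echo "invalid version: $version" >&2; return 1 ;;
  esac
  # The commit hash must be non-empty lowercase hexadecimal.
  case "$commit" in
    ''|*[!0-9a-f]*) echo "invalid commit hash: $commit" >&2; return 1 ;;
  esac
  printf '%s-canary-%s\n' "$version" "$commit"
}

# Example usage inside a git checkout (commented out so the sketch has no side effects):
# TAG="$(canary_tag v2.0.0 "$(git rev-parse --short HEAD)")"
# docker build -t "registry.example.com/app:${TAG}" .
```

Rejecting malformed input early keeps a typo from producing an image tag that no deployment manifest refers to.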
II. Core Steps
1. Deploy the old and new versions
yaml
# stable-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-v1
  namespace: gray-release
spec:
  replicas: 10
  selector:
    matchLabels:
      app: myapp
      version: v1
  template:
    metadata:
      labels:
        app: myapp
        version: v1
    spec:
      containers:
        - name: app
          image: registry.example.com/app:v1.2.3-stable
---
# canary-deployment.yaml (shares the Service with the stable version)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-v2
  namespace: gray-release
spec:
  replicas: 2 # start with a small number of replicas
  selector:
    matchLabels:
      app: myapp
      version: v2
  template:
    metadata:
      labels:
        app: myapp
        version: v2
    spec:
      containers:
        - name: app
          image: registry.example.com/app:v2.0.0-canary-abc123
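Both Deployments are meant to sit behind one shared Service, but the guide does not show that Service. The sketch below renders one possible manifest. The Service name `app` is taken from the DestinationRule host used in the next step; the ports (80 and 8080) and the helper name `render_service` are assumptions, since the source does not state which port the container listens on.

```shell
# render_service NAME NAMESPACE [PORT] [TARGET_PORT]
# Prints a Service manifest that selects every Pod labeled app=myapp,
# i.e. the Pods of both app-v1 and app-v2. Hypothetical helper; ports are assumed.
render_service() {
  name="$1"
  namespace="$2"
  port="${3:-80}"
  target_port="${4:-8080}"
  cat <<EOF
apiVersion: v1
kind: Service
metadata:
  name: ${name}
  namespace: ${namespace}
spec:
  selector:
    app: myapp        # no version label, so both versions are selected
  ports:
    - name: http
      port: ${port}
      targetPort: ${target_port}
EOF
}

# Example usage (commented out; requires access to a cluster):
# render_service app gray-release | kubectl apply -f -
```

Leaving the `version` label out of the selector is what lets Istio subsets, rather than the Service itself, decide which version receives a request.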
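2. Configure Istio traffic rules
bash
# Create the VirtualService and DestinationRule
cat <<EOF | kubectl apply -f -
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: myapp
  namespace: gray-release
spec:
  hosts:
    - "app.example.com"
  # For traffic entering through an ingress gateway, also bind this rule to that Gateway via the gateways field
  http:
    - route:
        - destination:
            host: app      # the shared Service; must match the DestinationRule host
            subset: v1
          weight: 90       # 90% of traffic goes to the old version
        - destination:
            host: app
            subset: v2
          weight: 10       # 10% of traffic goes to the new version
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: myapp
  namespace: gray-release
spec:
  host: app
  subsets:
    - name: v1
      labels:
        version: v1
    - name: v2
      labels:
        version: v2
EOF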
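3. Progressive ramp-up (CLI)
bash
# Step 1: observe monitoring data for 10 minutes
watch -n 5 'kubectl get pods -n gray-release -l app=myapp'
# Step 2: raise canary traffic to 30%
istioctl analyze -n gray-release # validate the configuration
# Note: a merge patch replaces the whole http list, including any match rules in it
kubectl patch vs myapp -n gray-release --type=merge \
-p '{"spec":{"http":[{"route":[{"destination":{"host":"app","subset":"v1"},"weight":70},{"destination":{"host":"app","subset":"v2"},"weight":30}]}]}}'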
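Hand-editing the patch JSON at every step is error-prone, so the payload can be generated from the canary weight alone. The following is a minimal sketch under the assumption that both versions sit behind one Service and are distinguished by the `v1` and `v2` subsets of the DestinationRule; the function name `build_route_patch` is hypothetical.

```shell
# build_route_patch HOST CANARY_WEIGHT
# Prints a merge-patch body that sends CANARY_WEIGHT percent of traffic to
# subset v2 and the remainder to subset v1. Hypothetical helper for illustration.
build_route_patch() {
  host="$1"
  canary="$2"
  # Accept only plain integers without leading zeros and at most three digits.
  case "$canary" in
    ''|*[!0-9]*|0?*|????*)
      echo "canary weight must be an integer between 0 and 100" >&2
      return 1
      ;;
  esac
  if [ "$canary" -gt 100 ]; then
    echo "canary weight must be an integer between 0 and 100" >&2
    return 1
  fi
  stable=$((100 - canary))
  printf '{"spec":{"http":[{"route":[{"destination":{"host":"%s","subset":"v1"},"weight":%d},{"destination":{"host":"%s","subset":"v2"},"weight":%d}]}]}}\n' \
    "$host" "$stable" "$host" "$canary"
}

# Example ramp (commented out; requires access to a cluster):
# for w in 10 30 50 100; do
#   kubectl patch vs myapp -n gray-release --type=merge -p "$(build_route_patch app "$w")"
#   sleep 600   # observe for 10 minutes before the next step
# done
```

Because the two weights are derived from a single number, they always sum to 100. The generated patch still replaces the entire `http` list, so any header-match routes would have to be re-applied afterwards.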
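4. Automated verification (example Prometheus rule)
yaml
# prometheus-alert.yaml
groups:
  - name: canary
    rules:
      - alert: CanaryErrorRateHigh
        expr: |
          sum(rate(istio_requests_total{
            destination_workload="app-v2",
            destination_workload_namespace="gray-release",
            response_code=~"5.."
          }[1m]))
          /
          sum(rate(istio_requests_total{
            destination_workload="app-v2",
            destination_workload_namespace="gray-release"
          }[1m])) > 0.01 # fires when the error rate exceeds 1%
        for: 3m
        labels:
          severity: critical
        annotations:
          summary: "Canary error rate high: {{ $value }}"
          # Informational only: Prometheus does not run this command; execute it manually or from automation
          action: "kubectl scale deploy/app-v2 --replicas=0 -n gray-release"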
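The same 1% threshold can also serve as a gate inside a rollout script, so that the next traffic step only runs while the canary is healthy. The sketch below assumes the 5xx and total request counts for the canary have already been obtained (for example from the `istio_requests_total` metric); how they are fetched is left open, and the function name `canary_healthy` is hypothetical.

```shell
# canary_healthy ERRORS TOTAL [MAX_PERCENT]
# Exit status: 0 if the error rate is at most MAX_PERCENT (default 1),
#              1 if it is higher,
#              2 if there was no traffic and nothing can be concluded.
# Inputs are assumed to be non-negative integers. Hypothetical helper.
canary_healthy() {
  errors="$1"
  total="$2"
  max_percent="${3:-1}"
  # Without traffic there is nothing to judge; report that separately.
  if [ "$total" -le 0 ]; then
    return 2
  fi
  # errors/total > max_percent/100 is equivalent to errors*100 > total*max_percent,
  # which avoids floating-point arithmetic in the shell.
  if [ $((errors * 100)) -gt $((total * max_percent)) ]; then
    return 1
  fi
  return 0
}

# Example usage with made-up counts:
# if canary_healthy 3 1000; then echo "proceed to next step"; else echo "hold or roll back"; fi
```

Treating "no traffic" as its own outcome keeps an idle canary from being promoted just because it produced no errors.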
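5. Full rollout or rollback
bash
# Full switch (after confirming the canary is healthy)
kubectl set image deploy/app-v1 app=registry.example.com/app:v2.0.0-stable -n gray-release
kubectl rollout status deploy/app-v1 -n gray-release # wait until the rollout completes
kubectl delete vs myapp -n gray-release # remove the traffic rule
# Emergency rollback (when something goes wrong)
# Shift all traffic back to v1 first, then scale the canary down, so no request is routed to an empty subset
kubectl patch vs myapp -n gray-release --type=merge \
-p '{"spec":{"http":[{"route":[{"destination":{"host":"app","subset":"v1"},"weight":100}]}]}}' && \
kubectl scale deploy/app-v2 --replicas=0 -n gray-release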
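III. Advanced Techniques
1. Header-based targeted canary (for verification in the test environment)
yaml
# Additional VirtualService rules
http:
  - match:
      - headers:
          x-env:
            exact: test
    route:
      - destination:
          host: app
          subset: v2
  - route:
    # default routing rules...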
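2. Database compatibility (MySQL example)
sql
-- The new version must remain compatible with the old schema
ALTER TABLE orders ADD COLUMN new_feature_flag VARCHAR(32) NULL DEFAULT NULL;
go
// Dual-write mode (code example); requires: import "context"
func SaveOrder(ctx context.Context, order Order) error {
	// Old path: always write to the existing store, and stop on failure
	if err := oldDB.Save(order); err != nil {
		return err
	}
	// New path: executed only for canary traffic
	if isGrayRequest(ctx) {
		return newDB.Save(order.WithNewFields())
	}
	return nil
}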
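3. Client-side canary (in-app implementation)
java
// Android example: choose the UI according to the rollout ratio delivered by the server.
// Assumes getGrayPercent() returns a fraction in [0, 1]; persist the decision per user
// so the UI does not flip between launches.
if (ServerConfig.getGrayPercent() > Math.random()) {
    showNewUI();
} else {
    showOldUI();
}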
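IV. Verification checklist
- Traffic verification
bash
# Force a request to the new version (for debugging)
curl -H "x-env: test" https://app.example.com/api/test
- Data consistency check
sql
-- Compare row counts between the old and new versions
SELECT COUNT(*) FROM orders_v1
UNION ALL
SELECT COUNT(*) FROM orders_v2;
- Performance benchmark
bash
# Load-test the old and new versions side by side
# (assumes VirtualService match rules on the x-version header, analogous to the x-env rule)
hey -n 10000 -c 100 -H "x-version: v1" https://app.example.com/api
hey -n 10000 -c 100 -H "x-version: v2" https://app.example.com/api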
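For the traffic check, it can also help to confirm that the observed share of canary responses roughly matches the configured weight. The sketch below only does the arithmetic: it assumes that a sample of requests has already been counted by version (how a response is attributed to a version is left open), and the function name `split_within_tolerance` and the default tolerance of 3 percentage points are illustrative assumptions.

```shell
# split_within_tolerance CANARY_HITS TOTAL_HITS EXPECTED_PERCENT [TOLERANCE_PERCENT]
# Exit status: 0 if the observed canary share is within TOLERANCE_PERCENT
#              (default 3) percentage points of EXPECTED_PERCENT,
#              1 if it is outside that band,
#              2 if TOTAL_HITS is zero.
# Inputs are assumed to be non-negative integers. Hypothetical helper.
split_within_tolerance() {
  canary_hits="$1"
  total_hits="$2"
  expected="$3"
  tolerance="${4:-3}"
  [ "$total_hits" -gt 0 ] || return 2
  # |canary/total*100 - expected| <= tolerance, multiplied through by total
  # so that everything stays in integer arithmetic.
  diff=$((canary_hits * 100 - expected * total_hits))
  [ "$diff" -ge 0 ] || diff=$((0 - diff))
  [ "$diff" -le $((tolerance * total_hits)) ]
}

# Example usage with made-up counts: 95 of 1000 sampled responses came from v2,
# and the VirtualService weight is 10.
# split_within_tolerance 95 1000 10 && echo "split looks right"
```

Weighted routing is probabilistic, so small samples can legitimately deviate; a larger sample or a wider tolerance reduces false alarms.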
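V. Production recommendations
- Release window: avoid traffic peaks (for example, e-commerce sites should avoid major sales events)
- On-call staffing: developers, operations, and QA should all be online during the release
- Rollback drills: run a simulated rollback at least once a month
With the steps above, a release can move safely from canary to full rollout within one hour. In practice, the process has been validated on financial-grade systems, with zero customer complaints during releases.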