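In a Kubernetes cluster, application releases usually start with a plain Deployment rollingUpdate.
For an early-stage system this is simple enough and covers basic zero-downtime upgrades.
But as the system grows, traffic sources diversify, and the business demands more stability, the release process starts to expose a series of problems:
- The traffic ratio between old and new versions cannot be controlled precisely during a release
- When the new version misbehaves, rollback depends on human judgment and is hard to control
- Release status is badly disconnected from real business traffic
Especially in production environments that need canary verification, manual ramp-up, and fast rollback, the native Deployment can no longer support fine-grained release strategies.
To address these problems, this article uses Argo Rollouts, in a real cluster environment, to build a controllable, reversible, and observable release practice for Kubernetes applications, and evolves it step by step into a release model that supports Canary and BlueGreen.
Before getting started, it is worth briefly clarifying what Argo CD and Argo Rollouts are.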
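I. Argo CD
Positioning: a GitOps-based deployment management tool for Kubernetes applications
Core idea: declarative, automated, traceable
Key features:
GitOps workflow
- Kubernetes configuration (YAML / Helm charts) is stored in a Git repository.
- Argo CD syncs the configuration in Git to the cluster and keeps the cluster state consistent with Git.
- No manual kubectl apply is needed; all changes are managed through Git pull requests.
Automatic sync and rollback
- Watches the Git repository for changes and triggers a sync (automatic or manual).
- Any configuration change can be rolled back to a historical version in one step (based on the Git commit).
Visual management
- Provides a Web UI that shows cluster state, sync status, and diffs.
- Supports managing multiple environments (Dev/Staging/Prod).
Multi-cluster support
- Multiple Kubernetes clusters are managed through argocd cluster add.
Key components
- Argo CD CLI: the command-line tool (argocd app list / argocd app sync).
- Application CRD: defines an application (Git repository URL, target cluster, sync policy).
- Sync Wave: controls the order in which resources are synced (for example, ConfigMap before Pod).
Core resource: Application
Application = the desired state in the Git repository + the actual state in the cluster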
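| Field | Level | Description | Production notes |
|---|---|---|---|
| spec.project | spec | Binds the AppProject | Determines permissions, target clusters, namespaces |
| spec.source.repoURL | spec | Git repository URL | Read-only credentials recommended |
| spec.source.path | spec | Manifest path | Often combined with kustomize/helm |
| spec.source.targetRevision | spec | Branch / tag / commit | Production usually pins a tag |
| spec.destination.server | spec | Target cluster | Core of multi-cluster GitOps |
| spec.destination.namespace | spec | Target namespace | Tightly coupled to the Project |
| spec.syncPolicy.automated | spec | Automatic sync | Usually disabled during canary phases |
| spec.syncPolicy.syncOptions | spec | Sync behavior | CreateNamespace=true is common |
| spec.ignoreDifferences | spec | Ignore field drift | Common with Rollouts / HPAs |
| status.sync.status | status | Sync status | Synced / OutOfSync |
| status.health.status | status | Health status | Not the same as business health |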
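To make the fields in the table above concrete, here is a minimal Application sketch. The application name, repository URL, path, and tag are placeholders invented for illustration, and the argocd namespace assumes a default Argo CD installation. The automated sync block is deliberately left out, in line with the note about canary phases.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp                  # hypothetical application name
  namespace: argocd            # assumes Argo CD runs in its default namespace
spec:
  project: default
  source:
    repoURL: https://git.example.com/platform/myapp-manifests.git  # placeholder repository
    path: overlays/prod        # placeholder manifest path
    targetRevision: v1.0.0     # placeholder tag; production usually pins a tag
  destination:
    server: https://kubernetes.default.svc   # the cluster Argo CD itself runs in
    namespace: myweb
  syncPolicy:
    # No "automated" block here, so every sync is triggered manually.
    syncOptions:
      - CreateNamespace=true
  ignoreDifferences:
    # Do not report drift when an HPA changes the Rollout's replica count.
    - group: argoproj.io
      kind: Rollout
      jsonPointers:
        - /spec/replicas
```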
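Core resource: AppProject
AppProject defines application boundaries and access control
| Field | Description | Production value |
|---|---|---|
| spec.destinations | Allowed clusters / namespaces | Prevents unauthorized deployments |
| spec.sourceRepos | Allowed Git repositories | Git security boundary |
| spec.clusterResourceWhitelist | Allowed cluster-scoped resources | Controls CRD usage |
| spec.namespaceResourceWhitelist | Allowed namespaced resources | Rollout / Gateway |
| spec.roles | Project-level RBAC | Multi-team collaboration |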
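As an illustration of how these fields fit together, the following AppProject sketch restricts one team to one repository and one namespace. The project name, repository URL, role name, and group name are all hypothetical; only the field names come from the Argo CD AppProject schema.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: myweb-project          # hypothetical project name
  namespace: argocd            # assumes a default Argo CD installation
spec:
  sourceRepos:
    - https://git.example.com/platform/myapp-manifests.git  # placeholder repository
  destinations:
    - server: https://kubernetes.default.svc
      namespace: myweb
  clusterResourceWhitelist:
    - group: ""
      kind: Namespace
  namespaceResourceWhitelist:
    - group: argoproj.io
      kind: Rollout
    - group: ""
      kind: Service
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
  roles:
    - name: release-manager    # hypothetical role
      description: May sync applications in this project
      policies:
        - p, proj:myweb-project:release-manager, applications, sync, myweb-project/*, allow
      groups:
        - myweb-release-team   # hypothetical SSO group
```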
I previously wrote a short hands-on introduction to Argo CD; its title is listed in the deployment section below.
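II. Argo Rollouts
Positioning: a progressive delivery controller from the Argo project that works alongside Argo CD and implements advanced release strategies such as canary and blue-green
Core idea: safe, controllable, observable releases
Key features:
Progressive release strategies
- Canary: shifts traffic from the old version to the new version step by step (for example 5% → 25% → 100%).
- Blue-green: the new version is first deployed to a standby environment, and traffic is switched after verification.
Integration with Argo CD
- Argo Rollouts replaces the standard Deployment with the Rollout CRD.
- Argo CD syncs Rollout manifests to the cluster like any other resource; no Argo CD-side configuration is needed once the Rollouts controller and CRDs are installed.
Key capabilities
- Manual approval: the rollout pauses at each stage and waits for human confirmation.
- Automatic rollback: if analysis metrics are abnormal (for example error rate > 1%), the rollout is aborted and traffic returns to the stable version.
- Traffic splitting: the setWeight step controls the traffic share (for example setWeight: 50 means 50% of traffic).
Core resource: Rollout
Rollout = release state machine + traffic control abstraction + enhanced Deployment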
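| Field | Level | Description | Production notes |
|---|---|---|---|
| spec.replicas | spec | Desired replica count | Works together with HPA |
| spec.selector | spec | Pod selector | Immutable field |
| spec.template | spec | Pod template | Equivalent to Deployment |
| spec.revisionHistoryLimit | spec | Retained history revisions | Guarantees rollback capability |
| status.currentPodHash | status | Current version identifier | Key for debugging |
| status.pauseConditions | status | Pause reason | Troubleshooting stuck rollouts |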
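Core resource: BlueGreen
| Field | Description | Production semantics |
|---|---|---|
| strategy.blueGreen.activeService | Service currently exposed to users | Real production traffic |
| strategy.blueGreen.previewService | Preview Service | Pre-release verification |
| strategy.blueGreen.autoPromotionEnabled | Whether to switch automatically | Usually false in production |
| strategy.blueGreen.scaleDownDelaySeconds | Delay before scaling down the old version | Rollback window |
| strategy.blueGreen.previewReplicaCount | Preview replica count | Resource control |
| strategy.blueGreen.prePromotionAnalysis | Analysis before the switch | Blocks a bad release before it goes live |
| strategy.blueGreen.postPromotionAnalysis | Analysis after the switch | Safeguard after traffic has moved |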
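The fields in the table above combine as in the following strategy fragment. It is a sketch rather than a tested configuration: the template name success-rate and the argument name service-name are hypothetical and would have to match an AnalysisTemplate that exists in the same namespace.

```yaml
# Fragment of a Rollout spec; only the strategy section is shown.
spec:
  strategy:
    blueGreen:
      activeService: myapp-svc
      previewService: myapp-preview-svc
      autoPromotionEnabled: false      # wait for a manual promote
      previewReplicaCount: 1           # run a single preview Pod before promotion
      scaleDownDelaySeconds: 300       # keep the old ReplicaSet for 5 minutes after the switch
      prePromotionAnalysis:            # must pass before traffic is switched
        templates:
          - templateName: success-rate # hypothetical AnalysisTemplate name
        args:
          - name: service-name         # hypothetical argument name
            value: myapp-preview-svc
      postPromotionAnalysis:           # runs after the switch; a failure aborts the rollout
        templates:
          - templateName: success-rate
        args:
          - name: service-name
            value: myapp-svc
```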
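Core resource: Canary
| Field | Description | Production semantics |
|---|---|---|
| strategy.canary.stableService | Service for the stable version | Baseline traffic |
| strategy.canary.canaryService | Service for the canary | New-version traffic |
| strategy.canary.steps | Canary steps | Controls release pace |
| strategy.canary.steps.setWeight | Traffic percentage | Precise only with a gateway or mesh |
| strategy.canary.steps.pause | Manual / timed pause | Verification window |
| strategy.canary.analysis | Analysis rules | Automatic abort |
| strategy.canary.trafficRouting | Traffic routing | Istio / Gateway API |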
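A canary strategy that uses most of the fields above might look like the fragment below. The Service names match those used later in this article; the template name success-rate and its service-name argument are hypothetical, and the step weights are arbitrary illustrative values.

```yaml
# Fragment of a Rollout spec; only the strategy section is shown.
spec:
  strategy:
    canary:
      stableService: myapp-stable
      canaryService: myapp-canary
      analysis:                        # background analysis; a failure aborts the rollout
        templates:
          - templateName: success-rate # hypothetical AnalysisTemplate name
        startingStep: 1                # start analysis once the first setWeight step is done
        args:
          - name: service-name         # hypothetical argument name
            value: myapp-canary
      steps:
        - setWeight: 20
        - pause: {}                    # wait indefinitely for a manual promote
        - setWeight: 50
        - pause: {duration: 5m}        # timed pause, continues automatically
        - setWeight: 100
```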
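Core resource: AnalysisTemplate
| Field | Description | Production value |
|---|---|---|
| spec.metrics | Metric definitions | Criteria for success or failure |
| metrics.interval | Sampling interval | Timeliness |
| metrics.failureLimit | Failure threshold | Automatic rollback |
| metrics.successCondition | Success expression | Release decision logic |
| provider.prometheus | Metric source | Most common |
| provider.web | HTTP check | Friendly to gateway scenarios |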
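For reference, a Prometheus-backed AnalysisTemplate built from these fields could be sketched as follows. The Prometheus address, the metric name http_requests_total, its labels, and the 0.99 threshold are assumptions about the monitoring setup, not something this article's environment provides.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate             # hypothetical template name
  namespace: myweb
spec:
  args:
    - name: service-name         # supplied by the Rollout that references this template
  metrics:
    - name: success-rate
      interval: 1m               # take one measurement per minute
      count: 5                   # stop after five measurements
      failureLimit: 2            # more than two failed measurements fails the analysis
      successCondition: result[0] >= 0.99
      provider:
        prometheus:
          address: http://prometheus.monitoring.svc:9090   # assumed Prometheus endpoint
          query: |
            sum(rate(http_requests_total{service="{{args.service-name}}",code!~"5.."}[2m]))
            /
            sum(rate(http_requests_total{service="{{args.service-name}}"}[2m]))
```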
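III. How Argo CD and Argo Rollouts relate
Argo CD handles "configuration sync" and Argo Rollouts handles "safe release"; together they form a complete GitOps + progressive delivery pipeline.
| Aspect | Argo CD | Argo Rollouts |
|---|---|---|
| Core goal | GitOps deployment management | Progressive release strategies |
| Dependency | Standalone tool | Can work with Argo CD (integrated through CRDs) |
| Key resource | Application CRD | Rollout CRD |
| Problem solved | "How does configuration get synced to the cluster" | "How to release a new version safely" |
| Installation | Installed separately | Installed separately |
Argo CD solves the "delivery consistency" problem, while Argo Rollouts solves the "release risk" problem.
The former makes deployments repeatable; the latter makes releases controllable.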
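IV. Deployment in practice
1. Getting to know Argo CD
Official site
Argo CD - Declarative GitOps CD for Kubernetes
For Argo CD basics, see my earlier article
ArgoCD 与 GitOps:K8S 原生持续部署的实操指南 (Argo CD and GitOps: a hands-on guide to Kubernetes-native continuous deployment)

2. Deploying Argo Rollouts
Argo Rollouts - Kubernetes Progressive Delivery Controller
bash
# Create the namespace
kubectl create namespace argo-rollouts
# Deploy the core controller manifests into the argo-rollouts namespace
kubectl apply -n argo-rollouts -f https://github.com/argoproj/argo-rollouts/releases/latest/download/install.yaml
# Deploy the Argo Rollouts Web UI (dashboard) into the argo-rollouts namespace
# Note: this manifest is pinned to v1.8.3, while the controller above tracks the latest release
kubectl apply -n argo-rollouts -f https://github.com/argoproj/argo-rollouts/releases/download/v1.8.3/dashboard-install.yaml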
bash
[root@k8s-master ~/argo-cd-rollouts]# kubectl get po,svc -n argo-rollouts
NAME READY STATUS RESTARTS AGE
pod/argo-rollouts-cc488fffb-nb9ch 1/1 Running 1 (12m ago) 4h26m
pod/argo-rollouts-dashboard-75b5d9fcd9-cfkp6 1/1 Running 1 (12m ago) 3h57m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/argo-rollouts-dashboard NodePort 10.110.77.45 <none> 3100:31074/TCP 3h57m
service/argo-rollouts-metrics ClusterIP 10.106.11.224 <none> 8090/TCP 4h26m
Open the Argo Rollouts Web UI

Seeing "Loading" here is expected, because no Rollout resources have been created yet.
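V. Blue-green release with Argo Rollouts
Environment preparation
Two Services selecting Pods labeled app: myapp are defined in the myweb namespace: myapp-svc, a LoadBalancer Service that serves as the production traffic entry, and myapp-preview-svc, a ClusterIP Service that serves as the preview entry. Both forward port 80 to container port 80 and are used by the Argo Rollouts blue-green release. Although the two selectors look identical, the controller adds a rollouts-pod-template-hash selector to each Service at runtime, so they can point at different ReplicaSets.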
bash
[root@k8s-master ~/argorollouts]# cat myapp-services.yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp-svc
  namespace: myweb
spec:
  type: LoadBalancer
  selector:
    app: myapp
  ports:
    - port: 80
      targetPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: myapp-preview-svc
  namespace: myweb
spec:
  type: ClusterIP
  selector:
    app: myapp
  ports:
    - port: 80
      targetPort: 80
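A Rollout named myapp with 1 replica is defined in the myweb namespace. It uses a blue-green strategy with auto-promotion disabled, references the production Service myapp-svc and the preview Service myapp-preview-svc, and runs the image myweb:v1 exposing port 80.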
bash
[root@k8s-master ~/argorollouts]# cat myapp-rollout.yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: myapp
  namespace: myweb
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myweb:v1
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 80
  strategy:
    blueGreen:
      activeService: myapp-svc
      previewService: myapp-preview-svc
      autoPromotionEnabled: false
Apply both manifests with kubectl apply -f, then check the result:
bash
[root@k8s-master ~/argorollouts]# kubectl get po,svc -n myweb
NAME READY STATUS RESTARTS AGE
pod/myapp-759958d76c-2vs8h 1/1 Running 0 95s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/myapp-preview-svc ClusterIP 10.101.184.143 <none> 80/TCP 101s
service/myapp-svc LoadBalancer 10.96.253.118 10.0.0.151 80:31804/TCP 101s
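An access test works as expected.

Now go back to the Rollouts UI and refresh.
Make sure the correct namespace is selected in the top-right corner.

The current stable version is myweb:v1.

Click Edit in the top-right corner and change the image to myweb:v2.

After clicking Save, a Preview version appears.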

bash
[root@k8s-master ~/argorollouts]# kubectl get po,svc -n myweb
NAME READY STATUS RESTARTS AGE
pod/myapp-759958d76c-2vs8h 1/1 Running 0 5m41s
pod/myapp-bf99b76d8-2wpj6 1/1 Running 0 79s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/myapp-preview-svc ClusterIP 10.101.184.143 <none> 80/TCP 5m47s
service/myapp-svc LoadBalancer 10.96.253.118 10.0.0.151 80:31804/TCP 5m47s
Inspect the current Rollout and its revisions:
bash
[root@k8s-master ~/argorollouts]# kubectl argo rollouts get rollout myapp -n myweb
Name: myapp
Namespace: myweb
Status: ॥ Paused
Message: BlueGreenPause
Strategy: BlueGreen
Images: myweb:v1 (stable, active)
myweb:v2 (preview)
Replicas:
Desired: 1
Current: 2
Updated: 1
Ready: 1
Available: 1
NAME KIND STATUS AGE INFO
⟳ myapp Rollout ॥ Paused 6m32s
├──# revision:2
│ └──⧉ myapp-bf99b76d8 ReplicaSet ✔ Healthy 2m10s preview
│ └──□ myapp-bf99b76d8-2wpj6 Pod ✔ Running 2m10s ready:1/1
└──# revision:1
└──⧉ myapp-759958d76c ReplicaSet ✔ Healthy 6m32s stable,active
└──□ myapp-759958d76c-2vs8h Pod ✔ Running 6m32s ready:1/1
Now switch the traffic.
Switching versions from the command line
bash
[root@k8s-master ~/argorollouts]# curl 10.0.0.151
vamos | This version is v1 | v111111
[root@k8s-master ~/argorollouts]# kubectl argo rollouts promote myapp -n myweb
rollout 'myapp' promoted
bash
[root@k8s-master ~/argorollouts]# kubectl argo rollouts get rollout myapp -n myweb
Name: myapp
Namespace: myweb
Status: ✔ Healthy
Strategy: BlueGreen
Images: myweb:v1
myweb:v2 (stable, active)
Replicas:
Desired: 1
Current: 2
Updated: 1
Ready: 1
Available: 1
NAME KIND STATUS AGE INFO
⟳ myapp Rollout ✔ Healthy 8m27s
├──# revision:2
│ └──⧉ myapp-bf99b76d8 ReplicaSet ✔ Healthy 4m5s stable,active
│ └──□ myapp-bf99b76d8-2wpj6 Pod ✔ Running 4m5s ready:1/1
└──# revision:1
└──⧉ myapp-759958d76c ReplicaSet ✔ Healthy 8m27s delay:13s
└──□ myapp-759958d76c-2vs8h Pod ✔ Running 8m27s ready:1/1
bash
[root@k8s-master ~/argorollouts]# kubectl get po,svc -n myweb
NAME READY STATUS RESTARTS AGE
pod/myapp-bf99b76d8-2wpj6 1/1 Running 0 4m28s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/myapp-preview-svc ClusterIP 10.101.184.143 <none> 80/TCP 8m56s
service/myapp-svc LoadBalancer 10.96.253.118 10.0.0.151 80:31804/TCP 8m56s
[root@k8s-master ~/argorollouts]# curl 10.0.0.151
vamos | This version is v2 | v222222

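After promote, the v1 Pod is deleted once the scale-down delay expires (scaleDownDelaySeconds, 30 seconds by default, which is why the output above shows delay:13s). This is the default blue-green convergence behavior of Argo Rollouts: the old version (v1) no longer carries any traffic and its replicas no longer need to be kept.
In production:
- Old-version Pods should be released promptly
- This avoids wasting resources
- This avoids misrouted traffic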

How to keep the v1 version around for a while
Set scaleDownDelaySeconds
bash
  strategy:
    blueGreen:
      activeService: myapp-svc
      previewService: myapp-preview-svc
      autoPromotionEnabled: false
      scaleDownDelaySeconds: 300 # keep the old version for 5 minutes
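Rolling back
bash
kubectl argo rollouts undo myapp -n myweb
Rollouts will:
- Reuse the ReplicaSet from revision:1 (it reappears under a new revision number)
- Recreate the v1 Pod
This does not depend on whether the old Pod still exists.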
bash
[root@k8s-master ~/argorollouts]# kubectl argo rollouts get rollout myapp -n myweb
Name: myapp
Namespace: myweb
Status: ॥ Paused
Message: BlueGreenPause
Strategy: BlueGreen
Images: myweb:v1 (preview)
myweb:v2 (stable, active)
Replicas:
Desired: 1
Current: 2
Updated: 1
Ready: 1
Available: 1
NAME KIND STATUS AGE INFO
⟳ myapp Rollout ॥ Paused 12m
├──# revision:3
│ └──⧉ myapp-759958d76c ReplicaSet ✔ Healthy 12m preview
│ └──□ myapp-759958d76c-zdkcb Pod ✔ Running 39s ready:1/1
└──# revision:2
└──⧉ myapp-bf99b76d8 ReplicaSet ✔ Healthy 8m29s stable,active
└──□ myapp-bf99b76d8-2wpj6 Pod ✔ Running 8m29s ready:1/1
Switch traffic again, this time to v1
bash
kubectl argo rollouts promote myapp -n myweb
This points myapp-svc back at v1
bash
[root@k8s-master ~/argorollouts]# kubectl argo rollouts get rollout myapp -n myweb
Name: myapp
Namespace: myweb
Status: ✔ Healthy
Strategy: BlueGreen
Images: myweb:v1 (stable, active)
myweb:v2
Replicas:
Desired: 1
Current: 2
Updated: 1
Ready: 1
Available: 1
NAME KIND STATUS AGE INFO
⟳ myapp Rollout ✔ Healthy 14m
├──# revision:3
│ └──⧉ myapp-759958d76c ReplicaSet ✔ Healthy 14m stable,active
│ └──□ myapp-759958d76c-zdkcb Pod ✔ Running 2m44s ready:1/1
└──# revision:2
└──⧉ myapp-bf99b76d8 ReplicaSet ✔ Healthy 10m delay:2s
└──□ myapp-bf99b76d8-2wpj6 Pod ✔ Running 10m ready:1/1
bash
[root@k8s-master ~/argorollouts]# curl 10.0.0.151
vamos | This version is v1 | v111111

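Using the dashboard
Use Edit to add a new version, say v3.

Switch the traffic with Promote. By default the old version is kept for 30 seconds before it is scaled down, and within that window a rollback is near-instant because the old Pods are still running.



To switch back, click Rollback

Then click Promote to switch versions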


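VI. Gray release (canary release)
Environment preparation
A Rollout named myapp with 3 replicas is defined in the myweb namespace, running the image myweb:v1 and exposing port 80. It uses a canary strategy with manual confirmation at each stage: traffic moves through "25% → pause for manual confirmation → 50% → pause for manual confirmation → 100%". Because no trafficRouting is configured here, the weights are approximated by the ratio of canary to stable Pods behind the Service.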
bash
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: myapp
  namespace: myweb
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myweb:v1
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 80
  strategy:
    canary:
      steps:
        - setWeight: 25
        - pause: {}
        - setWeight: 50
        - pause: {}
        - setWeight: 100
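A LoadBalancer Service named myapp-svc is defined in the myweb namespace. It selects the application Pods through the app: myapp label and forwards external traffic on port 80 to container port 80, acting as the external production entry for the canary release above.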
bash
apiVersion: v1
kind: Service
metadata:
  name: myapp-svc
  namespace: myweb
spec:
  type: LoadBalancer
  selector:
    app: myapp
  ports:
    - port: 80
      targetPort: 80
bash
[root@k8s-master ~/argorollouts]# kubectl get po,svc -n myweb
NAME READY STATUS RESTARTS AGE
pod/myapp-759958d76c-fd64q 1/1 Running 0 28s
pod/myapp-759958d76c-stw65 1/1 Running 0 40s
pod/myapp-759958d76c-zcx5w 1/1 Running 0 32s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/myapp-svc LoadBalancer 10.109.126.253 10.0.0.151 80:32486/TCP 7m5s
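Using the dashboard
Use Edit to set the image to v2, which becomes the canary. At this point about 25% of traffic goes to v2.

Click Promote once and the weight becomes 50%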


bash
[root@k8s-master ~/argorollouts]# for i in {1..10}; do curl 10.0.0.151; done
vamos | This version is v2 | v222222
vamos | This version is v2 | v222222
vamos | This version is v2 | v222222
vamos | This version is v2 | v222222
vamos | This version is v1 | v111111
vamos | This version is v2 | v222222
vamos | This version is v1 | v111111
vamos | This version is v1 | v111111
vamos | This version is v1 | v111111
vamos | This version is v2 | v222222
Click Promote again and the weight becomes 100%; the old v1 version is released at the same time

bash
[root@k8s-master ~/argorollouts]# for i in {1..10}; do curl 10.0.0.151; done
vamos | This version is v2 | v222222
vamos | This version is v2 | v222222
vamos | This version is v2 | v222222
vamos | This version is v2 | v222222
vamos | This version is v2 | v222222
vamos | This version is v2 | v222222
vamos | This version is v2 | v222222
vamos | This version is v2 | v222222
vamos | This version is v2 | v222222
vamos | This version is v2 | v222222
To roll back, click Rollback on the v1 revision; v1 now becomes the canary.

Click Promote again and it goes to 50%

One more click completes the rollback to v1

bash
[root@k8s-master ~/argorollouts]# kubectl argo rollouts get rollout myapp -n myweb
Name: myapp
Namespace: myweb
Status: ✔ Healthy
Strategy: Canary
Step: 5/5
SetWeight: 100
ActualWeight: 100
Images: myweb:v1 (stable)
Replicas:
Desired: 3
Current: 3
Updated: 3
Ready: 3
Available: 3
NAME KIND STATUS AGE INFO
⟳ myapp Rollout ✔ Healthy 12m
├──# revision:7
│ └──⧉ myapp-759958d76c ReplicaSet ✔ Healthy 12m stable
│ ├──□ myapp-759958d76c-ktb56 Pod ✔ Running 118s ready:1/1
│ ├──□ myapp-759958d76c-lx8g6 Pod ✔ Running 82s ready:1/1
│ └──□ myapp-759958d76c-kpsf7 Pod ✔ Running 56s ready:1/1
└──# revision:6
└──⧉ myapp-bf99b76d8 ReplicaSet • ScaledDown 10m
Using the command line
bash
[root@k8s-master ~/argorollouts]# kubectl get po,svc -n myweb
NAME READY STATUS RESTARTS AGE
pod/myapp-759958d76c-kpsf7 1/1 Running 0 99s
pod/myapp-759958d76c-ktb56 1/1 Running 0 2m41s
pod/myapp-759958d76c-lx8g6 1/1 Running 0 2m5s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/myapp-svc LoadBalancer 10.109.126.253 10.0.0.151 80:32486/TCP 12m
bash
[root@k8s-master ~/argorollouts]# kubectl argo rollouts set image myapp \
myapp=myweb:v2 \
-n myweb
rollout "myapp" image updated
What happens at this point?
Rollouts will:
- Create a new ReplicaSet (v2)
- Start 1 v2 Pod according to setWeight: 25
- Keep the 3 v1 Pods running
- Enter pause automatically
bash
[root@k8s-master ~/argorollouts]# kubectl get po,svc -n myweb
NAME READY STATUS RESTARTS AGE
pod/myapp-759958d76c-kpsf7 1/1 Running 0 2m15s
pod/myapp-759958d76c-ktb56 1/1 Running 0 3m17s
pod/myapp-759958d76c-lx8g6 1/1 Running 0 2m41s
pod/myapp-bf99b76d8-8cvtf 1/1 Running 0 27s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/myapp-svc LoadBalancer 10.109.126.253 10.0.0.151 80:32486/TCP 13m
Advance the canary
bash
[root@k8s-master ~/argorollouts]# kubectl argo rollouts promote myapp -n myweb
rollout 'myapp' promoted
Result:
- The rollout enters setWeight: 50
- v2 Pods = 2
- v1 Pods = 2
- It pauses again
Full rollout
bash
kubectl argo rollouts promote myapp -n myweb
Finally:
- All Pods run myweb:v2
- The v1 ReplicaSet is scaled down
- Rollout status: Healthy
bash
[root@k8s-master ~/argorollouts]# kubectl get po,svc -n myweb
NAME READY STATUS RESTARTS AGE
pod/myapp-bf99b76d8-2c5vc 1/1 Running 0 25s
pod/myapp-bf99b76d8-8cvtf 1/1 Running 0 91s
pod/myapp-bf99b76d8-8k2l9 1/1 Running 0 43s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/myapp-svc LoadBalancer 10.109.126.253 10.0.0.151 80:32486/TCP 14m
bash
[root@k8s-master ~/argorollouts]# curl 10.0.0.151
vamos | This version is v2 | v222222
What if a problem shows up midway?
Roll back immediately from the command line
bash
[root@k8s-master ~/argorollouts]# kubectl argo rollouts undo myapp -n myweb
INFO[0000] unknown field "spec.template.metadata.creationTimestamp"
rollout 'myapp' undo
Then shift the traffic
bash
kubectl argo rollouts promote myapp -n myweb
Two promotions are needed, one for each pause step
bash
[root@k8s-master ~/argorollouts]# for i in `seq 10 `; do curl 10.0.0.151;done
vamos | This version is v1 | v111111
vamos | This version is v1 | v111111
vamos | This version is v2 | v222222
vamos | This version is v1 | v111111
vamos | This version is v1 | v111111
vamos | This version is v2 | v222222
vamos | This version is v2 | v222222
vamos | This version is v2 | v222222
vamos | This version is v2 | v222222
vamos | This version is v2 | v222222
[root@k8s-master ~/argorollouts]# for i in `seq 10 `; do curl 10.0.0.151;done
vamos | This version is v1 | v111111
vamos | This version is v1 | v111111
vamos | This version is v1 | v111111
vamos | This version is v1 | v111111
vamos | This version is v1 | v111111
vamos | This version is v1 | v111111
vamos | This version is v1 | v111111
vamos | This version is v1 | v111111
vamos | This version is v1 | v111111
vamos | This version is v1 | v111111
bash
[root@k8s-master ~/argorollouts]# kubectl get po,svc -n myweb
NAME READY STATUS RESTARTS AGE
pod/myapp-759958d76c-8wwhg 1/1 Running 0 53s
pod/myapp-759958d76c-mpgnq 1/1 Running 0 2m57s
pod/myapp-759958d76c-q88p8 1/1 Running 0 63s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/myapp-svc LoadBalancer 10.109.126.253 10.0.0.151 80:32486/TCP 18m
VII. Precise traffic control with Gateway API
Envoy Gateway as the external entry
Gateway
bash
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: myweb-gateway
  namespace: myweb
spec:
  gatewayClassName: envoy-gateway
  listeners:
    - name: http
      protocol: HTTP
      port: 80
| Field | Purpose |
|---|---|
| gatewayClassName | Selects Envoy Gateway |
| listeners.port | Externally exposed port |
| listeners.protocol | L7 HTTP |
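The Gateway above refers to a GatewayClass named envoy-gateway, which must exist in the cluster. If the Envoy Gateway installation has not already created one under that name, a minimal definition could look like the sketch below; the controllerName is the value Envoy Gateway documents for its controller, and it is assumed here that Envoy Gateway itself is already installed.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: envoy-gateway            # must match gatewayClassName in the Gateway
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
```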
BlueGreen release
Active Service
bash
apiVersion: v1
kind: Service
metadata:
  name: myapp-svc
  namespace: myweb
spec:
  type: ClusterIP
  selector:
    app: myapp
  ports:
    - port: 80
      targetPort: 80
| Field | Purpose |
|---|---|
| name | BlueGreen activeService |
| selector | Rollout controls which Pods it points to |
| type | ClusterIP is used behind a Gateway |
Preview Service
bash
apiVersion: v1
kind: Service
metadata:
  name: myapp-preview-svc
  namespace: myweb
spec:
  type: ClusterIP
  selector:
    app: myapp
  ports:
    - port: 80
      targetPort: 80
| Field | Purpose |
|---|---|
| name | BlueGreen previewService |
| selector | Points to the preview Pods |
HTTPRoute
bash
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: myapp-bg-route
  namespace: myweb
spec:
  parentRefs:
    - name: myweb-gateway
      namespace: myweb
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: myapp-svc
          port: 80
          weight: 100
        - name: myapp-preview-svc
          port: 80
          weight: 0
| Field | Purpose |
|---|---|
| parentRefs | Binds the route to the Gateway |
| backendRefs.weight | Static in this BlueGreen setup; Argo Rollouts rewrites HTTPRoute weights only for the canary strategy |
| preview weight | Kept at 0, so external traffic reaches only the active Service |
Rollout
bash
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: myapp-bluegreen
  namespace: myweb
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myweb:v1
          ports:
            - containerPort: 80
  strategy:
    blueGreen:
      activeService: myapp-svc
      previewService: myapp-preview-svc
      autoPromotionEnabled: false
      scaleDownDelaySeconds: 300
      # trafficRouting (including the argoproj-labs/gatewayAPI plugin) is a field of
      # the canary strategy only. blueGreen switches traffic by rewriting the
      # selectors of activeService / previewService, so it is not set here.
| Field | Purpose |
|---|---|
| autoPromotionEnabled | false = switch traffic manually |
| scaleDownDelaySeconds | Keeps the old-version Pods for a while |
| activeService / previewService | Rollouts rewrites their selectors on promote |
BlueGreen operation steps
kubectl apply -f .
kubectl argo rollouts set image myapp-bluegreen myapp=myweb:v2 -n myweb
kubectl argo rollouts get rollout myapp-bluegreen -n myweb --watch
kubectl argo rollouts promote myapp-bluegreen -n myweb
kubectl argo rollouts undo myapp-bluegreen -n myweb
kubectl argo rollouts promote myapp-bluegreen -n myweb
Canary release
Stable Service
bash
apiVersion: v1
kind: Service
metadata:
  name: myapp-stable
  namespace: myweb
spec:
  type: ClusterIP
  selector:
    app: myapp
  ports:
    - port: 80
      targetPort: 80
Canary Service
bash
apiVersion: v1
kind: Service
metadata:
  name: myapp-canary
  namespace: myweb
spec:
  type: ClusterIP
  selector:
    app: myapp
  ports:
    - port: 80
      targetPort: 80
HTTPRoute
bash
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: myapp-canary-route
  namespace: myweb
spec:
  parentRefs:
    - name: myweb-gateway
      namespace: myweb
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: myapp-stable
          port: 80
          weight: 100
        - name: myapp-canary
          port: 80
          weight: 0
Rollout
bash
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: myapp-canary
  namespace: myweb
spec:
  replicas: 4
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myweb:v1
          ports:
            - containerPort: 80
  strategy:
    canary:
      stableService: myapp-stable
      canaryService: myapp-canary
      trafficRouting:
        plugins:
          argoproj-labs/gatewayAPI:
            httpRoute: myapp-canary-route
            namespace: myweb
      steps:
        - setWeight: 10
        - pause:
            duration: 2m
        - setWeight: 30
        - pause:
            duration: 2m
        - setWeight: 60
        - pause:
            duration: 2m
        - setWeight: 100
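The Rollout above refers to the traffic router plugin argoproj-labs/gatewayAPI, which the controller only loads if it is declared in the argo-rollouts-config ConfigMap, and the controller also needs permission to modify HTTPRoute objects. The sketch below shows roughly what that could look like. The plugin download location is intentionally left as a placeholder to be filled in from the plugin's release page, the ClusterRole name is hypothetical, and depending on the plugin version additional permissions may be required.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argo-rollouts-config       # read by the controller at startup
  namespace: argo-rollouts
data:
  trafficRouterPlugins: |-
    - name: "argoproj-labs/gatewayAPI"
      location: "<plugin-binary-url>"   # placeholder: release binary URL for your platform
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: argo-rollouts-gateway-api  # hypothetical name
rules:
  - apiGroups: ["gateway.networking.k8s.io"]
    resources: ["httproutes"]
    verbs: ["get", "list", "watch", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: argo-rollouts-gateway-api  # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: argo-rollouts-gateway-api
subjects:
  - kind: ServiceAccount
    name: argo-rollouts            # service account created by install.yaml
    namespace: argo-rollouts
```

The controller usually has to be restarted after the ConfigMap changes so that the plugin is downloaded and loaded.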
Canary operation steps
kubectl apply -f .
kubectl argo rollouts set image myapp-canary myapp=myweb:v2 -n myweb
kubectl argo rollouts get rollout myapp-canary -n myweb --watch
kubectl argo rollouts undo myapp-canary -n myweb
End-to-end traffic path
bash
┌────────────────────────────────────────────┐
│              External Client               │
│           (Browser / curl / LB)            │
└────────────────────────────────────────────┘
                      │
                      ▼
┌────────────────────────────────────────────┐
│               Envoy Gateway                │
│     (Deployment / DaemonSet, L7 Proxy)     │
│        GatewayClass: envoy-gateway         │
└────────────────────────────────────────────┘
                      │
                      ▼
┌────────────────────────────────────────────┐
│                Gateway API                 │
│               HTTPRoute (L7)               │
│                                            │
│  backendRefs:                              │
│    - myapp-stable   weight: 70             │
│    - myapp-canary   weight: 30             │
│                                            │
│  (weights updated by Argo Rollouts)        │
└────────────────────────────────────────────┘
                      │
           ┌──────────┴───────────┐
           ▼                      ▼
┌────────────────────┐  ┌────────────────────┐
│ Kubernetes Service │  │ Kubernetes Service │
│  stable / active   │  │  preview / canary  │
│     ClusterIP      │  │     ClusterIP      │
└────────────────────┘  └────────────────────┘
           │                      │
           ▼                      ▼
┌────────────────────┐  ┌────────────────────┐
│   Pod (Replica)    │  │   Pod (Replica)    │
│  Rollout Revision  │  │  Rollout Revision  │
│  image: myweb:v1   │  │  image: myweb:v2   │
└────────────────────┘  └────────────────────┘
           ▲                      ▲
           └──────────┬───────────┘
                      │
     ┌──────────────────────────────────┐
     │     Argo Rollouts Controller     │
     │                                  │
     │  - create / scale ReplicaSets    │
     │  - update HTTPRoute weights      │
     │  - run Analysis / Abort          │
     └──────────────────────────────────┘
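VIII. Summary
In Kubernetes, a Deployment only answers "how to get the new version running"; it does not answer "whether the new version should take over traffic".
That is exactly where Argo Rollouts adds value.
By modeling the release as a state machine that can be paused and rolled back, Argo Rollouts turns a version release from a one-shot, uncontrolled change into an engineering process that can be observed, intervened in, and terminated.
- BlueGreen provides highly deterministic version switching and low-cost rollback, which suits core services and high-risk releases
- Canary ramps up in stages, so release decisions rest on verification with real traffic
Combined with Gateway API and Envoy Gateway, traffic control moves from the Pod level up to the request level. Release pace and traffic allocation are decoupled, and the release gains precision and determinism as a result.
In practice, Argo CD is responsible for "state delivery" and Argo Rollouts for "release control"; together they form a Kubernetes application delivery system that can keep evolving.
A release is no longer just an image upgrade, but an engineering capability that can be rigorously designed and managed.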