Kubernetes Network Performance Testing

Based on the Kubernetes environment set up earlier, this article tests its network performance.

1. Test Preparation

1.1 Test Environment

The test environment is a Kubernetes 1.19 cluster built on VMware Workstation virtual machines, with flannel as the network plugin.

hostname     ip             role
k8s-master   192.168.0.51   master
k8s-node1    192.168.0.52   worker
k8s-node2    192.168.0.53   worker

The test application sample-webapp has already been deployed with three replicas:

shell
[root@k8s-master ~]# kubectl get pod -n ingress-traefik | grep sample-webapp
sample-webapp-4wf7c                1/1     Running   0          46s
sample-webapp-jvdpv                1/1     Running   10         7d23h
sample-webapp-kdk9k                1/1     Running   0          46s

[root@k8s-master ~]# kubectl get svc -n ingress-traefik | grep sample-webapp
sample-webapp   ClusterIP   10.98.210.117   <none>        8000/TCP                     10d

Note:

In this test environment, the NoSchedule taint was removed from the master node so that it can also schedule workload pods.

1.2 Test Scenarios

  • Access via Cluster IP from a Kubernetes cluster node
  • Access via the service from inside the Kubernetes cluster
  • Access from outside the cluster via the address exposed by traefik ingress

1.3 Test Tools

  • Locust: a simple, easy-to-use load testing tool for measuring how many concurrent users a web application (or other system) can handle.
  • curl
  • Test application: sample-webapp; source available on GitHub in the "Distributed Load Testing Using Kubernetes" project

1.4 Test Method

Response times are measured by sending curl requests to sample-webapp. A plain curl returns:

shell
[root@k8s-master ~]# curl "http://10.98.210.117:8000"
Welcome to the "Distributed Load Testing Using Kubernetes" sample web app

2. Network Latency Tests

2.1 Access via Cluster IP from a Kubernetes cluster node

Test command:

curl -o /dev/null -s -w '%{time_connect} %{time_starttransfer} %{time_total}' "http://10.98.210.117:8000/"

Run 10 samples and average the results:

shell
[root@k8s-node1 ~]# echo "time_connect  time_starttransfer time_total"; for i in {1..10}; do curl -o /dev/null -s -w '%{time_connect} %{time_starttransfer} %{time_total}\n' "http://10.98.210.117:8000/"; done
time_connect  time_starttransfer time_total
0.000 0.002 0.002
0.000 0.001 0.001
0.001 0.003 0.003
0.001 0.003 0.003
0.001 0.006 0.006
0.001 0.003 0.003
0.001 0.004 0.004
0.001 0.003 0.003
0.001 0.004 0.004
0.000 0.002 0.002

Average response time: 3.1 ms

Metric definitions:

  • time_connect: time to establish the TCP connection to the server

  • time_starttransfer: time from issuing the request until the web server returns the first byte of the response

  • time_total: total time taken to complete the request
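
The 3.1 ms figure above is simply the mean of the time_total column; a small awk pipeline (a sketch, fed with the samples recorded above) reproduces it:

```shell
# Average the time_total samples (seconds) and report the mean in milliseconds.
samples='0.002 0.001 0.003 0.003 0.006 0.003 0.004 0.003 0.004 0.002'
avg_ms=$(printf '%s\n' $samples | awk '{s += $1} END {printf "%.1f", s / NR * 1000}')
echo "average: ${avg_ms} ms"   # average: 3.1 ms
```

The same pipeline can be appended (with `| awk ...`) directly to the 10-iteration curl loop instead of averaging by hand.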

2.2 Access via the service from inside the cluster

shell
# Enter the test client pod
[root@k8s-node1 ~]# kubectl exec -it locust-master-vljrx -n ingress-traefik /bin/bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
# Run the test
root@locust-master-vljrx:/# echo "time_connect  time_starttransfer time_total"; for i in {1..10}; do curl -o /dev/null -s -w '%{time_connect} %{time_starttransfer} %{time_total}\n' "http://sample-webapp:8000/"; done
time_connect  time_starttransfer time_total
0.001326 0.002542 0.002715
0.001444 0.003264 0.003675
0.002193 0.004262 0.004889
0.002066 0.003664 0.003876
0.001739 0.004095 0.004432
0.002339 0.004536 0.004647
0.001649 0.003288 0.003628
0.001794 0.003373 0.003911
0.001492 0.003201 0.003581
0.002036 0.003712 0.004109

Average response time: ~4 ms

2.3 External access via traefik ingress

This setup uses traefik as the cluster's gateway entry point; see my separate article on deploying traefik 2.0. Alternatively, change the service to type NodePort for this test.

shell
# Add a routing rule so that requests for sample-webapp.test.com are routed to the sample-webapp service
[root@k8s-master ~]# kubectl get ingressroute -n ingress-traefik traefik-dashboard-route -o yaml
...
  resourceVersion: "975669"
  selfLink: /apis/traefik.containo.us/v1alpha1/namespaces/ingress-traefik/ingressroutes/traefik-dashboard-route
  uid: 4faeecab-cd87-406f-9d50-3d507a6b73ff
spec:
  entryPoints:
  - web
  routes:
  - kind: Rule
    match: Host(`traefik.test.com`)
    services:
    - name: traefik
      port: 8080
  - kind: Rule
    match: Host(`locust.test.com`)
    services:
    - name: locust-master
      port: 8089
  - kind: Rule
    match: Host(`sample-webapp.test.com`)
    services:
    - name: sample-webapp
      port: 8000

After adding name resolution for the test domain on the external client, run the access test:
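
How the external client resolves the test domains is up to you; on a Linux client the quickest option is an /etc/hosts entry. (The node IP below is an assumption — point it at whichever node the traefik web entrypoint is actually exposed on.)

```
# /etc/hosts on the external test client (hypothetical entrypoint on 192.168.0.51)
192.168.0.51  sample-webapp.test.com locust.test.com traefik.test.com
```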

shell
[root@bogon ~]# echo "time_connect  time_starttransfer time_total"; for i in {1..10}; do curl -o /dev/null -s -w '%{time_connect} %{time_starttransfer} %{time_total}\n' "http://sample-webapp.test.com/"; done
time_connect  time_starttransfer time_total
0.048 0.056 0.062
0.030 0.108 0.115
0.029 0.036 0.046
0.048 0.111 0.119
0.021 0.030 0.030
0.017 0.022 0.022
0.025 0.031 0.036
0.021 0.028 0.028
0.039 0.045 0.045
0.020 0.025 0.029

Average response time: 53.2 ms

2.4 Latency Results

Response times for the three scenarios:

Access via Cluster IP from a Kubernetes cluster node: 3.1 ms

Access via the service from inside the cluster: ~4 ms

External access via the address exposed by traefik ingress: 53.2 ms

Notes:

  1. Whether the node or pod issuing the requests sits on the same host as the pod backing the service may have some impact on the first two scenarios.
  2. These results are for reference only; they depend on the specific resource configuration, network environment, and other factors.

3. Network Performance Tests

The cluster network starts in flannel's vxlan mode; iperf3 is used for the measurements.

Server command:

iperf3 -s -p 12345 -i 1

Client command:

iperf3 -c ${server-ip} -p 12345 -i 1 -t 10
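
When scripting repeated runs, the summary bandwidth can be pulled out of iperf3's plain-text output with awk; a sketch against the sender summary-line format iperf3 prints (running with `iperf3 -J` and parsing the JSON is an alternative):

```shell
# Extract the bandwidth figure (the field immediately before "Gbits/sec")
# from an iperf3 summary line.
summary='[  4]   0.00-10.00  sec  3.75 GBytes  3.22 Gbits/sec    5             sender'
bw=$(printf '%s\n' "$summary" | awk '{for (i = 2; i <= NF; i++) if ($i == "Gbits/sec") print $(i - 1)}')
echo "sender bandwidth: ${bw} Gbits/sec"   # sender bandwidth: 3.22 Gbits/sec
```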

3.1 Between nodes

shell
# Start the iperf server on node1
[root@k8s-node1 ~]# iperf3 -s -p 12345 -i 1
-----------------------------------------------------------
Server listening on 12345

# Start the iperf client on node2 and test against node1
[root@k8s-node2 ~]# iperf3 -c 192.168.0.52 -p 12345 -i 1 -t 10
Connecting to host 192.168.0.52, port 12345
[  4] local 192.168.0.53 port 52106 connected to 192.168.0.52 port 12345
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec   313 MBytes  2.62 Gbits/sec    0   1.38 MBytes
[  4]   1.00-2.00   sec   379 MBytes  3.18 Gbits/sec    5   1.36 MBytes
[  4]   2.00-3.00   sec   366 MBytes  3.06 Gbits/sec    0   1.47 MBytes
[  4]   3.00-4.00   sec   360 MBytes  3.02 Gbits/sec    0   1.57 MBytes
[  4]   4.00-5.00   sec   431 MBytes  3.62 Gbits/sec    0   1.65 MBytes
[  4]   5.00-6.00   sec   391 MBytes  3.27 Gbits/sec    0   1.71 MBytes
[  4]   6.00-7.00   sec   404 MBytes  3.39 Gbits/sec    0   1.76 MBytes
[  4]   7.00-8.00   sec   378 MBytes  3.18 Gbits/sec    0   1.78 MBytes
[  4]   8.00-9.00   sec   411 MBytes  3.44 Gbits/sec    0   1.80 MBytes
[  4]   9.00-10.00  sec   410 MBytes  3.43 Gbits/sec    0   1.81 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  3.75 GBytes  3.22 Gbits/sec    5             sender
[  4]   0.00-10.00  sec  3.75 GBytes  3.22 Gbits/sec                  receiver

3.2 Between pods on different nodes (flannel vxlan mode)

shell
# Deploy two pods, one on each worker node
[root@k8s-master ~]# cat perf-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: centos-perf
  labels:
    app: perf
spec:
  replicas: 2
  selector:
    matchLabels:
      app: perf
  template:
    metadata:
      labels:
        app: perf
    spec:
      containers:
      - name: perf
        image: centos79-perftools:20230425
        command: ["/bin/bash", "-c", "while true; do sleep 10000; done"]
        resources:
          requests:
            memory: "64Mi"
            cpu: "250m"
          limits:
            memory: "2048Mi"
            cpu: "2000m"

[root@k8s-master ~]# kubectl apply -f perf-deploy.yaml
deployment.apps/centos-perf created
[root@k8s-master ~]# kubectl get pod -o wide
NAME               READY   STATUS    RESTARTS   AGE     IP             NODE         NOMINATED NODE   READINESS GATES
centos-perf-5b897965bc-cjqwt 1/1     Running   0          8m49s   10.244.2.148   k8s-node2    <none>           <none>
centos-perf-5b897965bc-vbqdg 1/1     Running   0          8m47s   10.244.1.137   k8s-node1    <none>           <none>
...

# Start the iperf server in the pod on node1
[root@k8s-master ~]# kubectl exec -it centos-perf-5b897965bc-vbqdg /bin/bash
[root@centos-perf-5b897965bc-vbqdg /]# iperf3 -s -p 12345 -i 1
-----------------------------------------------------------
Server listening on 12345
-----------------------------------------------------------
Accepted connection from 10.244.2.148, port 33778
[  5] local 10.244.1.137 port 12345 connected to 10.244.2.148 port 33780
[ ID] Interval           Transfer     Bandwidth

# Start the iperf client in the pod on node2
[root@k8s-master ~]# kubectl exec -it centos-perf-5b897965bc-cjqwt /bin/bash
[root@centos-perf-5b897965bc-cjqwt /]# iperf3 -c 10.244.1.137 -p 12345 -i 1 -t 10
Connecting to host 10.244.1.137, port 12345
[  4] local 10.244.2.148 port 33780 connected to 10.244.1.137 port 12345
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec   196 MBytes  1.64 Gbits/sec  741    584 KBytes
[  4]   1.00-2.00   sec   301 MBytes  2.53 Gbits/sec  2212    771 KBytes
[  4]   2.00-3.00   sec   199 MBytes  1.67 Gbits/sec  1147    912 KBytes
[  4]   3.00-4.00   sec   189 MBytes  1.59 Gbits/sec  387   1.01 MBytes
[  4]   4.00-5.00   sec   209 MBytes  1.75 Gbits/sec  138   1.14 MBytes
[  4]   5.00-6.00   sec   218 MBytes  1.83 Gbits/sec   92   1.26 MBytes
[  4]   6.00-7.00   sec   195 MBytes  1.64 Gbits/sec    0   1.36 MBytes
[  4]   7.00-8.00   sec   235 MBytes  1.97 Gbits/sec   33   1.46 MBytes
[  4]   8.00-9.00   sec   210 MBytes  1.76 Gbits/sec   46   1.55 MBytes
[  4]   9.00-10.01  sec   246 MBytes  2.06 Gbits/sec  171   1.65 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.01  sec  2.15 GBytes  1.84 Gbits/sec  4967             sender
[  4]   0.00-10.01  sec  2.14 GBytes  1.84 Gbits/sec                  receiver
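
One caveat on the Deployment above: replicas: 2 does not by itself guarantee that the two pods land on different nodes — the default scheduler usually spreads them, but it is not a contract. A topologySpreadConstraints stanza (available as a default-enabled beta on Kubernetes 1.19) makes the spread explicit; a minimal sketch to add under the pod template spec:

```yaml
# Hypothetical addition under spec.template.spec of the centos-perf Deployment:
topologySpreadConstraints:
- maxSkew: 1                           # pod counts per node may differ by at most 1
  topologyKey: kubernetes.io/hostname  # spread across node hostnames
  whenUnsatisfiable: DoNotSchedule     # hard constraint rather than best-effort
  labelSelector:
    matchLabels:
      app: perf
```

On older clusters, a required podAntiAffinity rule on the app=perf label achieves the same effect.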

3.3 Between a node and a pod on a different node (flannel vxlan mode)

shell
# Start the iperf server on node1
[root@k8s-node1 ~]# iperf3 -s -p 12345 -i 1
-----------------------------------------------------------
Server listening on 12345

# Start the iperf client from the pod on node2
[root@k8s-master ~]# kubectl get pod -o wide
NAME                        READY   STATUS    RESTARTS   AGE     IP             NODE         NOMINATED NODE   READINESS GATES
centos-perf-5b897965bc-cjqwt 1/1     Running   0          8m49s   10.244.2.148   k8s-node2    <none>           <none>
centos-perf-5b897965bc-vbqdg 1/1     Running   0          8m47s   10.244.1.137   k8s-node1    <none>           <none>
[root@k8s-master ~]# kubectl exec -it centos-perf-5b897965bc-cjqwt /bin/bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
[root@centos-perf-5b897965bc-cjqwt /]# iperf3 -c 192.168.0.52 -p 12345 -i 1 -t 10
Connecting to host 192.168.0.52, port 12345
[  4] local 10.244.2.148 port 52528 connected to 192.168.0.52 port 12345
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.02   sec   173 MBytes  1.43 Gbits/sec   51    515 KBytes
[  4]   1.02-2.01   sec   219 MBytes  1.84 Gbits/sec  330    603 KBytes
[  4]   2.01-3.01   sec   309 MBytes  2.60 Gbits/sec   47    875 KBytes
[  4]   3.01-4.00   sec   270 MBytes  2.28 Gbits/sec  249    838 KBytes
[  4]   4.00-5.00   sec   262 MBytes  2.20 Gbits/sec  140    997 KBytes
[  4]   5.00-6.00   sec   301 MBytes  2.53 Gbits/sec   60   1.11 MBytes
[  4]   6.00-7.00   sec   302 MBytes  2.54 Gbits/sec   64   1.23 MBytes
[  4]   7.00-8.00   sec   349 MBytes  2.92 Gbits/sec    0   1.33 MBytes
[  4]   8.00-9.00   sec   321 MBytes  2.70 Gbits/sec  159   1.42 MBytes
[  4]   9.00-10.00  sec   312 MBytes  2.62 Gbits/sec   19   1.50 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  2.75 GBytes  2.36 Gbits/sec  1119             sender
[  4]   0.00-10.00  sec  2.75 GBytes  2.36 Gbits/sec                  receiver

3.4 Between pods on different nodes (flannel host-gw mode)

Change the flannel network mode to host-gw:

shell
# Change the flannel backend to host-gw
[root@k8s-master ~]# kubectl get configmap -n kube-flannel kube-flannel-cfg -o yaml > kube-flannel-cfg.yaml
[root@k8s-master ~]# vim kube-flannel-cfg.yaml
...
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "EnableNFTables": false,
      "Backend": {
        "Type": "host-gw"		# 默认为vxlan,修改为host-gw
      }
    }
...

# Apply the updated ConfigMap, then restart the flannel pods so the new backend takes effect
[root@k8s-master ~]# kubectl apply -f kube-flannel-cfg.yaml
[root@k8s-master kube-flannel]# kubectl get pod -n kube-flannel
NAME                    READY   STATUS    RESTARTS   AGE
kube-flannel-ds-jvff5   1/1     Running   13         13d
kube-flannel-ds-n5fqt   1/1     Running   13         13d
kube-flannel-ds-wwfmk   1/1     Running   14         13d
[root@k8s-master kube-flannel]# kubectl delete pod -n kube-flannel kube-flannel-ds-jvff5 kube-flannel-ds-n5fqt kube-flannel-ds-wwfmk
pod "kube-flannel-ds-jvff5" deleted
pod "kube-flannel-ds-n5fqt" deleted
pod "kube-flannel-ds-wwfmk" deleted
[root@k8s-master kube-flannel]# kubectl get pod -n kube-flannel
NAME                    READY   STATUS    RESTARTS   AGE
kube-flannel-ds-2p9gp   1/1     Running   0          13s
kube-flannel-ds-cn9x4   1/1     Running   0          23s
kube-flannel-ds-t7zjj   1/1     Running   0          18s
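
Whether host-gw actually took effect can be confirmed from a node's routing table (`ip route | grep 10.244` on any node): under vxlan, routes to remote pod CIDRs go via the flannel.1 device, while under host-gw they point straight at the peer node IP on the physical NIC. A self-contained sketch over sample host-gw routes (the ens33 interface name is an assumption for a typical VMware guest):

```shell
# Sample `ip route` entries as they should look in host-gw mode; in vxlan mode
# these lines would reference "dev flannel.1" instead of the physical NIC.
routes='10.244.1.0/24 via 192.168.0.52 dev ens33
10.244.2.0/24 via 192.168.0.53 dev ens33'
vxlan_left=$(printf '%s\n' "$routes" | grep -c 'dev flannel.1' || true)
echo "pod routes still via flannel.1: ${vxlan_left}"   # prints 0 once host-gw is active
```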

Test the network performance:

shell
[root@k8s-master ~]# kubectl get pod -o wide
NAME                           READY   STATUS    RESTARTS   AGE   IP             NODE         NOMINATED NODE   READINESS GATES
centos-perf-5b897965bc-cjqwt   1/1     Running   0          35m   10.244.2.148   k8s-node2    <none>      <none>
centos-perf-5b897965bc-vbqdg   1/1     Running   0          35m   10.244.1.137   k8s-node1    <none>      <none>

# Enter the pod on node1 and start the iperf server
[root@k8s-master ~]# kubectl exec -it centos-perf-5b897965bc-vbqdg /bin/bash
[root@centos-perf-5b897965bc-vbqdg /]# iperf3 -s -p 12345 -i 1
-----------------------------------------------------------
Server listening on 12345
-----------------------------------------------------------
Accepted connection from 10.244.2.148, port 33778
[  5] local 10.244.1.137 port 12345 connected to 10.244.2.148 port 33780
[ ID] Interval           Transfer     Bandwidth

# Enter the pod on node2 and run the iperf client against the pod on node1
[root@k8s-master ~]# kubectl exec -it centos-perf-5b897965bc-cjqwt /bin/bash
[root@centos-perf-5b897965bc-cjqwt /]# iperf3 -c 10.244.1.137 -p 12345 -i 1 -t 10
Connecting to host 10.244.1.137, port 12345
[  4] local 10.244.2.148 port 55200 connected to 10.244.1.137 port 12345
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec   225 MBytes  1.88 Gbits/sec  1371    401 KBytes
[  4]   1.00-2.00   sec   242 MBytes  2.03 Gbits/sec  905    527 KBytes
[  4]   2.00-3.00   sec   244 MBytes  2.05 Gbits/sec  589    528 KBytes
[  4]   3.00-4.01   sec   292 MBytes  2.44 Gbits/sec  462    460 KBytes
[  4]   4.01-5.00   sec   242 MBytes  2.04 Gbits/sec  557    486 KBytes
[  4]   5.00-6.00   sec   287 MBytes  2.41 Gbits/sec  551    418 KBytes
[  4]   6.00-7.00   sec   314 MBytes  2.63 Gbits/sec  519    404 KBytes
[  4]   7.00-8.01   sec   313 MBytes  2.60 Gbits/sec  798    522 KBytes
[  4]   8.01-9.01   sec   297 MBytes  2.49 Gbits/sec  1902    478 KBytes
[  4]   9.01-10.00  sec   337 MBytes  2.86 Gbits/sec  1301    679 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  2.73 GBytes  2.34 Gbits/sec  8955             sender
[  4]   0.00-10.00  sec  2.73 GBytes  2.34 Gbits/sec                  receiver

3.5 Between a node and a pod on a different node (flannel host-gw mode)

shell
# From the pod on node2, run the iperf client against node1
[root@centos-perf-5b897965bc-cjqwt /]# iperf3 -c 192.168.0.52 -p 12345 -i 1 -t 10
Connecting to host 192.168.0.52, port 12345
[  4] local 10.244.2.148 port 53868 connected to 192.168.0.52 port 12345
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.01   sec   185 MBytes  1.54 Gbits/sec  171    453 KBytes
[  4]   1.01-2.00   sec   221 MBytes  1.86 Gbits/sec   84    728 KBytes
[  4]   2.00-3.00   sec   264 MBytes  2.21 Gbits/sec    0    927 KBytes
[  4]   3.00-4.00   sec   271 MBytes  2.28 Gbits/sec   33   1.06 MBytes
[  4]   4.00-5.00   sec   376 MBytes  3.16 Gbits/sec  368   1.22 MBytes
[  4]   5.00-6.00   sec   314 MBytes  2.63 Gbits/sec  138   1.35 MBytes
[  4]   6.00-7.00   sec   368 MBytes  3.08 Gbits/sec    0   1.49 MBytes
[  4]   7.00-8.00   sec   406 MBytes  3.41 Gbits/sec   65   1.57 MBytes
[  4]   8.00-9.00   sec   340 MBytes  2.85 Gbits/sec  355   1.63 MBytes
[  4]   9.00-10.00  sec   358 MBytes  3.00 Gbits/sec    0   1.68 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  3.03 GBytes  2.60 Gbits/sec  1214             sender
[  4]   0.00-10.00  sec  3.03 GBytes  2.60 Gbits/sec                  receiver

3.6 Network Performance Summary

Scenario                          flannel mode   Bandwidth (Gbits/sec)
node to node                      n/a            3.22
pod to pod, different nodes       vxlan          1.84
node to pod on a different node   vxlan          2.36
pod to pod, different nodes       host-gw        2.34
node to pod on a different node   host-gw        2.60

Conclusions from the data above:

  1. Flannel's vxlan mode costs about 43% of pod-to-pod bandwidth compared with direct host-to-host traffic, roughly matching commonly cited figures of a 30%-40% loss.
  2. Flannel's host-gw mode costs about 27% of pod-to-pod bandwidth compared with host-to-host traffic.
  3. vxlan encapsulates and decapsulates every packet, which incurs a significant throughput penalty, whereas host-gw forwards via plain routing-table entries, so its overhead is much smaller.
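
The loss percentages in points 1 and 2 follow directly from the summary table:

```shell
# Bandwidth loss relative to the node-to-node baseline (values from the table above).
node_bw=3.22; vxlan_bw=1.84; hostgw_bw=2.34
awk -v base="$node_bw" -v v="$vxlan_bw" -v h="$hostgw_bw" 'BEGIN {
    printf "vxlan loss:   %.0f%%\n", (base - v) / base * 100
    printf "host-gw loss: %.0f%%\n", (base - h) / base * 100
}'
# vxlan loss:   43%
# host-gw loss: 27%
```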