Issue Log
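- [Environment: Kubernetes V1.29.xxx, VPA (vpa-release-0.12)](#environment-kubernetes-v129xxx-vpa-vpa-release-012)
- [Issue 1: On newer Kubernetes, crictl fails to pull the image; docker pulls it successfully, but Kubernetes cannot use it](#issue-1-on-newer-kubernetes-crictl-fails-to-pull-the-image-docker-pulls-it-successfully-but-kubernetes-cannot-use-it)
- [Issue 2: Running vpa-up.sh from the official VPA checkout fails with Error adding request extensions defined via -addext](#issue-2-running-vpa-upsh-from-the-official-vpa-checkout-fails-with-error-adding-request-extensions-defined-via--addext)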
Environment: Kubernetes V1.29.xxx, VPA (vpa-release-0.12)
Issues that came up while deploying and stress-testing VPA (vpa-release-0.12) on V1.29.xxx, recorded separately here.
Issue 1: On newer Kubernetes, crictl fails to pull the image; docker pulls it successfully, but Kubernetes cannot use it
1. Clone the VPA source
bash
git clone -b vpa-release-0.12 https://github.com/kubernetes/autoscaler.git
2. Change the image registry of the core components
bash
[root@k8s-docker-master vertical-pod-autoscaler]# pwd
/root/autoscaler/vertical-pod-autoscaler
Run the following commands:
sed -i 's/k8s.gcr.io\/autoscaling/k8sgcrioautoscaling/' deploy/admission-controller-deployment.yaml
sed -i 's/k8s.gcr.io\/autoscaling/k8sgcrioautoscaling/' deploy/recommender-deployment.yaml
sed -i 's/k8s.gcr.io\/autoscaling/k8sgcrioautoscaling/' deploy/updater-deployment.yaml
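The three substitutions above can also be run as one loop that checks its own result. This is a minimal sketch, not part of the VPA repository: the function name rewrite_registry is hypothetical, and it assumes GNU sed and the three manifest file names used above.

```shell
# Sketch: rewrite the registry prefix in the three VPA deployment manifests
# and fail if any k8s.gcr.io reference is left in them.
# rewrite_registry is a hypothetical helper name.
rewrite_registry() {
    deploy_dir="$1"
    for name in admission-controller recommender updater; do
        file="$deploy_dir/$name-deployment.yaml"
        # Same substitution as above; '#' as delimiter avoids escaping '/'.
        sed -i 's#k8s\.gcr\.io/autoscaling#k8sgcrioautoscaling#' "$file" || return 1
        # grep -q exits 0 on a match, which here means the rewrite was incomplete.
        if grep -q 'k8s\.gcr\.io' "$file"; then
            echo "registry reference still present in $file" >&2
            return 1
        fi
    done
    echo "rewrote 3 manifests"
}

# Usage (from /root/autoscaler/vertical-pod-autoscaler):
#   rewrite_registry deploy
```

Checking each file right after the substitution catches a manifest whose image line does not match the expected prefix, which the plain sed calls would silently skip.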
Images pulled with docker on the node cannot be used directly by newer Kubernetes versions
bash
[root@k8s-docker-node1 ~]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
k8sgcrioautoscaling/vpa-admission-controller 0.12.0 f4325d883003 2 years ago 53.9MB
k8sgcrioautoscaling/vpa-updater 0.12.0 61102daa54f3 2 years ago 54.1MB
k8sgcrioautoscaling/vpa-recommender 0.12.0 fe586fbf5e27 2 years ago 54.7MB
3. Workaround:
bash
[root@k8s-docker-node1 ~]# docker save k8sgcrioautoscaling/vpa-admission-controller:0.12.0 -o vpa-admission-controller.tar
[root@k8s-docker-node1 ~]# ll
total 162816
-rw-r--r--. 1 root root 1136 Jan 16 20:27 1.txt
-rw-------. 1 root root 696 Jan 10 17:04 anaconda-ks.cfg
-rw-------. 1 root root 55251968 Jan 17 15:19 vpa-admission-controller.tar
-rw-------. 1 root root 56038912 Jan 17 15:13 vpa-recommender.tar
-rw-------. 1 root root 55417344 Jan 17 15:12 vpa-updater.tar
[root@k8s-docker-node1 ~]# ctr -n k8s.io images import vpa-admission-controller.tar
unpacking docker.io/k8sgcrioautoscaling/vpa-admission-controller:0.12.0 (sha256:0ce016274f4b87645e37d89dd18b9d04a60f8225e09668110af8efe7b727295d)...done
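Import the other two images into the containerd k8s.io namespace in the same way so that containerd can use them. Note that the image pull policy in the YAML of the three VPA core components defaults to Always; change it manually to IfNotPresent or Never.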
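The export, import, and pull-policy change can be repeated for all three components in one pass. The sketch below is a convenience helper under stated assumptions, not an official procedure: run, import_vpa_images, and set_pull_policy are hypothetical names, DRY_RUN is an invented switch that only prints the commands, and the image names and tag are taken from the docker images output above.

```shell
# Sketch: export each VPA image from Docker and import it into containerd's
# k8s.io namespace. With DRY_RUN=1 the commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

import_vpa_images() {
    tag="${1:-0.12.0}"
    for name in vpa-admission-controller vpa-recommender vpa-updater; do
        image="k8sgcrioautoscaling/$name:$tag"
        run docker save "$image" -o "$name.tar" || return 1
        run ctr -n k8s.io images import "$name.tar" || return 1
    done
}

# Replace the default pull policy so the kubelet uses the imported local image.
# $1 is the new policy (IfNotPresent or Never); the rest are manifest paths.
set_pull_policy() {
    policy="$1"
    shift
    sed -i "s/imagePullPolicy: Always/imagePullPolicy: $policy/" "$@"
}

# Usage on the node:
#   import_vpa_images 0.12.0
# Usage on the master (from vertical-pod-autoscaler/):
#   set_pull_policy IfNotPresent deploy/admission-controller-deployment.yaml \
#       deploy/recommender-deployment.yaml deploy/updater-deployment.yaml
```

Running with DRY_RUN=1 first shows the exact docker and ctr commands before anything is written to disk.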
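4. Cause:
Newer Kubernetes versions (dockershim was removed in v1.24) build the cluster on containerd with runc as the container runtime. Images pulled with docker are stored by the Docker daemon, so containerd cannot reference them directly to start pods. The CRI-O + Podman approach is an alternative worth considering, but in that case the pods must be managed by a replica controller; otherwise, with no daemon present, nothing brings the service back up after a crash.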
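Before deciding how to load images, it can help to confirm which runtime each node actually reports. A sketch, assuming the runtime string has the usual name://version form; runtime_kind is a hypothetical helper, and the version numbers in the comments are example values, not readings from this cluster.

```shell
# Sketch: classify the container runtime string reported by the kubelet,
# e.g. "containerd://1.7.11" or "docker://24.0.7" (example values only).
runtime_kind() {
    case "$1" in
        containerd://*) echo "containerd" ;;
        cri-o://*)      echo "cri-o" ;;
        docker://*)     echo "docker" ;;
        *)              echo "unknown" ;;
    esac
}

# Usage against a live cluster:
#   kubectl get nodes \
#     -o jsonpath='{range .items[*]}{.status.nodeInfo.containerRuntimeVersion}{"\n"}{end}' |
#     while read -r rt; do runtime_kind "$rt"; done
```

A node that reports containerd needs the ctr import step described above; images that exist only in the Docker daemon are not visible to it.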
Issue 2: Running vpa-up.sh from the official VPA checkout fails with Error adding request extensions defined via -addext
1. Run vpa-up.sh to start VPA
bash
[root@k8s-docker-master hack]# pwd
/root/autoscaler/vertical-pod-autoscaler/hack
[root@k8s-docker-master hack]# ll
total 60
-rw-r--r--. 1 root root 565 Jan 16 17:44 boilerplate.go.txt
-rwxr-xr-x. 1 root root 1810 Jan 16 17:44 convert-alpha-objects.sh
-rwxr-xr-x. 1 root root 2032 Jan 16 17:44 deploy-for-e2e.sh
-rwxr-xr-x. 1 root root 1947 Jan 16 17:44 generate-crd-yaml.sh
-rwxr-xr-x. 1 root root 1271 Jan 16 17:44 run-e2e.sh
-rwxr-xr-x. 1 root root 2049 Jan 16 17:44 run-e2e-tests.sh
-rwxr-xr-x. 1 root root 1635 Jan 16 17:44 update-codegen.sh
-rwxr-xr-x. 1 root root 2008 Jan 16 17:44 update-kubernetes-deps-in-e2e.sh
-rwxr-xr-x. 1 root root 1274 Jan 16 17:44 verify-codegen.sh
-rwxr-xr-x. 1 root root 738 Jan 16 17:44 vpa-apply-upgrade.sh
-rwxr-xr-x. 1 root root 787 Jan 16 17:44 vpa-down.sh
-rwxr-xr-x. 1 root root 1613 Jan 16 17:44 vpa-process-yaml.sh
-rwxr-xr-x. 1 root root 2169 Jan 16 17:44 vpa-process-yamls.sh
-rwxr-xr-x. 1 root root 739 Jan 16 17:44 vpa-up.sh
-rwxr-xr-x. 1 root root 2306 Jan 16 17:44 warn-obsolete-vpa-objects.sh
[root@k8s-docker-master hack]# ./vpa-up.sh
customresourcedefinition.apiextensions.k8s.io/verticalpodautoscalercheckpoints.autoscaling.k8s.io created
customresourcedefinition.apiextensions.k8s.io/verticalpodautoscalers.autoscaling.k8s.io created
clusterrole.rbac.authorization.k8s.io/system:metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:vpa-actor created
clusterrole.rbac.authorization.k8s.io/system:vpa-checkpoint-actor created
clusterrole.rbac.authorization.k8s.io/system:evictioner created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/system:vpa-actor created
clusterrolebinding.rbac.authorization.k8s.io/system:vpa-checkpoint-actor created
clusterrole.rbac.authorization.k8s.io/system:vpa-target-reader created
clusterrolebinding.rbac.authorization.k8s.io/system:vpa-target-reader-binding created
clusterrolebinding.rbac.authorization.k8s.io/system:vpa-evictionter-binding created
serviceaccount/vpa-admission-controller created
clusterrole.rbac.authorization.k8s.io/system:vpa-admission-controller created
clusterrolebinding.rbac.authorization.k8s.io/system:vpa-admission-controller created
clusterrole.rbac.authorization.k8s.io/system:vpa-status-reader created
clusterrolebinding.rbac.authorization.k8s.io/system:vpa-status-reader-binding created
serviceaccount/vpa-updater created
deployment.apps/vpa-updater created
serviceaccount/vpa-recommender created
deployment.apps/vpa-recommender created
Generating certs for the VPA Admission Controller in /tmp/vpa-certs.
Error adding request extensions defined via -addext
009EDBD9DF7F0000:error:0580008C:x509 certificate routines:X509at_add1_attr_by_NID:duplicate attribute:crypto/x509/x509_att.c:194:
deployment.apps/vpa-admission-controller created
service/vpa-webhook created
Error message:
bash
Generating certs for the VPA Admission Controller in /tmp/vpa-certs.
Error adding request extensions defined via -addext
009EDBD9DF7F0000:error:0580008C:x509 certificate routines:X509at_add1_attr_by_NID:duplicate attribute:crypto/x509/x509_att.c:194:
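Meaning: a problem with the request extensions occurred while generating the certificate for the VPA (Vertical Pod Autoscaler) Admission Controller. Specifically, "duplicate attribute" indicates that the attribute being added to the request extensions is already present.
2. Symptom: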
bash
[root@k8s-docker-master hack]# kubectl get pod -A -owide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
calico-apiserver calico-apiserver-7bccbcbb5-d7wfl 1/1 Running 2 (4h1m ago) 2d3h 10.244.116.137 k8s-docker-master <none> <none>
calico-apiserver calico-apiserver-7bccbcbb5-j2g8k 1/1 Running 2 (4h1m ago) 2d3h 10.244.116.136 k8s-docker-master <none> <none>
calico-system calico-kube-controllers-6bd5f8b7b8-cr8g4 1/1 Running 1 (4h1m ago) 2d3h 10.244.116.138 k8s-docker-master <none> <none>
calico-system calico-node-cdt57 1/1 Running 1 (4h1m ago) 2d3h 192.168.234.133 k8s-docker-master <none> <none>
calico-system calico-node-g27dg 1/1 Running 1 (4h1m ago) 2d 192.168.234.134 k8s-docker-node1 <none> <none>
calico-system calico-typha-f75945f64-hbhnn 1/1 Running 1 (4h1m ago) 2d3h 192.168.234.133 k8s-docker-master <none> <none>
calico-system csi-node-driver-mgzhp 2/2 Running 2 (4h1m ago) 2d3h 10.244.116.140 k8s-docker-master <none> <none>
calico-system csi-node-driver-mknl4 2/2 Running 2 (4h1m ago) 2d 10.244.107.11 k8s-docker-node1 <none> <none>
kube-system coredns-857d9ff4c9-8cl4p 1/1 Running 1 (4h1m ago) 2d5h 10.244.116.139 k8s-docker-master <none> <none>
kube-system coredns-857d9ff4c9-nt2vf 1/1 Running 1 (4h1m ago) 2d5h 10.244.116.135 k8s-docker-master <none> <none>
kube-system etcd-k8s-docker-master 1/1 Running 2 (4h1m ago) 2d5h 192.168.234.133 k8s-docker-master <none> <none>
kube-system kube-apiserver-k8s-docker-master 1/1 Running 2 (4h1m ago) 2d5h 192.168.234.133 k8s-docker-master <none> <none>
kube-system kube-controller-manager-k8s-docker-master 1/1 Running 4 (4h1m ago) 2d5h 192.168.234.133 k8s-docker-master <none> <none>
kube-system kube-proxy-44cf5 1/1 Running 1 (4h1m ago) 2d 192.168.234.134 k8s-docker-node1 <none> <none>
kube-system kube-proxy-7znvm 1/1 Running 1 (4h1m ago) 2d5h 192.168.234.133 k8s-docker-master <none> <none>
kube-system kube-scheduler-k8s-docker-master 1/1 Running 5 (4h1m ago) 2d5h 192.168.234.133 k8s-docker-master <none> <none>
kube-system metrics-server-86f484f564-dsnb7 1/1 Running 1 (4h1m ago) 47h 10.244.107.9 k8s-docker-node1 <none> <none>
kube-system vpa-admission-controller-b6d8f7-922j5 0/1 ContainerCreating 0 18m <none> k8s-docker-node1 <none> <none>
kube-system vpa-recommender-67c99b46db-dxxf8 1/1 Running 0 18m 10.244.107.18 k8s-docker-node1 <none> <none>
kube-system vpa-updater-7c4dd558df-jtpb7 1/1 Running 0 18m 10.244.107.17 k8s-docker-node1 <none> <none>
tigera-operator tigera-operator-7d5ddd8b6d-4qx7w 1/1 Running 5 (4h1m ago) 2d3h 192.168.234.133 k8s-docker-master <none> <none>
Events for vpa-admission-controller-b6d8f7-922j5:
bash
MountVolume.SetUp failed for volume "tls-certs" : secret "vpa-tls-certs" not found (the pod cannot reach the Running state)
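The event points at a missing secret, which can be checked directly. A sketch with a hypothetical helper name (check_vpa_secret); the namespace and secret name are the ones that appear in the event and in the pod listing above.

```shell
# Sketch: report whether the TLS secret mounted by the admission controller exists.
# check_vpa_secret is a hypothetical helper name.
check_vpa_secret() {
    if kubectl --namespace=kube-system get secret vpa-tls-certs >/dev/null 2>&1; then
        echo "vpa-tls-certs present"
    else
        echo "vpa-tls-certs missing"
        return 1
    fi
}

# Usage:
#   check_vpa_secret
#   kubectl --namespace=kube-system describe pod vpa-admission-controller-b6d8f7-922j5
```

If the secret is missing, the pod stays in ContainerCreating until the secret is created, because the volume mount is retried by the kubelet.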
3. Investigation:
Searching the source for the secret name shows that gencerts.sh creates the certificates, so look there first
bash
[root@k8s-docker-master autoscaler]# grep -rn "vpa-tls-certs" ./*
./vertical-pod-autoscaler/pkg/admission-controller/gencerts.sh:58:kubectl create secret --namespace=kube-system generic vpa-tls-certs --from-file=${TMP_DIR}/caKey.pem --from-file=${TMP_DIR}/caCert.pem --from-file=${TMP_DIR}/serverKey.pem --from-file=${TMP_DIR}/serverCert.pem
./vertical-pod-autoscaler/pkg/admission-controller/rmcerts.sh:23:kubectl delete secret --namespace=kube-system vpa-tls-certs
./vertical-pod-autoscaler/deploy/admission-controller-deployment.yaml:48: secretName: vpa-tls-certs
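Reading the script and searching for addext, as the error message suggests, points to a likely cause: openssl is already given a conf file that contains a subjectAltName attribute, and the command then adds the same attribute again with -addext, so certificate creation fails. Run the command manually and compare the error output.
Manual run and error output: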
bash
[root@k8s-docker-master admission-controller]# openssl req -new -key /tmp/vpa-certs/serverKey.pem -out /tmp/vpa-certs/server.csr -subj "/CN=vpa-webhook.kube-system.svc" -config /tmp/vpa-certs/server.conf -addext "subjectAltName = DNS:vpa-webhook.kube-system.svc"
Error adding request extensions defined via -addext
006E78D50C7F0000:error:0580008C:x509 certificate routines:X509at_add1_attr_by_NID:duplicate attribute:crypto/x509/x509_att.c:194:
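The suspected duplication can also be checked mechanically. This is a sketch, assuming the conf file sets subjectAltName at the start of a line and the script passes it via -addext in the quoted form shown above; has_duplicate_san is a hypothetical helper name.

```shell
# Sketch: succeed (exit 0) when subjectAltName is set both in the openssl
# conf file and via -addext in the script that invokes openssl.
has_duplicate_san() {
    conf_file="$1"     # e.g. /tmp/vpa-certs/server.conf
    script_file="$2"   # e.g. gencerts.sh
    grep -q '^[[:space:]]*subjectAltName' "$conf_file" &&
        grep -q -e '-addext "subjectAltName' "$script_file"
}

# Usage (from pkg/admission-controller/):
#   has_duplicate_san /tmp/vpa-certs/server.conf gencerts.sh &&
#       echo "subjectAltName is defined twice"
```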
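4. Fix:
The error is identical, which confirms that this command is the problem. Remove the -addext option and its value, rerun the script with bash, and vpa-admission-controller runs normally.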
bash
Remove this fragment: -addext "subjectAltName = DNS:vpa-webhook.kube-system.svc"
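The same edit can be made without opening an editor. A sketch, assuming the fragment appears in gencerts.sh exactly as quoted above and that the conf file already supplies subjectAltName; strip_addext is a hypothetical helper name, and the .bak suffix keeps a backup of the original script.

```shell
# Sketch: delete the redundant -addext option in place, keeping a .bak backup.
# strip_addext is a hypothetical helper name.
strip_addext() {
    sed -i.bak 's/ -addext "subjectAltName = DNS:vpa-webhook\.kube-system\.svc"//' "$1"
}

# Usage (from pkg/admission-controller/):
#   strip_addext gencerts.sh && bash gencerts.sh
```

Keeping the backup makes it easy to diff the change or restore the upstream script later.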
bash
[root@k8s-docker-master admission-controller]# vim gencerts.sh
[root@k8s-docker-master admission-controller]# bash gencerts.sh
Generating certs for the VPA Admission Controller in /tmp/vpa-certs.
Certificate request self-signature ok
subject=CN=vpa-webhook.kube-system.svc
Uploading certs to the cluster.
secret/vpa-tls-certs created
Deleting /tmp/vpa-certs.
[root@k8s-docker-master admission-controller]# kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
calico-apiserver calico-apiserver-7bccbcbb5-d7wfl 1/1 Running 2 (4h35m ago) 2d3h
calico-apiserver calico-apiserver-7bccbcbb5-j2g8k 1/1 Running 2 (4h35m ago) 2d3h
calico-system calico-kube-controllers-6bd5f8b7b8-cr8g4 1/1 Running 1 (4h35m ago) 2d4h
calico-system calico-node-cdt57 1/1 Running 1 (4h35m ago) 2d4h
calico-system calico-node-g27dg 1/1 Running 1 (4h35m ago) 2d1h
calico-system calico-typha-f75945f64-hbhnn 1/1 Running 1 (4h35m ago) 2d4h
calico-system csi-node-driver-mgzhp 2/2 Running 2 (4h35m ago) 2d4h
calico-system csi-node-driver-mknl4 2/2 Running 2 (4h35m ago) 2d1h
kube-system coredns-857d9ff4c9-8cl4p 1/1 Running 1 (4h35m ago) 2d5h
kube-system coredns-857d9ff4c9-nt2vf 1/1 Running 1 (4h35m ago) 2d5h
kube-system etcd-k8s-docker-master 1/1 Running 2 (4h35m ago) 2d5h
kube-system kube-apiserver-k8s-docker-master 1/1 Running 2 (4h35m ago) 2d5h
kube-system kube-controller-manager-k8s-docker-master 1/1 Running 4 (4h35m ago) 2d5h
kube-system kube-proxy-44cf5 1/1 Running 1 (4h35m ago) 2d1h
kube-system kube-proxy-7znvm 1/1 Running 1 (4h35m ago) 2d5h
kube-system kube-scheduler-k8s-docker-master 1/1 Running 5 (4h35m ago) 2d5h
kube-system metrics-server-86f484f564-dsnb7 1/1 Running 1 (4h35m ago) 2d
kube-system vpa-admission-controller-b6d8f7-scwnm 1/1 Running 0 4s
kube-system vpa-recommender-67c99b46db-dxxf8 1/1 Running 0 52m
kube-system vpa-updater-7c4dd558df-jtpb7 1/1 Running 0 52m