学习记录---Kubernetes的资源指标管道-metrics api的安装部署

一、简介

Metrics API,为我们的k8s集群提供了一组基本的指标(资源的cpu和内存),我们可以通过metrics api来对我们的pod开展HPA和VPA操作(主要通过在pod中对cpu和内存的限制实现动态扩展),也可以通过kubectl top的方式,获取k8s中node和pod的cpu及内存使用情况。

node资源分配情况和使用情况:

pod资源分配情况和使用情况:

二、使用和部署

从kubernetes的官网上来看,k8s初始化后默认是没有提供metrics api服务的,如果需要使用该服务,则需要部署metrics-server。

针对metrics-server的部署方式,我们可以直接从官网的超链接中获取,而该项目则是在github的kubernetes项目中:

复制代码
https://github.com/kubernetes-sigs/metrics-server/tree/master

在该项目中有比较明确的安装步骤,该项目安装比较简单,直接用其yaml进行apply即可(这里我们选择使用高可用模式,高可用模式其实就是多使用了几个replicas):

复制代码
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/high-availability-1.21+.yaml

我们先把yaml下载下来看看:

复制代码
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-view: "true"
  name: system:aggregated-metrics-reader
rules:
- apiGroups:
  - metrics.k8s.io
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
rules:
- apiGroups:
  - ""
  resources:
  - nodes/metrics
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    k8s-app: metrics-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  replicas: 2
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 1
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                k8s-app: metrics-server
            namespaces:
            - kube-system
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        #新增,使其不验证k8s提供的ca证书
        - --kubelet-insecure-tls
        #修改,修改为我们可以下载的镜像,如果有私有仓库也可以用私有仓库的地址
        #image: registry.k8s.io/metrics-server/metrics-server:v0.6.4
        image: bitnami/metrics-server:0.6.4
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /livez
            port: https
            scheme: HTTPS
          periodSeconds: 10
        name: metrics-server
        ports:
        - containerPort: 4443
          name: https
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /readyz
            port: https
            scheme: HTTPS
          initialDelaySeconds: 20
          periodSeconds: 10
        resources:
          requests:
            cpu: 100m
            memory: 200Mi
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
        volumeMounts:
        - mountPath: /tmp
          name: tmp-dir
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-cluster-critical
      serviceAccountName: metrics-server
      volumes:
      - emptyDir: {}
        name: tmp-dir
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: metrics-server
  namespace: kube-system
spec:
  minAvailable: 1
  selector:
    matchLabels:
      k8s-app: metrics-server
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100

这里,我们修改了两个地方:

复制代码
1) 在deployment中的container的arg参数中加入:- --kubelet-insecure-tls
2) 修改镜像资源

修改镜像资源主要是方便我们可以顺利pull镜像;

加入kubelet-insecure-tls的主要目的是metrics-server不验证k8s提供的ca证书,如果不加该参数,则可能会导致出现以下问题:

pod状态:

查看描述,会出现Readiness探针异常(Readiness probe failed:Http probe failed with statuscode: 500)

查看日志,出现以下错误:

"Failed to scrape node" err="Get "https://xxx.xxx.xxx.xxx:10250/metrics/resource": tls: failed to verify certificate: x509: cannot validate certificate for xxx.xxx.xxx.xxx because it doesn't contain any IP SANs" node="k8s-slave3"

正常状态:

相关推荐
xy_recording15 分钟前
Day08 Go语言学习
开发语言·学习·golang
黑客影儿1 小时前
黑客哲学之学习笔记系列(三)
笔记·学习·程序人生·安全·职场和发展·网络攻击模型·学习方法
一个天蝎座 白勺 程序猿4 小时前
Apache IoTDB(4):深度解析时序数据库 IoTDB 在Kubernetes 集群中的部署与实践指南
数据库·深度学习·kubernetes·apache·时序数据库·iotdb
xiao-xiang5 小时前
redis-集成prometheus监控(k8s)
数据库·redis·kubernetes·k8s·grafana·prometheus
MANONGMN12 小时前
Kubernetes(K8s)常用命令全解析:从基础到进阶
云原生·容器·kubernetes
风已经起了12 小时前
FPGA学习笔记——IIC协议简介
笔记·学习·fpga开发
lingggggaaaa12 小时前
小迪安全v2023学习笔记(六十二讲)—— PHP框架反序列化
笔记·学习·安全·web安全·网络安全·php·反序列化
Johny_Zhao13 小时前
基于 Docker 的 LLaMA-Factory 全流程部署指南
linux·网络·网络安全·信息安全·kubernetes·云计算·containerd·yum源·系统运维·llama-factory
我们从未走散13 小时前
JVM学习笔记-----StringTable
jvm·笔记·学习
胡萝卜3.014 小时前
数据结构初阶:排序算法(一)插入排序、选择排序
数据结构·笔记·学习·算法·排序算法·学习方法