K8S集群监控方案之Prometheus+kube-state-metrics+Grafana

序言 | Prometheus 中文文档

方案简单架构图

一、部署kube-state-metrics

1、部署文件下载

地址

kube-state-metrics/examples/standard at main · kubernetes/kube-state-metrics · GitHub

2、修改下载的文件

2.1、修改镜像

原镜像可能下载不了,这里修改deployment.yaml镜像:bitnami/kube-state-metrics:latest

2.2、修改service

这里修改为NodePort类型,可按自己需求更改

3、本案例下载更改好后的部署文件

cluster-role.yaml

bash 复制代码
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 2.12.0
  name: kube-state-metrics
rules:
- apiGroups:
  - ""
  resources:
  - configmaps
  - secrets
  - nodes
  - pods
  - services
  - serviceaccounts
  - resourcequotas
  - replicationcontrollers
  - limitranges
  - persistentvolumeclaims
  - persistentvolumes
  - namespaces
  - endpoints
  verbs:
  - list
  - watch
- apiGroups:
  - apps
  resources:
  - statefulsets
  - daemonsets
  - deployments
  - replicasets
  verbs:
  - list
  - watch
- apiGroups:
  - batch
  resources:
  - cronjobs
  - jobs
  verbs:
  - list
  - watch
- apiGroups:
  - autoscaling
  resources:
  - horizontalpodautoscalers
  verbs:
  - list
  - watch
- apiGroups:
  - authentication.k8s.io
  resources:
  - tokenreviews
  verbs:
  - create
- apiGroups:
  - authorization.k8s.io
  resources:
  - subjectaccessreviews
  verbs:
  - create
- apiGroups:
  - policy
  resources:
  - poddisruptionbudgets
  verbs:
  - list
  - watch
- apiGroups:
  - certificates.k8s.io
  resources:
  - certificatesigningrequests
  verbs:
  - list
  - watch
- apiGroups:
  - discovery.k8s.io
  resources:
  - endpointslices
  verbs:
  - list
  - watch
- apiGroups:
  - storage.k8s.io
  resources:
  - storageclasses
  - volumeattachments
  verbs:
  - list
  - watch
- apiGroups:
  - admissionregistration.k8s.io
  resources:
  - mutatingwebhookconfigurations
  - validatingwebhookconfigurations
  verbs:
  - list
  - watch
- apiGroups:
  - networking.k8s.io
  resources:
  - networkpolicies
  - ingressclasses
  - ingresses
  verbs:
  - list
  - watch
- apiGroups:
  - coordination.k8s.io
  resources:
  - leases
  verbs:
  - list
  - watch
- apiGroups:
  - rbac.authorization.k8s.io
  resources:
  - clusterrolebindings
  - clusterroles
  - rolebindings
  - roles
  verbs:
  - list
  - watch

cluster-role-binding.yaml

bash 复制代码
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 2.12.0
  name: kube-state-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube-state-metrics
subjects:
- kind: ServiceAccount
  name: kube-state-metrics
  namespace: kube-system

deployment.yaml

bash 复制代码
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 2.12.0
  name: kube-state-metrics
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: kube-state-metrics
  template:
    metadata:
      labels:
        app.kubernetes.io/component: exporter
        app.kubernetes.io/name: kube-state-metrics
        app.kubernetes.io/version: 2.12.0
    spec:
      automountServiceAccountToken: true
      containers:
      - image: bitnami/kube-state-metrics:latest
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 5
          timeoutSeconds: 5
        name: kube-state-metrics
        ports:
        - containerPort: 8080
          name: http-metrics
        - containerPort: 8081
          name: telemetry
        readinessProbe:
          httpGet:
            path: /
            port: 8081
          initialDelaySeconds: 5
          timeoutSeconds: 5
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 65534
          seccompProfile:
            type: RuntimeDefault
      nodeSelector:
        kubernetes.io/os: linux
      serviceAccountName: kube-state-metrics

service-account.yaml

bash 复制代码
apiVersion: v1
automountServiceAccountToken: false
kind: ServiceAccount
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 2.12.0
  name: kube-state-metrics
  namespace: kube-system

service.yaml

bash 复制代码
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 2.12.0
  name: kube-state-metrics
  namespace: kube-system
spec:
  type: NodePort
  ports:
  - name: http-metrics
    port: 8080
    targetPort: 8080
    nodePort: 31666
  - name: telemetry
    port: 8081
    targetPort: 8081
  selector:
    app.kubernetes.io/name: kube-state-metrics

4、部署kube-state-metrics

K8S@k8s-master kube-state-metrics\]$ ll 总用量 20 -rw-rw-r-- 1 K8S K8S 418 5月 17 16:11 cluster-role-binding.yaml -rw-rw-r-- 1 K8S K8S 1950 5月 17 16:11 cluster-role.yaml -rw-rw-r-- 1 K8S K8S 1471 5月 17 16:11 deployment.yaml -rw-rw-r-- 1 K8S K8S 270 5月 17 16:11 service-account.yaml -rw-rw-r-- 1 K8S K8S 453 5月 17 16:11 service.yaml \[K8S@k8s-master kube-state-metrics\]$ pwd /home/K8S/k8s-project/monitor/kube-state-metrics \[K8S@k8s-master kube-state-metrics\]$ kubectl apply -f ./

查看部署结果:

K8S@k8s-master kube-state-metrics\]$ kubectl get all -n kube-system

Running状态表示部署成功

访问Service暴露的31666端口,能成功返回

二、部署Prometheus

1、创建名称空间

K8S@k8s-master prometheus\]$ kubectl create ns monitoring

2、编写配置文件 prometheus.yml

bash 复制代码
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: monitoring
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: monitoring
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s      # 数据采集时间间隔
      scrape_timeout: 10s       # 数据采集超时时间
      evaluation_interval: 1m
    scrape_configs:
      - job_name: "prometheus"
        static_configs:
          - targets: ["localhost:9090"]
      - job_name: "k8s-info"
        static_configs:
          - targets: ["172.19.3.240:31666"]  #配置kube-state-metrice的数据源地址

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      serviceAccountName: prometheus
      containers:
      - name: prometheus
        image: prom/prometheus:v2.31.1
        args:
          - --config.file=/etc/prometheus/prometheus.yml
          - --storage.tsdb.path=/prometheus
        ports:
        - containerPort: 9090
        volumeMounts:
        - name: config-volume
          mountPath: /etc/prometheus
        - name: storage-volume
          mountPath: /prometheus
      volumes:
      - name: config-volume
        configMap:
          name: prometheus-config
          items:
          - key: prometheus.yml
            path: prometheus.yml
      - name: storage-volume
        emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus-service
  namespace: monitoring
spec:
  type: NodePort
  ports:
  - name: http
    port: 9090
    targetPort: 9090
    nodePort: 30007
  selector:
    app: prometheus

创建资源

K8S@k8s-master prometheus\]$ kubectl apply -f prometheus.yml

查看状态为Running,表示正常启动

访问Service暴露的30007端口 http://172.19.3.240:30007/targets

成功读取到kube-state-metrics提供的数据

prometheus部署成功

三、部署Grafana

官网安装参考

Deploy Grafana on Kubernetes | Grafana documentation

1、编写grafana.yaml

bash 复制代码
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-pvc
  namespace: monitoring
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: managed-nfs-devops-storage
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: grafana
  name: grafana
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      securityContext:
        fsGroup: 472
        supplementalGroups:
          - 0
      containers:
        - name: grafana
          image: grafana/grafana:latest
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 3000
              name: http-grafana
              protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /robots.txt
              port: 3000
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 30
            successThreshold: 1
            timeoutSeconds: 2
          livenessProbe:
            failureThreshold: 3
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            tcpSocket:
              port: 3000
            timeoutSeconds: 1
          resources:
            requests:
              cpu: 250m
              memory: 750Mi
          volumeMounts:
            - mountPath: /var/lib/grafana
              name: grafana-pv
      volumes:
        - name: grafana-pv
          persistentVolumeClaim:
            claimName: grafana-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: monitoring
spec:
  ports:
    - port: 3000
      protocol: TCP
      targetPort: http-grafana
      nodePort: 30008
  selector:
    app: grafana
  sessionAffinity: None
  type: NodePort

2、执行命令创建资源

K8S@k8s-master monitor\]$ kubectl apply -f grafana.yaml persistentvolumeclaim/grafana-pvc created deployment.apps/grafana created service/grafana created

查看资源,Running,安装成功

3、访问暴露30008端口页面

默认登录账号密码 admin/admin,查看页面访问成功

四、Grafana通过页面配置k8s集群资源展示

1、配置数据源

步骤1

步骤2

步骤3

步骤4,配置Prometheus数据源

拉到最下边,点击保存测试

2、配置展示模板

如图操作进入import页面

访问官网模板页面 Dashboards | Grafana Labs

按如下图搜索点击

进入页面点击拷贝id,或者点下边下载模板也可以

回到我们的grafana import页面,粘贴ID或者导入下载的json

自定义修改数据点击import

查看资源模板页面,配置成功

相关推荐
荣光波比17 小时前
K8S(一)—— 云原生与Kubernetes(K8S)从入门到实践:基础概念与操作全解析
云原生·容器·kubernetes
hello_25018 小时前
k8s基础监控promql
云原生·容器·kubernetes
静谧之心20 小时前
在 K8s 上可靠运行 PD 分离推理:RBG 的设计与实现
云原生·容器·golang·kubernetes·开源·pd分离
1024find1 天前
Spark on k8s部署
大数据·运维·容器·spark·kubernetes
能不能别报错2 天前
K8s学习笔记(十六) 探针(Probe)
笔记·学习·kubernetes
能不能别报错2 天前
K8s学习笔记(十四) DaemonSet
笔记·学习·kubernetes
火星MARK2 天前
k8s面试题
容器·面试·kubernetes
赵渝强老师2 天前
【赵渝强老师】Docker容器的资源管理机制
linux·docker·容器·kubernetes
能不能别报错2 天前
K8s学习笔记(十五) pause容器与init容器
笔记·学习·kubernetes
稚辉君.MCA_P8_Java2 天前
kafka解决了什么问题?mmap 和sendfile
java·spring boot·分布式·kafka·kubernetes