K8S集群监控方案之Prometheus+kube-state-metrics+Grafana

序言 | Prometheus 中文文档

方案简单架构图

一、部署kube-state-metrics

1、部署文件下载

地址

kube-state-metrics/examples/standard at main · kubernetes/kube-state-metrics · GitHub

2、修改下载的文件

2.1、修改镜像

原镜像可能下载不了,这里修改deployment.yaml镜像:bitnami/kube-state-metrics:latest

2.2、修改service

这里修改为NodePort类型,可按自己需求更改

3、本案例下载更改好后的部署文件

cluster-role.yaml

bash 复制代码
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 2.12.0
  name: kube-state-metrics
rules:
- apiGroups:
  - ""
  resources:
  - configmaps
  - secrets
  - nodes
  - pods
  - services
  - serviceaccounts
  - resourcequotas
  - replicationcontrollers
  - limitranges
  - persistentvolumeclaims
  - persistentvolumes
  - namespaces
  - endpoints
  verbs:
  - list
  - watch
- apiGroups:
  - apps
  resources:
  - statefulsets
  - daemonsets
  - deployments
  - replicasets
  verbs:
  - list
  - watch
- apiGroups:
  - batch
  resources:
  - cronjobs
  - jobs
  verbs:
  - list
  - watch
- apiGroups:
  - autoscaling
  resources:
  - horizontalpodautoscalers
  verbs:
  - list
  - watch
- apiGroups:
  - authentication.k8s.io
  resources:
  - tokenreviews
  verbs:
  - create
- apiGroups:
  - authorization.k8s.io
  resources:
  - subjectaccessreviews
  verbs:
  - create
- apiGroups:
  - policy
  resources:
  - poddisruptionbudgets
  verbs:
  - list
  - watch
- apiGroups:
  - certificates.k8s.io
  resources:
  - certificatesigningrequests
  verbs:
  - list
  - watch
- apiGroups:
  - discovery.k8s.io
  resources:
  - endpointslices
  verbs:
  - list
  - watch
- apiGroups:
  - storage.k8s.io
  resources:
  - storageclasses
  - volumeattachments
  verbs:
  - list
  - watch
- apiGroups:
  - admissionregistration.k8s.io
  resources:
  - mutatingwebhookconfigurations
  - validatingwebhookconfigurations
  verbs:
  - list
  - watch
- apiGroups:
  - networking.k8s.io
  resources:
  - networkpolicies
  - ingressclasses
  - ingresses
  verbs:
  - list
  - watch
- apiGroups:
  - coordination.k8s.io
  resources:
  - leases
  verbs:
  - list
  - watch
- apiGroups:
  - rbac.authorization.k8s.io
  resources:
  - clusterrolebindings
  - clusterroles
  - rolebindings
  - roles
  verbs:
  - list
  - watch

cluster-role-binding.yaml

bash 复制代码
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 2.12.0
  name: kube-state-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube-state-metrics
subjects:
- kind: ServiceAccount
  name: kube-state-metrics
  namespace: kube-system

deployment.yaml

bash 复制代码
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 2.12.0
  name: kube-state-metrics
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: kube-state-metrics
  template:
    metadata:
      labels:
        app.kubernetes.io/component: exporter
        app.kubernetes.io/name: kube-state-metrics
        app.kubernetes.io/version: 2.12.0
    spec:
      automountServiceAccountToken: true
      containers:
      - image: bitnami/kube-state-metrics:latest
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 5
          timeoutSeconds: 5
        name: kube-state-metrics
        ports:
        - containerPort: 8080
          name: http-metrics
        - containerPort: 8081
          name: telemetry
        readinessProbe:
          httpGet:
            path: /
            port: 8081
          initialDelaySeconds: 5
          timeoutSeconds: 5
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 65534
          seccompProfile:
            type: RuntimeDefault
      nodeSelector:
        kubernetes.io/os: linux
      serviceAccountName: kube-state-metrics

service-account.yaml

bash 复制代码
apiVersion: v1
automountServiceAccountToken: false
kind: ServiceAccount
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 2.12.0
  name: kube-state-metrics
  namespace: kube-system

service.yaml

bash 复制代码
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 2.12.0
  name: kube-state-metrics
  namespace: kube-system
spec:
  type: NodePort
  ports:
  - name: http-metrics
    port: 8080
    targetPort: 8080
    nodePort: 31666
  - name: telemetry
    port: 8081
    targetPort: 8081
  selector:
    app.kubernetes.io/name: kube-state-metrics

4、部署kube-state-metrics

[K8S@k8s-master kube-state-metrics]$ ll

总用量 20

-rw-rw-r-- 1 K8S K8S 418 5月 17 16:11 cluster-role-binding.yaml

-rw-rw-r-- 1 K8S K8S 1950 5月 17 16:11 cluster-role.yaml

-rw-rw-r-- 1 K8S K8S 1471 5月 17 16:11 deployment.yaml

-rw-rw-r-- 1 K8S K8S 270 5月 17 16:11 service-account.yaml

-rw-rw-r-- 1 K8S K8S 453 5月 17 16:11 service.yaml

[K8S@k8s-master kube-state-metrics]$ pwd

/home/K8S/k8s-project/monitor/kube-state-metrics

[K8S@k8s-master kube-state-metrics]$ kubectl apply -f ./

查看部署结果:

[K8S@k8s-master kube-state-metrics]$ kubectl get all -n kube-system

Running状态表示部署成功

访问Service暴露的31666端口,能成功返回

二、部署Prometheus

1、创建名称空间

[K8S@k8s-master prometheus]$ kubectl create ns monitoring

2、编写配置文件 prometheus.yml

bash 复制代码
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: monitoring
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: monitoring
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s      # 数据采集时间间隔
      scrape_timeout: 10s       # 数据采集超时时间
      evaluation_interval: 1m
    scrape_configs:
      - job_name: "prometheus"
        static_configs:
          - targets: ["localhost:9090"]
      - job_name: "k8s-info"
        static_configs:
          - targets: ["172.19.3.240:31666"]  #配置kube-state-metrice的数据源地址

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      serviceAccountName: prometheus
      containers:
      - name: prometheus
        image: prom/prometheus:v2.31.1
        args:
          - --config.file=/etc/prometheus/prometheus.yml
          - --storage.tsdb.path=/prometheus
        ports:
        - containerPort: 9090
        volumeMounts:
        - name: config-volume
          mountPath: /etc/prometheus
        - name: storage-volume
          mountPath: /prometheus
      volumes:
      - name: config-volume
        configMap:
          name: prometheus-config
          items:
          - key: prometheus.yml
            path: prometheus.yml
      - name: storage-volume
        emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus-service
  namespace: monitoring
spec:
  type: NodePort
  ports:
  - name: http
    port: 9090
    targetPort: 9090
    nodePort: 30007
  selector:
    app: prometheus

创建资源

[K8S@k8s-master prometheus]$ kubectl apply -f prometheus.yml

查看状态为Running,表示正常启动

访问Service暴露的30007端口 http://172.19.3.240:30007/targets

成功读取到kube-state-metrics提供的数据

prometheus部署成功

三、部署Grafana

官网安装参考

Deploy Grafana on Kubernetes | Grafana documentation

1、编写grafana.yaml

bash 复制代码
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-pvc
  namespace: monitoring
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: managed-nfs-devops-storage
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: grafana
  name: grafana
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      securityContext:
        fsGroup: 472
        supplementalGroups:
          - 0
      containers:
        - name: grafana
          image: grafana/grafana:latest
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 3000
              name: http-grafana
              protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /robots.txt
              port: 3000
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 30
            successThreshold: 1
            timeoutSeconds: 2
          livenessProbe:
            failureThreshold: 3
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            tcpSocket:
              port: 3000
            timeoutSeconds: 1
          resources:
            requests:
              cpu: 250m
              memory: 750Mi
          volumeMounts:
            - mountPath: /var/lib/grafana
              name: grafana-pv
      volumes:
        - name: grafana-pv
          persistentVolumeClaim:
            claimName: grafana-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: monitoring
spec:
  ports:
    - port: 3000
      protocol: TCP
      targetPort: http-grafana
      nodePort: 30008
  selector:
    app: grafana
  sessionAffinity: None
  type: NodePort

2、执行命令创建资源

[K8S@k8s-master monitor]$ kubectl apply -f grafana.yaml

persistentvolumeclaim/grafana-pvc created

deployment.apps/grafana created

service/grafana created

查看资源,Running,安装成功

3、访问暴露30008端口页面

默认登录账号密码 admin/admin,查看页面访问成功

四、Grafana通过页面配置k8s集群资源展示

1、配置数据源

步骤1

步骤2

步骤3

步骤4,配置Prometheus数据源

拉到最下边,点击保存测试

2、配置展示模板

如图操作进入import页面

访问官网模板页面 Dashboards | Grafana Labs

按如下图搜索点击

进入页面点击拷贝id,或者点下边下载模板也可以

回到我们的grafana import页面,粘贴ID或者导入下载的json

自定义修改数据点击import

查看资源模板页面,配置成功

相关推荐
景天科技苑9 小时前
【云原生开发】K8S多集群资源管理平台架构设计
云原生·容器·kubernetes·k8s·云原生开发·k8s管理系统
wclass-zhengge10 小时前
K8S篇(基本介绍)
云原生·容器·kubernetes
颜淡慕潇10 小时前
【K8S问题系列 |1 】Kubernetes 中 NodePort 类型的 Service 无法访问【已解决】
后端·云原生·容器·kubernetes·问题解决
川石课堂软件测试12 小时前
性能测试|docker容器下搭建JMeter+Grafana+Influxdb监控可视化平台
运维·javascript·深度学习·jmeter·docker·容器·grafana
昌sit!18 小时前
K8S node节点没有相应的pod镜像运行故障处理办法
云原生·容器·kubernetes
A ?Charis21 小时前
Gitlab-runner running on Kubernetes - hostAliases
容器·kubernetes·gitlab
北漂IT民工_程序员_ZG1 天前
k8s集群安装(minikube)
云原生·容器·kubernetes
2301_806131361 天前
Kubernetes的基本构建块和最小可调度单元pod-0
云原生·容器·kubernetes
逻辑与&&1 天前
[Prometheus学习笔记]从架构到案例,一站式教程
笔记·学习·prometheus
SilentCodeY1 天前
containerd配置私有仓库registry
容器·kubernetes·containerd·镜像·crictl