n9e categraf k8s monitoring configuration - kube-state-metrics

1. Create the namespace, role, role binding, and kube-state-metrics container in the k8s cluster

Download the image archive kube-state-metrics-v2.5.0.tar, then load it and create the namespace:

docker load -i kube-state-metrics-v2.5.0.tar
docker images |grep kube-state-metrics
kubectl create namespace z-monitor
vi kube-state-metrics.yaml

File contents:

# Created by: ember.zhang
# Why not ask the Magic Conch?
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-state-metrics
  namespace: z-monitor  # namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kube-state-metrics
rules:
  - apiGroups: [""]
    resources:
      - configmaps
      - secrets
      - nodes
      - pods
      - services
      - resourcequotas
      - replicationcontrollers
      - limitranges
      - persistentvolumeclaims
      - persistentvolumes
      - namespaces
      - endpoints
    verbs: ["list", "watch"]
  - apiGroups: ["apps"]
    resources:
      - statefulsets
      - daemonsets
      - deployments
      - replicasets
    verbs: ["list", "watch"]
  - apiGroups: ["batch"]
    resources:
      - cronjobs
      - jobs
    verbs: ["list", "watch"]
  - apiGroups: ["autoscaling"]
    resources:
      - horizontalpodautoscalers
    verbs: ["list", "watch"]
  - apiGroups: ["policy"]
    resources:
      - poddisruptionbudgets
    verbs: ["list", "watch"]
  - apiGroups: ["certificates.k8s.io"]
    resources:
      - certificatesigningrequests
    verbs: ["list", "watch"]
  - apiGroups: ["storage.k8s.io"]
    resources:
      - storageclasses
      - volumeattachments
    verbs: ["list", "watch"]
  - apiGroups: ["apiextensions.k8s.io"]
    resources:
      - customresourcedefinitions
    verbs: ["list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kube-state-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube-state-metrics
subjects:
  - kind: ServiceAccount
    name: kube-state-metrics
    namespace: z-monitor  # namespace
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-state-metrics
  namespace: z-monitor  # namespace
  labels:
    app: kube-state-metrics
spec:
  replicas: 1  # replica count
  selector:
    matchLabels:
      app: kube-state-metrics
  template:
    metadata:
      labels:
        app: kube-state-metrics
    spec:
      serviceAccountName: kube-state-metrics
      containers:
        - name: kube-state-metrics
          image: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.5.0  # image version
          imagePullPolicy: IfNotPresent  # use the local image if present; pull only when it is missing
          ports:
            - name: http-metrics
              containerPort: 8080
            - name: telemetry
              containerPort: 8081
          resources:
            # resource limits and requests
            limits:
              cpu: 200m
              memory: 256Mi
            requests:
              cpu: 100m
              memory: 128Mi
          readinessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 5
            timeoutSeconds: 5
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 15
            timeoutSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: kube-state-metrics
  namespace: z-monitor  # namespace
  labels:
    app: kube-state-metrics
spec:
  type: NodePort  
  ports:
    - name: http-metrics
      port: 8080
      targetPort: 8080
      nodePort: 30880
      protocol: TCP
    - name: telemetry
      port: 8081
      targetPort: 8081
      nodePort: 30881
      protocol: TCP
  selector:
    app: kube-state-metrics

Apply the manifest (use the same file name that was edited above):

kubectl apply -f kube-state-metrics.yaml
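
Before pointing categraf at the NodePort, it can help to confirm that the Deployment rolled out and that the endpoint answers. The following is a minimal sketch, not part of the original procedure: `ksm_url` and `check_ksm` are hypothetical helper names, and the node IP is a placeholder you must supply yourself.

```shell
#!/usr/bin/env bash
# Sketch: verify the kube-state-metrics rollout and its NodePort endpoint.
# Helper names are hypothetical; the node IP argument is a placeholder.

NAMESPACE="z-monitor"
NODE_PORT="30880"

# Build the metrics URL that categraf will later scrape.
ksm_url() {
  printf 'http://%s:%s/metrics\n' "$1" "${2:-$NODE_PORT}"
}

# Wait for the Deployment, list its pods, then print a few kube_* samples.
check_ksm() {
  local node_ip="$1" metrics
  kubectl -n "$NAMESPACE" rollout status deployment/kube-state-metrics --timeout=120s || return 1
  kubectl -n "$NAMESPACE" get pods -l app=kube-state-metrics -o wide || return 1
  metrics="$(curl -fsS "$(ksm_url "$node_ip")")" || return 1
  grep -m 5 '^kube_' <<<"$metrics"
}

# Usage: check_ksm <node-ip>
```

If the exporter is healthy, `check_ksm <node-ip>` should end with a few lines that begin with `kube_`.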

2. Add a configuration file under categraf/conf/input.prometheus/

vi cim-prod.toml

File contents:

interval = 15

[[instances]]
urls = [
    "http://10.12.8.xx:30880/metrics"
]

url_label_key = "instance"
url_label_value = "{{.Host}}"


interval_times = 1

labels = { cluster = "k8s-cim-prod" }


ignore_metrics = [ "go_*", "process_*", "http_*" ]

ignore_label_keys = []


timeout = "10s"
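
To get a feel for what `ignore_metrics` leaves behind, the sketch below filters metric names from Prometheus exposition text using the same three patterns. It assumes the patterns behave like shell globs, which may not match categraf's matcher in every edge case; `is_ignored` and `kept_metric_names` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch: preview which metric names survive the ignore_metrics patterns.
# Assumption: patterns act like shell globs; categraf's matcher may differ.

IGNORE_PATTERNS=("go_*" "process_*" "http_*")

# Return 0 when the metric name matches any ignore pattern.
is_ignored() {
  local name="$1" pattern
  for pattern in "${IGNORE_PATTERNS[@]}"; do
    # An unquoted right-hand side of == is matched as a glob.
    if [[ "$name" == $pattern ]]; then
      return 0
    fi
  done
  return 1
}

# Read exposition-format text on stdin; print the unique names that are kept.
kept_metric_names() {
  local line name
  while IFS= read -r line; do
    # Skip blank lines and comments such as "# HELP" and "# TYPE".
    if [[ -z "$line" || "$line" == \#* ]]; then
      continue
    fi
    # The metric name ends at the first '{' or space.
    name="${line%%[{ ]*}"
    is_ignored "$name" || printf '%s\n' "$name"
  done | sort -u
}
```

Piping `curl -fsS http://<node-ip>:30880/metrics` into `kept_metric_names` lists the metric names that would still be collected under that assumption.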

Restart categraf

service  categraf restart