n9e categraf k8s monitoring configuration - kube-state-metrics

1. Create the namespace, role, role binding, and kube-state-metrics container in the k8s cluster

Download the image archive kube-state-metrics-v2.5.0.tar, then load it and create the namespace:

docker load -i kube-state-metrics-v2.5.0.tar
docker images |grep kube-state-metrics
kubectl create namespace z-monitor
vi kube-state-metrics.yaml

Contents of kube-state-metrics.yaml:

# Created by: ember.zhang
# Why not ask the Magic Conch?
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-state-metrics
  namespace: z-monitor  # namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kube-state-metrics
rules:
  - apiGroups: [""]
    resources:
      - configmaps
      - secrets
      - nodes
      - pods
      - services
      - resourcequotas
      - replicationcontrollers
      - limitranges
      - persistentvolumeclaims
      - persistentvolumes
      - namespaces
      - endpoints
    verbs: ["list", "watch"]
  - apiGroups: ["apps"]
    resources:
      - statefulsets
      - daemonsets
      - deployments
      - replicasets
    verbs: ["list", "watch"]
  - apiGroups: ["batch"]
    resources:
      - cronjobs
      - jobs
    verbs: ["list", "watch"]
  - apiGroups: ["autoscaling"]
    resources:
      - horizontalpodautoscalers
    verbs: ["list", "watch"]
  - apiGroups: ["policy"]
    resources:
      - poddisruptionbudgets
    verbs: ["list", "watch"]
  - apiGroups: ["certificates.k8s.io"]
    resources:
      - certificatesigningrequests
    verbs: ["list", "watch"]
  - apiGroups: ["storage.k8s.io"]
    resources:
      - storageclasses
      - volumeattachments
    verbs: ["list", "watch"]
  - apiGroups: ["apiextensions.k8s.io"]
    resources:
      - customresourcedefinitions
    verbs: ["list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kube-state-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube-state-metrics
subjects:
  - kind: ServiceAccount
    name: kube-state-metrics
    namespace: z-monitor  # namespace
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-state-metrics
  namespace: z-monitor  # namespace
  labels:
    app: kube-state-metrics
spec:
  replicas: 1  # replica count
  selector:
    matchLabels:
      app: kube-state-metrics
  template:
    metadata:
      labels:
        app: kube-state-metrics
    spec:
      serviceAccountName: kube-state-metrics
      containers:
        - name: kube-state-metrics
          image: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.5.0  # image version
          imagePullPolicy: IfNotPresent  # use the locally loaded image if present; pull only when it is missing
          ports:
            - name: http-metrics
              containerPort: 8080
            - name: telemetry
              containerPort: 8081
          resources:
            # resource limits and requests
            limits:
              cpu: 200m
              memory: 256Mi
            requests:
              cpu: 100m
              memory: 128Mi
          readinessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 5
            timeoutSeconds: 5
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 15
            timeoutSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: kube-state-metrics
  namespace: z-monitor  # namespace
  labels:
    app: kube-state-metrics
spec:
  type: NodePort  
  ports:
    - name: http-metrics
      port: 8080
      targetPort: 8080
      nodePort: 30880
      protocol: TCP
    - name: telemetry
      port: 8081
      targetPort: 8081
      nodePort: 30881
      protocol: TCP
  selector:
    app: kube-state-metrics
Apply the manifest (the file created above is kube-state-metrics.yaml):

kubectl apply -f kube-state-metrics.yaml
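
Before moving on to categraf, it may be worth confirming that the Deployment actually became ready. The following is a minimal sketch, assuming kubectl is configured for the same cluster. The function name check_ksm is hypothetical; the namespace, Deployment name, and label are taken from the manifest.

```shell
# check_ksm: hypothetical helper, not part of kube-state-metrics or categraf.
# Waits up to 120s for the Deployment to roll out, then lists its pods.
# The optional first argument overrides the namespace (default: z-monitor).
check_ksm() {
  ns="${1:-z-monitor}"
  kubectl -n "$ns" rollout status deployment/kube-state-metrics --timeout=120s &&
    kubectl -n "$ns" get pods -l app=kube-state-metrics -o wide
}
```

Calling check_ksm with no arguments checks the z-monitor namespace. Once the pod is Running, the metrics endpoint should answer on NodePort 30880 of any cluster node.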

2. Add a configuration file under categraf/conf/input.prometheus/

vi cim-prod.toml

Contents of cim-prod.toml:

interval = 15

[[instances]]
urls = [
    "http://10.12.8.xx:30880/metrics"
]

url_label_key = "instance"
url_label_value = "{{.Host}}"


interval_times = 1

labels = { cluster = "k8s-cim-prod" }


ignore_metrics = [ "go_*", "process_*", "http_*" ]

ignore_label_keys = []


timeout = "10s"
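
To get a rough idea of which series remain after ignore_metrics is applied, the scrape output can be filtered by hand. This is only an approximation: it assumes the patterns behave as name-prefix globs and mimics them with a regular expression rather than reusing categraf's own matching. The function name preview_kept_metrics is hypothetical.

```shell
# preview_kept_metrics: hypothetical helper for eyeballing a scrape.
# Reads Prometheus exposition text on stdin, drops comment lines (# HELP / # TYPE),
# and drops samples whose names start with go_, process_, or http_,
# mirroring the ignore_metrics list in cim-prod.toml.
preview_kept_metrics() {
  grep -v '^#' | grep -Ev '^(go_|process_|http_)'
}
```

For example, `curl -s http://10.12.8.xx:30880/metrics | preview_kept_metrics | head` (with the node address from the urls entry filled in) should show only kube_* samples.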

Restart categraf:

service categraf restart
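
If the new metrics do not show up in n9e, categraf's test mode can print what the prometheus input collects without sending anything to the backend. A minimal sketch, assuming it is run from the categraf installation directory so that the binary and conf/ are found; the function name and the CATEGRAF_BIN variable are hypothetical.

```shell
# test_prometheus_input: hypothetical wrapper around categraf's test mode.
# CATEGRAF_BIN is an assumed override for the binary path (default: ./categraf).
# --test prints collected metrics to stdout; --inputs limits the run to the
# prometheus input plugin. Stop it with Ctrl-C.
test_prometheus_input() {
  bin="${CATEGRAF_BIN:-./categraf}"
  "$bin" --test --inputs prometheus
}
```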