部署promethues采集kubelet数据报错:server returned HTTP status 403 Forbidden

背景

笔者尝试部署手动部署promethues去采集kubelet的node节点数据信息时报错

笔者的promethus的配置文件和promthues的clusterrole配置如下所示:

bash 复制代码
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/proxy
  # - nodes/metrics
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups:
  - extensions
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: default


---
apiVersion: v1
data:
  prometheus.yml: |-
    global:
      scrape_interval:     15s 
      evaluation_interval: 15s
    scrape_configs:

    - job_name: 'kubernetes-nodes'
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: node

    - job_name: 'kubernetes-service'
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: service

    - job_name: 'kubernetes-endpoints'
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: endpoints

    - job_name: 'kubernetes-ingress'
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: ingress

    - job_name: 'kubernetes-pods'
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: pod
    - job_name: 'kubernetes-kubelet'
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        insecure_skip_verify: true
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
        - role: node
      relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)

kind: ConfigMap
metadata:
  name: prometheus-config

---
apiVersion: v1
kind: "Service"
metadata:
  name: prometheus
  labels:
    name: prometheus
spec:
  ports:
  - name: prometheus
    protocol: TCP
    port: 9090
    targetPort: 9090
  selector:
    app: prometheus
  type: NodePort
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  labels:
    app: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      serviceAccountName: prometheus
      serviceAccount: prometheus
      containers:
        - name: prometheus
          image: prom/prometheus:v2.2.1
          command:
            - "/bin/prometheus"
          args:
            - "--config.file=/etc/prometheus/prometheus.yml"
          ports:
            - containerPort: 9090
          volumeMounts:
            - mountPath: "/etc/prometheus"
              name: prometheus-config
      volumes:
        - name: prometheus-config
          configMap:
            name: prometheus-config

解决措施

笔者已经在promethues的配置文件中添加了insecure_skip_verify: true选项,这个选项跳过了tls的校验。这时候报错server returned HTTP status 403 Forbidden很显然是接口权限问题。

问题一:https://10.101.12.132:10250/metrics这个接口是做什么

https://10.101.12.132:10250/metrics 是一个特定的路径,通常用于获取 Kubernetes

集群中的节点(Node)的指标数据。也就是说,它提供了节点级别的监控指标

问题二:这个接口主要由什么资源进行权限控制

在 Kubernetes 中,https://10.101.12.132:10250/metrics 接口相关的权限通常由 ClusterRole 或 ClusterRoleBinding 来管理。这两个角色资源对于授予集群范围的权限非常有用。ClusterRole 定义了一组权限,它们可以在整个集群中使用。ClusterRoleBinding 则用于将角色绑定到用户、组或其他实体上,以授予这些实体访问相应权限的能力。要授予访问 https://10.101.12.132:10250/metrics 接口的权限,可能需要使用以下 ClusterRole 和 ClusterRoleBinding 示例作为参考:

bash 复制代码
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: node-metrics-reader
rules:
  - apiGroups: [""]
    resources: ["nodes/metrics"]
    verbs: ["get", "list"]

---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: node-metrics-reader-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: node-metrics-reader
subjects:
- kind: User
  name: <username>  # 替换为具体的用户名或组名

我们尝试修改前文中的promethues的ClusterRole中的配置,删除前文中的注释,添加 - nodes/metrics资源的可操作权限,问题解决

参考文章: 部署了 prometheus, 在 target 中显示 cadvisor 与 nodes 的状态都是 down

相关推荐
ZzzZZzzzZZZzzzz…1 分钟前
Docker + K8s集群搭建实战:1 Master+2 Node,含Harbor私有仓库与软路由
docker·云原生·容器·kubernetes·容器编排·集群部署·cri-dockerd
xier_ran1 小时前
【infra之路】模块三:Kubernetes (下) — 阶段一毕业项目:在集群里跑 PyTorch 训练
pytorch·容器·kubernetes
Waay1 小时前
K8s新手实操|emptyDir卷超详细实战(附完整命令+核心理解)
云原生·容器·kubernetes
liux35281 小时前
K8s 核心接口:CNI、CSI、CRI、LB 一篇讲透
云原生·容器·kubernetes
Devin~Y2 小时前
从内容社区到AIGC客服:Spring Boot、Redis、Kafka、K8s、RAG的三轮大厂Java面试对话(附标准答案)
java·spring boot·redis·spring cloud·kafka·kubernetes·micrometer
江南风月3 小时前
WGCLOUD监控系统的Restful Http接口一览
运维·zabbix·运维开发·prometheus
IT策士3 小时前
第25篇 k8s之Deployment 基础:声明式管理与副本控制
云原生·容器·kubernetes
IT策士3 小时前
第 26 篇 k8s之Deployment 进阶:滚动更新、回滚与暂停
云原生·容器·kubernetes
张忠琳4 小时前
【kubernetes v1.21】(kubelet 2)容器运行时与CRI
云原生·架构·kubernetes·kubelet
张忠琳4 小时前
【kubernetes v1.21】(kubelet 3)PLEG、健康检查、Eviction 与状态管理
云原生·架构·kubernetes·kubelet