部署promethues采集kubelet数据报错:server returned HTTP status 403 Forbidden

背景

笔者尝试部署手动部署promethues去采集kubelet的node节点数据信息时报错

笔者的promethus的配置文件和promthues的clusterrole配置如下所示:

bash 复制代码
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/proxy
  # - nodes/metrics
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups:
  - extensions
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: default


---
apiVersion: v1
data:
  prometheus.yml: |-
    global:
      scrape_interval:     15s 
      evaluation_interval: 15s
    scrape_configs:

    - job_name: 'kubernetes-nodes'
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: node

    - job_name: 'kubernetes-service'
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: service

    - job_name: 'kubernetes-endpoints'
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: endpoints

    - job_name: 'kubernetes-ingress'
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: ingress

    - job_name: 'kubernetes-pods'
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: pod
    - job_name: 'kubernetes-kubelet'
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        insecure_skip_verify: true
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
        - role: node
      relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)

kind: ConfigMap
metadata:
  name: prometheus-config

---
apiVersion: v1
kind: "Service"
metadata:
  name: prometheus
  labels:
    name: prometheus
spec:
  ports:
  - name: prometheus
    protocol: TCP
    port: 9090
    targetPort: 9090
  selector:
    app: prometheus
  type: NodePort
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  labels:
    app: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      serviceAccountName: prometheus
      serviceAccount: prometheus
      containers:
        - name: prometheus
          image: prom/prometheus:v2.2.1
          command:
            - "/bin/prometheus"
          args:
            - "--config.file=/etc/prometheus/prometheus.yml"
          ports:
            - containerPort: 9090
          volumeMounts:
            - mountPath: "/etc/prometheus"
              name: prometheus-config
      volumes:
        - name: prometheus-config
          configMap:
            name: prometheus-config

解决措施

笔者已经在promethues的配置文件中添加了insecure_skip_verify: true选项,这个选项跳过了tls的校验。这时候报错server returned HTTP status 403 Forbidden很显然是接口权限问题。

问题一:https://10.101.12.132:10250/metrics这个接口是做什么

https://10.101.12.132:10250/metrics 是一个特定的路径,通常用于获取 Kubernetes

集群中的节点(Node)的指标数据。也就是说,它提供了节点级别的监控指标

问题二:这个接口主要由什么资源进行权限控制

在 Kubernetes 中,https://10.101.12.132:10250/metrics 接口相关的权限通常由 ClusterRole 或 ClusterRoleBinding 来管理。这两个角色资源对于授予集群范围的权限非常有用。ClusterRole 定义了一组权限,它们可以在整个集群中使用。ClusterRoleBinding 则用于将角色绑定到用户、组或其他实体上,以授予这些实体访问相应权限的能力。要授予访问 https://10.101.12.132:10250/metrics 接口的权限,可能需要使用以下 ClusterRole 和 ClusterRoleBinding 示例作为参考:

bash 复制代码
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: node-metrics-reader
rules:
  - apiGroups: [""]
    resources: ["nodes/metrics"]
    verbs: ["get", "list"]

---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: node-metrics-reader-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: node-metrics-reader
subjects:
- kind: User
  name: <username>  # 替换为具体的用户名或组名

我们尝试修改前文中的promethues的ClusterRole中的配置,删除前文中的注释,添加 - nodes/metrics资源的可操作权限,问题解决

参考文章: 部署了 prometheus, 在 target 中显示 cadvisor 与 nodes 的状态都是 down

相关推荐
hwj运维之路5 分钟前
超详细ubuntu22.04部署k8s1.28高可用(二)【结合ingress实现业务高可用】
运维·云原生·容器·kubernetes
切糕师学AI16 分钟前
.NET Core Web 中的健康检查端点(Health Check Endpoint)
前端·kubernetes·.netcore
Cyber4K4 小时前
【Kubernetes专项】K8s 控制器 DaemonSet 从入门到企业实战应用
云原生·容器·kubernetes
切糕师学AI4 小时前
RKE(Rancher Kubernetes Engine) 是什么?
云原生·容器·kubernetes·rancher
码农小卡拉5 小时前
Prometheus 监控 SpringBoot 应用完整教程
spring boot·后端·grafana·prometheus
牛奶咖啡135 小时前
Prometheus+Grafana构建云原生分布式监控系统(十五)_Prometheus中PromQL使用(二)
云原生·prometheus·集合运算·对查询结果排序·直方图原理·统计掉线的实例·检查节点或指标是否存在
龙飞055 小时前
Kubernetes 排障实战:PVC 一直 Pending 的原因与解决方案
运维·学习·云原生·容器·kubernetes
岁岁种桃花儿6 小时前
流量入口Nginx动态发现K8s Ingress Controller实操指南
nginx·架构·kubernetes
冗量6 小时前
Kubernetes (K8s) 基础知识、部署与运维指南
运维·容器·kubernetes
青衫客367 小时前
从 TLS 到 Kubernetes PKI:一条证书链如何支撑整个集群安全(问题合集)
容器·kubernetes·k8s·tls