[Monitoring] Installing Prometheus with Helm and the Prometheus Operator

[Prerequisite articles]

[References]

[Environment]

  • mac
  • minikube v1.25.2

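1. Choosing an installation method, and the resources involved

1.1 Installing the Prometheus Operator from a Helm chart

There are two ways to install the Prometheus-related services in Kubernetes: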

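  • Option 1: write the deployment.yaml manifests yourself to install the Prometheus, Grafana, and Alertmanager services, along with the ConfigMaps and Secrets they require.
  • Option 2: install through a Kubernetes Operator, which is more convenient because the Operator creates and manages these resources for you.
    • The Operator can be installed by hand.
    • It can also be installed from a Helm chart --> this is the approach used in this article.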

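Note: we install the Prometheus Operator rather than Prometheus itself because the Operator helps us deploy, manage, and recover Prometheus. Prometheus is a stateful application. Unlike a stateless Java service, which Kubernetes can manage automatically, a stateful application is harder to operate and needs a dedicated Operator to manage it. See the prerequisite article listed at the top of this post for details.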

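1.2 Resources

The Prometheus community maintains a number of charts in its Git repository: github.com/prometheus-... The one installed here is kube-prometheus-stack: github.com/prometheus-...

The project describes kube-prometheus-stack as follows. In short, it bundles Grafana dashboards and uses the Prometheus Operator to install the Prometheus-related components: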

Installs the kube-prometheus stack, a collection of Kubernetes manifests, Grafana dashboards, and Prometheus rules combined with documentation and scripts to provide easy to operate end-to-end Kubernetes cluster monitoring with Prometheus using the Prometheus Operator.

2. Installing the Prometheus Operator with Helm

2.1 Create the monitoring namespace:

$ kubectl create namespace monitoring

namespace/monitoring created

2.2 Add the Helm repository that hosts the Prometheus Operator chart:

$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts

"prometheus-community" has been added to your repositories

2.3 Update the Helm repositories:

$ helm repo update

Hang tight while we grab the latest from your chart repositories...

...Successfully got an update from the "prometheus-community" chart repository

Update Complete. ⎈Happy Helming!⎈

2.4 Install the Prometheus Operator:

$ helm install prometheus-operator prometheus-community/kube-prometheus-stack -n monitoring

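Note: if an installation was cancelled midway and the next attempt fails with Error: INSTALLATION FAILED: cannot re-use a name that is still in use, use helm upgrade --install instead (see github.com/helm/helm/i...).

Alternatively, delete the release and install again. The delete command: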

$ helm -n monitoring delete prometheus-operator

release "prometheus-operator" uninstalled
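The helm upgrade --install alternative mentioned in the note above can be wrapped in a small helper that is safe to run more than once. This is a minimal sketch, assuming the release name and namespace used in this article; the function name install_stack is made up for illustration, and --create-namespace (available in Helm 3.2 and later) folds step 2.1 into the same command.

```shell
# Hypothetical helper: install the chart, or upgrade the release if one with
# the same name already exists. The note above suggests this form in place of
# plain "helm install" when the release name is still in use.
RELEASE="prometheus-operator"   # release name used in this article
NAMESPACE="monitoring"          # namespace from step 2.1

install_stack() {
  helm upgrade --install "$RELEASE" \
    prometheus-community/kube-prometheus-stack \
    --namespace "$NAMESPACE" \
    --create-namespace
}
```

Call install_stack after the repository has been added and updated (steps 2.2 and 2.3).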

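2.5 Download the chart package and install the Prometheus Operator from it

(Skip this section if step 2.4 succeeded.) Note: if the installation fails with a network timeout such as Error: INSTALLATION FAILED: failed pre-install: timed out waiting for the condition, download the chart package first and install from the local file.

Download location: github.com/prometheus-..., then look for the kube-prometheus-stack download.

Start the installation: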

$ helm install prometheus-operator ./kube-prometheus-stack-58.0.0.tgz -n monitoring

NAME: prometheus-operator

LAST DEPLOYED: Tue Apr 9 22:32:20 2024

NAMESPACE: monitoring

STATUS: deployed

REVISION: 1

NOTES:

kube-prometheus-stack has been installed. Check its status by running: kubectl --namespace monitoring get pods -l "release=prometheus-operator"

Visit github.com/prometheus-... for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.
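As an alternative to downloading the archive by hand, Helm can fetch the package itself. The following is a rough sketch, assuming the prometheus-community repository from step 2.2 is already configured and that version 58.0.0 (taken from the file name above) is the one you want; install_from_package is a hypothetical name. Note that helm pull still needs network access to the chart repository.

```shell
# Hypothetical helper: download the chart archive with Helm, then install
# the release from the local .tgz file.
CHART_VERSION="58.0.0"   # version taken from the file name used above

install_from_package() {
  # helm pull saves kube-prometheus-stack-<version>.tgz in the current directory
  helm pull prometheus-community/kube-prometheus-stack \
    --version "$CHART_VERSION" &&
  helm install prometheus-operator \
    "./kube-prometheus-stack-${CHART_VERSION}.tgz" \
    --namespace monitoring
}
```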

3. Verifying that the Prometheus Operator was installed

All pods in the monitoring namespace should be in the Running state:

$ kubectl get pods -n monitoring

NAME READY STATUS RESTARTS AGE

alertmanager-prometheus-operator-kube-p-alertmanager-0 2/2 Running 0 8m28s

prometheus-operator-grafana-66bd56d448-wlvq8 3/3 Running 0 10m

prometheus-operator-kube-p-operator-5b8b4b9c4f-6psmv 1/1 Running 0 10m

prometheus-operator-kube-state-metrics-67b7949c67-6kd87 1/1 Running 0 10m

prometheus-operator-prometheus-node-exporter-zl6mb 1/1 Running 0 10m

prometheus-prometheus-operator-kube-p-prometheus-0 2/2 Running 0 8m27s
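Scanning the STATUS column by eye works for six pods, but the check can also be scripted. The sketch below is one possible approach, not part of the chart: not_running is a hypothetical filter that reads kubectl get pods output and prints the names of pods that are not Running.

```shell
# Hypothetical filter: read "kubectl get pods" output on stdin and print the
# names of pods whose STATUS column is not "Running".
# Expected columns: NAME READY STATUS RESTARTS AGE
not_running() {
  # NR > 1 skips the header row; $3 is the STATUS column
  awk 'NR > 1 && $3 != "Running" { print $1 }'
}
```

Usage: kubectl get pods -n monitoring | not_running. An empty result means every pod is Running.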

4. Accessing the Prometheus and Grafana dashboards

4.1 Accessing the Prometheus dashboard

First, list the services and their ports:

$ kubectl get service -n monitoring

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE

alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 9m56s

prometheus-operated ClusterIP None <none> 9090/TCP 9m55s

prometheus-operator-grafana ClusterIP 10.102.94.233 <none> 80/TCP 12m

prometheus-operator-kube-p-alertmanager ClusterIP 10.97.70.105 <none> 9093/TCP,8080/TCP 12m

prometheus-operator-kube-p-operator ClusterIP 10.111.224.63 <none> 443/TCP 12m

prometheus-operator-kube-p-prometheus ClusterIP 10.101.157.134 <none> 9090/TCP,8080/TCP 12m

prometheus-operator-kube-state-metrics ClusterIP 10.100.94.162 <none> 8080/TCP 12m

prometheus-operator-prometheus-node-exporter ClusterIP 10.100.32.110 <none> 9100/TCP 12m

Then use port-forward so that the service can be reached from outside the Kubernetes cluster:

$ kubectl port-forward service/prometheus-operator-kube-p-prometheus -n monitoring 9090:9090

Forwarding from 127.0.0.1:9090 -> 9090

Forwarding from [::1]:9090 -> 9090

Prometheus dashboard URL: http://localhost:9090

Click Status --> Configuration to view the current prometheus.yaml.

Click Status --> Targets to view the current targets, that is, where the metrics are scraped from.

Click Status --> Rules to view the current rules.
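The three Status pages above have counterparts in the Prometheus HTTP API, which is handy when no browser is available. A minimal sketch, assuming the port-forward from this section is still running on localhost:9090; prom_api is a hypothetical helper name.

```shell
# Hypothetical helper: call a Prometheus HTTP API path through the
# port-forward (assumed to be listening on localhost:9090).
PROM_URL="http://localhost:9090"

prom_api() {
  # $1 is the path below /api/v1/, for example "targets"
  curl -s "${PROM_URL}/api/v1/$1"
}

# prom_api status/config   # same data as Status --> Configuration
# prom_api targets         # same data as Status --> Targets
# prom_api rules           # same data as Status --> Rules
```

Each call returns JSON, so the output can be piped into a JSON tool of your choice for readability.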

4.2 Accessing the Grafana dashboard

As with the Prometheus service, look up the service first and then forward its port.

$ kubectl port-forward service/prometheus-operator-grafana -n monitoring 3000:80

Forwarding from 127.0.0.1:3000 -> 3000

Forwarding from [::1]:3000 -> 3000

Grafana dashboard URL: http://localhost:3000. Grafana requires a login, so we need the Grafana password. List the secrets:

$ kubectl get secrets -n monitoring

NAME TYPE DATA AGE

prometheus-operator-grafana Opaque 3 16m

View its contents:

$ kubectl get secret prometheus-operator-grafana -n monitoring -o yaml

apiVersion: v1
data:
  admin-password: cHJvbS1vcGVyYXRvcg==
  admin-user: YWRtaW4=
  ldap-toml: ""
kind: Secret
<rest omitted>

Copy the password and base64-decode it:

$ echo "cHJvbS1vcGVyYXRvcg==" | base64 -d; echo

prom-operator
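The lookup and the decode can be combined so that nothing has to be copied by hand. This is a sketch under the assumption that the secret is named prometheus-operator-grafana as shown above; grafana_secret_field is a hypothetical helper name.

```shell
# Hypothetical helper: read one field from the Grafana secret and decode it.
# The long option --decode is used because both GNU and macOS base64 accept it.
grafana_secret_field() {
  # $1 is a key under .data, for example admin-password or admin-user
  kubectl get secret prometheus-operator-grafana -n monitoring \
    -o jsonpath="{.data.$1}" | base64 --decode
  echo   # trailing newline for readability
}
```

Usage: grafana_secret_field admin-password, or grafana_secret_field admin-user for the user name.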

With the plaintext password in hand, log in to Grafana as admin/prom-operator.

After logging in, click the Connections menu; a Prometheus data source is already configured by default.
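The preconfigured data source can also be checked from the command line through Grafana's HTTP API. A small sketch, assuming the port-forward on localhost:3000 is still running and the default admin/prom-operator credentials have not been changed; grafana_datasources is a hypothetical name.

```shell
# Hypothetical helper: list Grafana data sources over its HTTP API, using
# basic auth with the admin credentials decoded above.
grafana_datasources() {
  curl -s -u "admin:prom-operator" "http://localhost:3000/api/datasources"
}
```

The response is a JSON array; the Prometheus entry should appear in it if the data source is configured as described.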

Click the Dashboards menu to see that, by default, Prometheus scrapes metrics for Kubernetes components such as Pods; there are node-related dashboards as well.

The IP shown here is the minikube IP:

$ minikube ip

192.168.49.2

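You can also view pod-related metrics.

5. Understanding the current installation

5.1 statefulset, deployment, daemonset

List the resources in the monitoring namespace. The output covers pods, services, deployments, and similar workload resources (kubectl get all does not include ConfigMaps, Secrets, or CRDs):

$ kubectl get all -n monitoring

<rest omitted>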

NAME READY AGE

statefulset.apps/alertmanager-prometheus-operator-kube-p-alertmanager 1/1 17h

statefulset.apps/prometheus-prometheus-operator-kube-p-prometheus 1/1 17h

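There are two statefulset resources:

  • The one whose name starts with prometheus is the Prometheus server with its three parts (Retrieval, Storage, and HTTP Server). It is managed by the operator (the prometheus-operator part in the middle of its name is the Helm release name chosen in step 2.4).
  • The one whose name starts with alertmanager is, as the name suggests, Alertmanager; it is also managed by the operator.

Continuing with the query output: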

NAME READY UP-TO-DATE AVAILABLE AGE

deployment.apps/prometheus-operator-grafana 1/1 1 1 17h

deployment.apps/prometheus-operator-kube-p-operator 1/1 1 1 17h

deployment.apps/prometheus-operator-kube-state-metrics 1/1 1 1 17h

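There are three deployments:

  • kube-p-operator is the Prometheus Operator itself; it created the Prometheus and Alertmanager statefulsets (the two listed above).
  • grafana is the Grafana deployment.
  • kube-state-metrics comes with this Helm chart. It collects metrics about the cluster's own Kubernetes objects, which are used to check whether the current deployments, statefulsets, and pods are healthy; these metrics can be displayed in Prometheus.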
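The relationship between the operator and the two statefulsets can be inspected directly. The helpers below are a sketch with hypothetical names; they assume the operator's CRDs are installed (so that kubectl knows the prometheus and alertmanager resource types) and that the operator records itself in the ownerReferences of the statefulsets it creates.

```shell
# Hypothetical helpers for looking at the operator-managed objects.

# List the Prometheus and Alertmanager custom resources that the operator
# watches and turns into statefulsets.
list_operator_crs() {
  kubectl get prometheus,alertmanager -n monitoring
}

# Print the kind(s) recorded in the ownerReferences of a statefulset.
# $1 is the statefulset name.
statefulset_owner() {
  kubectl get statefulset "$1" -n monitoring \
    -o jsonpath='{.metadata.ownerReferences[*].kind}'
}
```

Usage: statefulset_owner prometheus-prometheus-operator-kube-p-prometheus.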

NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE

daemonset.apps/prometheus-operator-prometheus-node-exporter 1 1 1 1 1 kubernetes.io/os=linux 17h

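A DaemonSet runs one pod on each Kubernetes worker node. This node-exporter DaemonSet converts node-level data (CPU usage, for example) into the Prometheus metrics format. Note: the DaemonSet does its work through the pod it creates on each node, here prometheus-operator-prometheus-node-exporter-zl6mb.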
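To see what "Prometheus metrics format" looks like for node data, the node exporter can be queried directly. This is a sketch, assuming a port-forward to the node exporter service on port 9100 (the port listed in section 4.1); cpu_metrics is a hypothetical filter name.

```shell
# Hypothetical filter: keep only the per-CPU time counters from the node
# exporter's text output read on stdin. Comment lines (# HELP, # TYPE)
# are dropped because they do not start with the metric name.
cpu_metrics() {
  grep '^node_cpu_seconds_total'
}

# Usage, in two terminals:
#   kubectl port-forward service/prometheus-operator-prometheus-node-exporter -n monitoring 9100:9100
#   curl -s http://localhost:9100/metrics | cpu_metrics
```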

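To summarize, we have installed the monitoring stack itself, plus monitoring for the worker nodes and for the Kubernetes components.

5.2 configmap, secret

Besides the resources above, a number of configmaps were installed. Some of them belong to the operator and configure, among other things, the default metrics connections.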

$ kubectl get configmap -n monitoring

The secrets hold sensitive data (usernames, passwords, and so on) for Grafana, Prometheus, and the Operator:

$ kubectl get secret -n monitoring

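5.3 CRDs

Quite a few CRDs were created as well (CRDs are cluster-scoped, so the -n flag has no effect on this command):

$ kubectl get crd -n monitoring

NAME CREATED AT

alertmanagerconfigs.monitoring.coreos.com 2024-04-09T10:08:00Z

<rest omitted>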

6. Inspecting the configuration in detail

Export the description of the statefulsets above and of the operator deployment:

$ kubectl describe statefulset prometheus-prometheus-operator-kube-p-prometheus -n monitoring > prometheus.yaml

Start with the Prometheus statefulset. It uses Prometheus v2.51.1 and listens on port 9090.

      Containers:
       prometheus:
        Image:      quay.io/prometheus/prometheus:v2.51.1
        Port:       9090/TCP

There are also several mounted directories; for example, the rules directory holds rule files:

    Mounts:
          /etc/prometheus/certs from tls-assets (ro)
          /etc/prometheus/config_out from config-out (ro)
          /etc/prometheus/rules/prometheus-prometheus-operator-kube-p-prometheus-rulefiles-0 from prometheus-prometheus-operator-kube-p-prometheus-rulefiles-0 (rw)

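When the Prometheus configuration changes, config-reloader is responsible for reloading it. Its arguments show that the configuration is read from the file /etc/prometheus/config/prometheus.yaml.gz inside the pod and that reloads are triggered through http://127.0.0.1:9090/-/reload: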

    config-reloader:
        Image:      quay.io/prometheus-operator/prometheus-config-reloader:v0.73.0
        Port:       8080/TCP
        Host Port:  0/TCP
        Command:
          /bin/prometheus-config-reloader
        Args:
          --listen-address=:8080
          --reload-url=http://127.0.0.1:9090/-/reload
          --config-file=/etc/prometheus/config/prometheus.yaml.gz

To see how the configuration file ends up in /etc/prometheus/config inside the Prometheus pod, look at the Mounts of config-reloader:

    Mounts:
      /etc/prometheus/config from config (rw)

/etc/prometheus/config is mounted from the volume named config. In the Volumes section, the volume named config has type Secret, and the secret name is prometheus-prometheus-operator-kube-p-prometheus:

      Volumes:
       config:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  prometheus-prometheus-operator-kube-p-prometheus
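The same volume-to-secret mapping can be read without scrolling through the describe output. A minimal sketch using a kubectl JSONPath filter; config_secret_name is a hypothetical helper name, and the statefulset name is the one used throughout this article.

```shell
# Hypothetical helper: print the name of the Secret behind the volume
# called "config" in the Prometheus statefulset's pod template.
config_secret_name() {
  kubectl get statefulset prometheus-prometheus-operator-kube-p-prometheus \
    -n monitoring \
    -o jsonpath='{.spec.template.spec.volumes[?(@.name=="config")].secret.secretName}'
}
```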

Export the secret prometheus-prometheus-operator-kube-p-prometheus as YAML to inspect its contents:

$ kubectl get secret prometheus-prometheus-operator-kube-p-prometheus -o yaml -n monitoring > secret.yaml
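The exported secret.yaml contains base64 text rather than readable configuration. The sketch below decodes it in one pipeline. It assumes the secret stores the configuration under the key prometheus.yaml.gz, which is inferred from the --config-file path shown earlier and not confirmed elsewhere in this article; prometheus_config is a hypothetical helper name.

```shell
# Hypothetical helper: print the Prometheus configuration stored in the secret.
# Assumption: the data key is "prometheus.yaml.gz" (gzip-compressed, then
# base64-encoded by Kubernetes). Dots in the key are escaped for JSONPath.
prometheus_config() {
  kubectl get secret prometheus-prometheus-operator-kube-p-prometheus \
    -n monitoring \
    -o jsonpath='{.data.prometheus\.yaml\.gz}' | base64 --decode | gzip -dc
}
```

If the key name differs in your chart version, the data section of secret.yaml shows the actual key to use.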

The other two main objects can be exported and inspected in the same way:

$ kubectl describe statefulset alertmanager-prometheus-operator-kube-p-alertmanager -n monitoring > alertmanager.yaml

$ kubectl describe deployment prometheus-operator-kube-p-operator -n monitoring > operator.yaml
