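Using `grafana` + `prometheus` + `loki` in `k3s` and `docker` environments

Abstract: This article covers using Grafana, Prometheus, and Loki in k3s and Docker environments, with a focus on alert configuration and log ingestion.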

k3s

I use Docker as the container runtime. The install command is:
curl -sfL https://rancher-mirror.rancher.cn/k3s/k3s-install.sh | INSTALL_K3S_MIRROR=cn sh -s - --docker

Grafana

Prometheus

Installation guide

This example uses kube-prometheus.

Installation

# Create the namespace and CRDs, and then wait for them to be available before creating the remaining resources
# Note that due to some CRD size we are using kubectl server-side apply feature which is generally available since kubernetes 1.22.
# If you are using previous kubernetes versions this feature may not be available and you would need to use kubectl create instead.
kubectl apply --server-side -f manifests/setup
kubectl wait \
	--for condition=Established \
	--all CustomResourceDefinition \
	--namespace=monitoring
kubectl apply -f manifests/

If the images hosted on registry.k8s.io cannot be pulled, pull them from a mirror repository and retag them:

docker pull bitnami/kube-state-metrics:2.13.0
docker tag bitnami/kube-state-metrics:2.13.0 registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.13.0

docker pull v5cn/prometheus-adapter:v0.12.0
docker tag v5cn/prometheus-adapter:v0.12.0 registry.k8s.io/prometheus-adapter/prometheus-adapter:v0.12.0
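The pull-and-retag pattern above can be wrapped in a small helper so that any further image is handled the same way. This is a minimal sketch: the function name `mirror_image` is hypothetical, and it assumes the mirror repository really carries an equivalent build of the upstream image, which is worth checking before relying on it.

```shell
#!/bin/sh
# mirror_image SRC DST: pull SRC from a reachable registry and tag it as DST,
# so the kubelet (running on the Docker runtime) finds DST in the local image store.
# Hypothetical helper, not part of kube-prometheus.
mirror_image() {
  src="$1"
  dst="$2"
  docker pull "$src" && docker tag "$src" "$dst"
}
```

For example, `mirror_image bitnami/kube-state-metrics:2.13.0 registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.13.0` repeats the first pair of commands. The locally tagged image is only used when the pod's image pull policy is `IfNotPresent`, which is the Kubernetes default for tags other than `latest`.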

Exposing the Grafana Service port

  • Change the Grafana Service type to NodePort
  • Allow all inbound traffic in the NetworkPolicy by changing its ingress as follows:
spec:
  egress:
  - {}
  ingress:
  - {}
  podSelector:
    matchLabels:
      app.kubernetes.io/component: grafana
      app.kubernetes.io/name: grafana
      app.kubernetes.io/part-of: kube-prometheus
  policyTypes:
  - Egress
  - Ingress
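The NodePort change from the first bullet can also be made from the command line. The sketch below assumes the default kube-prometheus names (a Service called `grafana` in the `monitoring` namespace); `expose_grafana` is a hypothetical helper name.

```shell
#!/bin/sh
# Assumes the default kube-prometheus names: Service "grafana" in namespace "monitoring".
expose_grafana() {
  # Switch the Service type so Kubernetes allocates a node port.
  kubectl -n monitoring patch svc grafana -p '{"spec":{"type":"NodePort"}}' || return 1
  # Print the node port that was allocated for the first Service port.
  kubectl -n monitoring get svc grafana -o jsonpath='{.spec.ports[0].nodePort}'
}
```

Calling `expose_grafana` prints the allocated port; Grafana should then be reachable at `http://<node-ip>:<port>` once the NetworkPolicy above permits inbound traffic.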

Monitoring a Spring Boot project

springboot.yaml

apiVersion: v1
kind: Namespace
metadata:
  name: k8s-springboot
---
apiVersion: v1
kind: Service
metadata:
  name: k8s-springboot-demo
  namespace: k8s-springboot
  labels:
    app: k8s-springboot-demo
spec:
  type: ClusterIP
  ports:
    - port: 8080
      targetPort: 8080
      protocol: TCP
      name: web
  selector:
    app: k8s-springboot-demo
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: k8s-springboot-demo
  namespace: k8s-springboot
spec:
  selector:
    matchLabels:
      app: k8s-springboot-demo
  replicas: 1
  template:
    metadata:
      labels:
        app: k8s-springboot-demo
    spec:
      containers:
        - name: k8s-springboot-demo
          image: huzhihui/springboot:1.0.0
          ports:
            - containerPort: 8080
          livenessProbe:
            # A probe accepts exactly one handler (httpGet, exec, or tcpSocket);
            # httpGet is kept here to match the readinessProbe below.
            httpGet:
              path: /
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            failureThreshold: 3
            timeoutSeconds: 1
          readinessProbe:
            httpGet:
              path: /
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            failureThreshold: 3
            timeoutSeconds: 1

ServiceMonitor

Select the Service

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: k8s-springboot-demo
  namespace: k8s-springboot
  labels:
    team: k8s-springboot-demo
spec:
  selector:
    matchLabels:
      app: k8s-springboot-demo
  endpoints:
  - port: web

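Registering the endpoint with Prometheus

If the metrics endpoint is not /metrics, the scrape path has to be configured explicitly, for example with a ScrapeConfig.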

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: k8s-springboot
spec:
  serviceAccountName: prometheus
  # ServiceMonitor objects are selected by serviceMonitorSelector, not podMonitorSelector
  serviceMonitorSelector:
    matchLabels:
      team: k8s-springboot-demo
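Besides a ScrapeConfig, a ServiceMonitor endpoint also has a `path` field, so a non-default metrics path can be set on the ServiceMonitor defined earlier. The sketch below assumes the application exposes Spring Boot Actuator metrics at `/actuator/prometheus` (the path used in the Docker `prometheus.yml` in this article); `set_scrape_path` is a hypothetical helper name.

```shell
#!/bin/sh
# set_scrape_path PATH: point the ServiceMonitor's "web" endpoint at PATH.
# A JSON merge patch replaces the whole endpoints list, so the port is repeated.
set_scrape_path() {
  kubectl -n k8s-springboot patch servicemonitor k8s-springboot-demo \
    --type merge \
    -p "{\"spec\":{\"endpoints\":[{\"port\":\"web\",\"path\":\"$1\"}]}}"
}
```

Running `set_scrape_path /actuator/prometheus` would make Prometheus scrape that path on the `web` port instead of `/metrics`.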

Deployment example: monitoring a Docker host

Deployment configuration

Grafana docker-compose.yml

version: "3"
services:
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    ports:
      - "3000:3000"
    networks:
      - default
    volumes:
    - ./data:/var/lib/grafana

networks:
  default:
    external:
      name: nisec

prometheus.yml

scrape_configs:
# Prometheus
- job_name: 'prometheus'
  static_configs:
    - targets: ['localhost:9090']
# cadvisor
- job_name: cadvisor
  scrape_interval: 5s
  static_configs:
    - targets:
        - cadvisor:8080
# node-exporter
- job_name: node-exporter
  scrape_interval: 5s
  static_configs:
    - targets:
        - node-exporter:9100
# java jvm
- job_name: spring-boot
  scrape_interval: 5s
  metrics_path: '/actuator/prometheus'
  static_configs:
    - targets:
        - '192.168.137.2:8081'

docker-compose.yml

Grant 777 permissions to the mounted ./data directory so that the container can write to it.

version: '3.2'
services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    networks:
      - default
    ports:
      - 9090:9090
    command:
      - --config.file=/etc/prometheus/prometheus.yml
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - ./data:/prometheus
    depends_on:
      - cadvisor
  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    expose:
      - 9100
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    container_name: cadvisor
    networks:
      - default
    ports:
      - 8080:8080
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
networks:
  default:
    external:
      name: nisec

Start

docker compose up -d
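Two things have to be in place before the start command succeeds: the compose files declare the `nisec` network as external, so Compose will not create it, and the bind-mounted `./data` directory must be writable by the container. A minimal pre-flight sketch, with the hypothetical helper names `ensure_network` and `prepare_data_dir`:

```shell
#!/bin/sh
# ensure_network NAME: create the external Docker network only if it is missing.
ensure_network() {
  docker network inspect "$1" >/dev/null 2>&1 || docker network create "$1"
}

# prepare_data_dir DIR: create the bind-mounted data directory with the
# permissive mode used in this article, so a non-root container user can write to it.
prepare_data_dir() {
  mkdir -p "$1" && chmod 777 "$1"
}
```

A typical sequence would be `ensure_network nisec && prepare_data_dir ./data && docker compose up -d`. Mode 777 is the simplest option for a local setup; restricting ownership to the container's user is the tighter alternative.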

Grafana dashboards used

  • Configure the data source

To fix the rootfs disk usage not being read, take only the first row:

100 - ((topk(1,node_filesystem_avail_bytes{instance="$node",job="$job",mountpoint=~"/.*",fstype!="rootfs"}) * 100) / topk(1,node_filesystem_size_bytes{instance="$node",job="$job",mountpoint=~"/.*",fstype!="rootfs"}))
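The expression above keeps a single series with `topk(1, ...)` on each side and then computes used space as 100 minus the available percentage. The arithmetic can be checked by hand; the byte counts below are made-up illustrative values and `disk_used_percent` is a hypothetical helper.

```shell
#!/bin/sh
# disk_used_percent AVAIL_BYTES SIZE_BYTES
# Mirrors the panel expression: 100 - (avail * 100 / size).
disk_used_percent() {
  awk -v avail="$1" -v size="$2" 'BEGIN { printf "%.2f\n", 100 - (avail * 100) / size }'
}

# 25 GB available on a 100 GB filesystem
disk_used_percent 25000000000 100000000000   # prints 75.00
```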

Alert configuration

  • Add a group robot in DingTalk
  • Configure the template

The following template can be copied as is:

{{ define "custom.alerts" -}}
{{ len .Alerts }} alert(s)
{{ range .Alerts -}}
  {{ template "alert.summary_and_description" . -}}
{{ end -}}
{{ end -}}
{{ define "alert.summary_and_description" }}

  --------------------

  Summary: {{.Annotations.summary}}

  Status: {{ .Status }}

  Description: {{.Annotations.description}}

  Detail: {{.Values.A}}
  
  StartsAt: {{.StartsAt  | tz "Asia/Chongqing" }}

  endsAt: {{.EndsAt  | tz "Asia/Chongqing" }}
{{ end -}}
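  • Configure the DingTalk contact point in Grafana

  • Create a notification policy

  • Notification policy details

The configuration below enables label matching; only alerts whose labels match are routed through this policy.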

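  • Settings explained

Setting | Explanation
Group wait | How long to wait before sending. With 10s, alerts arriving within those 10 seconds are merged and sent together at the 10th second; 0s sends immediately.
Group interval | The interval between notifications about state changes (alerting to normal, or normal to alerting). With 5m, if an alert is sent at minute 01 and the state recovers at minute 02, the recovery is not sent immediately but at minute 06; if it recovers at minute 07 instead, it is sent at minute 11. Notifications go out at multiples of the configured interval.
Repeat interval | The interval for re-sending an alert that keeps firing without recovering. In my tests the Group interval is added on top of this value.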
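The Group interval example (sent at 01, recovered at 02, notified at 06; recovered at 07, notified at 11) can be reproduced with a few lines of arithmetic. This is only a model of the behavior described in the table, not Grafana's actual scheduler, and `next_notify_minute` is a hypothetical helper.

```shell
#!/bin/sh
# next_notify_minute FIRST_SENT INTERVAL EVENT
# FIRST_SENT: minute at which the group's first notification went out
# INTERVAL:   Group interval in minutes
# EVENT:      minute at which the alert state changed
# Prints the first slot FIRST_SENT + k*INTERVAL (k >= 1) that is not before EVENT.
next_notify_minute() {
  first="$1"; interval="$2"; event="$3"
  k=$(( (event - first + interval - 1) / interval ))
  if [ "$k" -lt 1 ]; then
    k=1
  fi
  echo $(( first + k * interval ))
}

next_notify_minute 1 5 2   # prints 6
next_notify_minute 1 5 7   # prints 11
```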

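Alert configuration example

I use a MySQL table as the data source, which makes it easy to change the data.

  • Enter the query
  • Configure the alert condition

Alert when the value is greater than 40.

  • Configure the evaluation interval

In my development environment the rule is evaluated every 10s. The Pending period is how long the alert condition must hold continuously before the alert actually fires; 0s means a notification is sent as soon as an evaluation meets the condition. When the condition is first met, the rule enters the pending state. If the condition keeps holding for the whole pending period, the alert fires and a notification is sent; if the condition stops holding during that period, the rule returns to normal and nothing is sent.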
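The interplay between the evaluation interval and the Pending period can be traced with a small simulation. It is a simplified model of the behavior described above, not Grafana's implementation; `alert_states` is a hypothetical helper and the state names Normal, Pending, and Alerting follow Grafana's UI.

```shell
#!/bin/sh
# alert_states INTERVAL PENDING RESULTS...
# INTERVAL: seconds between evaluations; PENDING: pending period in seconds
# RESULTS:  one value per evaluation, 1 = condition met, 0 = not met
# Prints the rule state after each evaluation, separated by spaces.
alert_states() {
  interval="$1"; pending="$2"; shift 2
  held=-1          # seconds the condition has held so far; -1 = not currently met
  out=""
  for r in "$@"; do
    if [ "$r" -eq 1 ]; then
      if [ "$held" -lt 0 ]; then held=0; else held=$(( held + interval )); fi
      if [ "$held" -ge "$pending" ]; then state=Alerting; else state=Pending; fi
    else
      held=-1
      state=Normal
    fi
    out="$out${out:+ }$state"
  done
  echo "$out"
}

alert_states 10 30 0 1 1 1 1 0   # prints Normal Pending Pending Pending Alerting Normal
alert_states 10 0 0 1 0          # prints Normal Alerting Normal
```

With a 10s interval and a 30s pending period, the condition has to hold across four consecutive evaluations before the rule reaches Alerting; with a pending period of 0s the first matching evaluation fires immediately.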

Ingesting Loki logs into Grafana

Below is a simple example from my local setup.

loki-config.yaml

# This is a complete configuration to deploy Loki backed by the filesystem.
# The index will be shipped to the storage via tsdb-shipper.

auth_enabled: false

server:
  http_listen_port: 3100

common:
  ring:
    instance_addr: 127.0.0.1
    kvstore:
      store: inmemory
  replication_factor: 1
  path_prefix: /tmp/loki

schema_config:
  configs:
  - from: 2020-05-15
    store: tsdb
    object_store: filesystem
    schema: v13
    index:
      prefix: index_
      period: 24h

storage_config:
  filesystem:
    directory: /tmp/loki/chunks

promtail-config.yaml

server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://loki:3100/loki/api/v1/push

scrape_configs:
- job_name: system
  static_configs:
  - targets:
      - localhost
    labels:
      job: varlogs
      __path__: /var/log/*log
- job_name: spring-boot
  static_configs:
  - targets:
      - localhost
    labels:
      job: static-server
      __path__: /var/log/static-server/*log

docker-compose.yml

version: "3"

networks:
  loki:
    external:
      name: nisec
services:
  loki:
    image: grafana/loki:3.2.1
    ports:
      - "3100:3100"
    volumes:
      - ./loki-config.yaml:/mnt/config/loki-config.yaml
    command: -config.file=/mnt/config/loki-config.yaml
    networks:
      - loki

  promtail:
    image: grafana/promtail:3.2.1
    volumes:
      - /var/log:/var/log
      - ./promtail-config.yaml:/mnt/config/promtail-config.yaml
    command: -config.file=/mnt/config/promtail-config.yaml
    depends_on:
      - loki
    networks:
      - loki

Usage
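Before adding Loki as a data source in Grafana, it can help to confirm that the stack accepts and returns logs. The sketch below uses Loki's HTTP push and query_range endpoints through `curl`; the helper names, the `smoke-test` job label, and the `http://localhost:3100` base URL (the port published in the compose file above) are assumptions for illustration.

```shell
#!/bin/sh
# loki_payload JOB TS_NS LINE: build the JSON body for Loki's push API.
# No JSON escaping is done, so LINE must not contain quotes or backslashes.
loki_payload() {
  printf '{"streams":[{"stream":{"job":"%s"},"values":[["%s","%s"]]}]}\n' "$1" "$2" "$3"
}

# loki_push BASE_URL JOB LINE: send one log line stamped with the current time.
loki_push() {
  ts="$(date +%s)000000000"   # seconds -> nanoseconds
  loki_payload "$2" "$ts" "$3" |
    curl -s -H 'Content-Type: application/json' -X POST --data-binary @- "$1/loki/api/v1/push"
}

# loki_query BASE_URL LOGQL: read matching lines back (the API defaults to the last hour).
loki_query() {
  curl -s -G "$1/loki/api/v1/query_range" --data-urlencode "query=$2"
}
```

For example, `loki_push http://localhost:3100 smoke-test "hello from curl"` followed by `loki_query http://localhost:3100 '{job="smoke-test"}'` should return the line just written. Since Grafana and Loki both join the `nisec` network, the Loki data source URL inside Grafana would be `http://loki:3100`, and the Promtail labels from the config above can be queried as `{job="varlogs"}` or `{job="static-server"}`.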
