Monitoring a Redis Cluster with redis_exporter

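Environment overview:

We have a Redis Cluster deployed as 6 instances across 3 hosts. The goal is to collect its metrics so that they can drive alerting on anomalies and support performance analysis.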

Prerequisites

The following components must be deployed in advance.

redis cluster

prometheus

alertmanager

grafana

Deploying redis_exporter

We install the exporter with docker compose.

The exporter used here is: https://github.com/oliver006/redis_exporter

Add the following service to the compose file:

  redis-exporter:
    image: docker.m.daocloud.io/oliver006/redis_exporter:v1.74.0-alpine
    command:
      - '--redis.addr=redis://redisIP:7001'
      - '--redis.password=redisPassword'
      - '--is-cluster'
    ports:
      - "9121:9121"

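With these parameters, it is enough to set --is-cluster and point --redis.addr at any one node of the cluster. Data for all nodes can then be collected through this single exporter, using the Prometheus scrape configuration described next.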
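As a variation on the service definition above, the same settings can also be passed through environment variables, which keeps the password out of the process arguments. This is a minimal sketch, assuming the exporter's documented variables REDIS_ADDR, REDIS_PASSWORD and REDIS_EXPORTER_IS_CLUSTER. The top-level services key and the ${REDIS_PASSWORD} substitution (read from a .env file next to the compose file) are assumptions that are not part of the original setup.

```yaml
# docker-compose.yml (sketch; redisIP is a placeholder, as in the original)
services:
  redis-exporter:
    image: docker.m.daocloud.io/oliver006/redis_exporter:v1.74.0-alpine
    environment:
      # Any single cluster node is enough as the entry point.
      REDIS_ADDR: "redis://redisIP:7001"
      # Assumed to be defined in a .env file or in the shell environment.
      REDIS_PASSWORD: "${REDIS_PASSWORD}"
      # Equivalent to passing --is-cluster on the command line.
      REDIS_EXPORTER_IS_CLUSTER: "true"
    ports:
      - "9121:9121"
```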

Prometheus scrape configuration:

Add the following job under scrape_configs in the Prometheus configuration:

  - job_name: 'redis_sjzt_prod'
    http_sd_configs:
      - url: http://redisExporterIP:9121/discover-cluster-nodes
        refresh_interval: 10m
    metrics_path: /scrape
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: redisExporterIP:9121  # same exporter address as in the discovery url above
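To make the relabeling rules above easier to follow, here is an illustrative before/after view of the labels on one discovered target. It is not a configuration file, only a YAML-formatted sketch. The node address 10.0.0.1:7001 is hypothetical, and the redis:// form of the discovered target is an assumption about what /discover-cluster-nodes returns.

```yaml
# Labels of one discovered target (illustration only, not a config file)
before_relabeling:
  __address__: "redis://10.0.0.1:7001"     # hypothetical node returned by discovery
  __metrics_path__: "/scrape"

after_relabeling:
  __param_target: "redis://10.0.0.1:7001"  # sent as the ?target= query parameter
  instance: "redis://10.0.0.1:7001"        # identifies the Redis node, not the exporter
  __address__: "redisExporterIP:9121"      # Prometheus connects to the exporter
  __metrics_path__: "/scrape"
```

In other words, Prometheus requests /scrape from the exporter once per node, passing the node as the target parameter, while the instance label on the resulting series still identifies the individual Redis node.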

Checking the metrics:

The Prometheus Targets page now shows the new job, with one target for every node in the cluster.

Dashboard:

Dashboard template: https://grafana.com/grafana/dashboards/763-redis-dashboard-for-prometheus-redis-exporter-1-x/

Download it, import it into Grafana, and select the corresponding Prometheus data source.

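Alerting:

Next, create alert rules so that Prometheus notifies alertmanager when something goes wrong.

The alert rules used here come from: https://raw.githubusercontent.com/samber/awesome-prometheus-alerts/master/dist/rules/redis/oliver006-redis-exporter.yml

Download the file and add it to Prometheus.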
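One way to add the downloaded file to Prometheus is sketched below. The path /etc/prometheus/rules/redis.yml and the address alertmanagerIP:9093 are placeholders chosen for illustration, not values from the original environment.

```yaml
# prometheus.yml (fragment)
rule_files:
  # Hypothetical location of the downloaded rules file.
  - /etc/prometheus/rules/redis.yml

alerting:
  alertmanagers:
    - static_configs:
        # Placeholder address; 9093 is Alertmanager's default port.
        - targets: ['alertmanagerIP:9093']
```

The rule_files entry tells Prometheus which rules to evaluate, and the alerting block tells it where to send alerts that fire.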

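Because this is a cluster, however, some of the rules need adjusting. Two rules that do not apply are removed: RedisTooManyMasters and RedisDisconnectedSlaves. RedisTooManyMasters, for instance, fires whenever more than one master exists, which is the normal state of a cluster. The modified file is as follows: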

vim redis.yml

groups:
- name: Oliver006RedisExporter
  rules:
    - alert: RedisDown
      expr: 'redis_up == 0'
      for: 0m
      labels:
        severity: critical
      annotations:
        summary: Redis down (instance {{ $labels.instance }})
        description: "Redis instance is down\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
    - alert: RedisMissingMaster
      expr: '(count(redis_instance_info{role="master"}) or vector(0)) < 1'
      for: 0m
      labels:
        severity: critical
      annotations:
        summary: Redis missing master (instance {{ $labels.instance }})
        description: "Redis cluster has no node marked as master.\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
    - alert: RedisReplicationBroken
      expr: 'delta(redis_connected_slaves[1m]) < 0'
      for: 0m
      labels:
        severity: critical
      annotations:
        summary: Redis replication broken (instance {{ $labels.instance }})
        description: "Redis instance lost a slave\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
    - alert: RedisClusterFlapping
      expr: 'changes(redis_connected_slaves[1m]) > 1'
      for: 2m
      labels:
        severity: critical
      annotations:
        summary: Redis cluster flapping (instance {{ $labels.instance }})
        description: "Changes have been detected in Redis replica connection. This can occur when replica nodes lose connection to the master and reconnect (a.k.a flapping).\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
    - alert: RedisMissingBackup
      expr: 'time() - redis_rdb_last_save_timestamp_seconds > 60 * 60 * 24'
      for: 0m
      labels:
        severity: critical
      annotations:
        summary: Redis missing backup (instance {{ $labels.instance }})
        description: "Redis has not been backed up for 24 hours\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
    - alert: RedisOutOfSystemMemory
      expr: 'redis_memory_used_bytes / redis_total_system_memory_bytes * 100 > 90'
      for: 2m
      labels:
        severity: warning
      annotations:
        summary: Redis out of system memory (instance {{ $labels.instance }})
        description: "Redis is running out of system memory (> 90%)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
    - alert: RedisOutOfConfiguredMaxmemory
      expr: 'redis_memory_used_bytes / redis_memory_max_bytes * 100 > 90 and on(instance) redis_memory_max_bytes > 0'
      for: 2m
      labels:
        severity: warning
      annotations:
        summary: Redis out of configured maxmemory (instance {{ $labels.instance }})
        description: "Redis is running out of configured maxmemory (> 90%)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
    - alert: RedisTooManyConnections
      expr: 'redis_connected_clients / redis_config_maxclients * 100 > 90'
      for: 2m
      labels:
        severity: warning
      annotations:
        summary: Redis too many connections (instance {{ $labels.instance }})
        description: "Redis is running out of connections (> 90% used)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
    - alert: RedisNotEnoughConnections
      expr: 'redis_connected_clients < 5'
      for: 2m
      labels:
        severity: warning
      annotations:
        summary: Redis not enough connections (instance {{ $labels.instance }})
        description: "Redis instance should have more connections (> 5)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
    - alert: RedisRejectedConnections
      expr: 'increase(redis_rejected_connections_total[1m]) > 0'
      for: 0m
      labels:
        severity: critical
      annotations:
        summary: Redis rejected connections (instance {{ $labels.instance }})
        description: "Some connections to Redis have been rejected\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

Reload the Prometheus configuration:

curl -X POST http://localhost:9090/-/reload
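The /-/reload endpoint only responds when Prometheus is started with --web.enable-lifecycle. If Prometheus also runs under docker compose, the flag could be added roughly as follows. The service name, image tag, and mounted paths are assumptions for illustration.

```yaml
# docker-compose.yml (sketch)
services:
  prometheus:
    image: prom/prometheus:latest   # assumed image; pin a version in practice
    # Note: setting command replaces the image's default arguments.
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      # Enables the HTTP reload (and quit) endpoints.
      - '--web.enable-lifecycle'
    volumes:
      # Hypothetical host paths for the configuration and the rules files.
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - ./rules:/etc/prometheus/rules
    ports:
      - "9090:9090"
```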

Check that the alert rules have been loaded: open the Prometheus web UI and click Alerts.

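Note: choose the metrics and alert thresholds carefully according to the needs of your own project. The configuration above is only a reference.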
