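16-Zabbix Monitoring Configuration Guide
This document describes how to deploy and configure the Zabbix monitoring system to provide full monitoring of a 3-node Docker cluster.
Overview
Zabbix is an enterprise-grade open-source monitoring solution. It supports:
- Host and container monitoring
- Network device monitoring
- Application monitoring
- Alerting and notifications
Architecture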
┌─────────────────────────────────────────────────────────────────┐
│                   manage-net (172.20.5.0/24)                    │
│                                                                 │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │                      Zabbix Server                      │    │
│  │                    172.20.5.31:10051                    │    │
│  └────────────────────────┬────────────────────────────────┘    │
│                           │                                     │
│  ┌────────────────────────┴────────────────────────────────┐    │
│  │                       Zabbix Web                        │    │
│  │                    172.20.5.32:8080                     │    │
│  │                     (Apache + PHP)                      │    │
│  └────────────────────────┬────────────────────────────────┘    │
│                           │                                     │
│  ┌────────────────────────┴────────────────────────────────┐    │
│  │                      Zabbix MySQL                       │    │
│  │                    172.20.5.33:3306                     │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐          │
│  │ Agent-Node1 │    │ Agent-Node2 │    │ Agent-Node3 │          │
│  │ 172.20.5.41 │    │ 172.20.5.42 │    │ 172.20.5.43 │          │
│  └──────┬──────┘    └──────┬──────┘    └──────┬──────┘          │
└─────────┼──────────────────┼──────────────────┼─────────────────┘
          │                  │                  │
   monitors Node1     monitors Node2     monitors Node3
IP plan
| Component | IP address | Node | Port | Description |
|---|---|---|---|---|
| Zabbix Server | 172.20.5.31 | Node3 | 10051 | Zabbix server (main service) |
| Zabbix Web | 172.20.5.32 | Node3 | 8080 | Web frontend |
| Zabbix MySQL | 172.20.5.33 | Node3 | 3306 | Database |
| Zabbix Agent | 172.20.5.41 | Node1 | 10050 | Agent2 |
| Zabbix Agent | 172.20.5.42 | Node2 | 10050 | Agent2 |
| Zabbix Agent | 172.20.5.43 | Node3 | 10050 | Agent2 |
Deployment steps
Step 1: Create the configuration directories
Run on all nodes:
mkdir -p /opt/cluster-deploy/config/{zabbix,zabbix-mysql}
Step 2: Create the Zabbix MySQL configuration
Run on Node3:
cat > /opt/cluster-deploy/config/zabbix-mysql/my.cnf << 'EOF'
[mysqld]
server-id = 100
bind-address = 0.0.0.0
port = 3306
datadir = /var/lib/mysql
socket = /var/lib/mysql/mysql.sock
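log_bin = mysql-bin
binlog_format = ROW
# expire_logs_days is deprecated in MySQL 8.0; 604800 seconds = 7 days
binlog_expire_logs_seconds = 604800
# With binary logging on, lets the non-SUPER zabbix user create the schema's triggers
log_bin_trust_function_creators = 1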
character-set-server = utf8mb4
collation-server = utf8mb4_bin
max_connections = 200
max_allowed_packet = 64M
innodb_buffer_pool_size = 256M
innodb_log_file_size = 64M
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT
[client]
socket = /var/lib/mysql/mysql.sock
[mysql]
socket = /var/lib/mysql/mysql.sock
EOF
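Zabbix expects its MySQL database to use the utf8mb4 character set with the utf8mb4_bin collation, which is what the file above configures. A minimal sketch of a pre-flight check follows; `check_zabbix_charset` is an illustrative helper name, not a Zabbix or MySQL tool, and it only inspects the file text rather than the running server.

```shell
# Sketch only: check_zabbix_charset is a hypothetical helper.
# It succeeds when the given my.cnf sets both values Zabbix expects.
check_zabbix_charset() {
    # $1 = path to my.cnf
    grep -Eq '^character-set-server[[:space:]]*=[[:space:]]*utf8mb4$' "$1" &&
    grep -Eq '^collation-server[[:space:]]*=[[:space:]]*utf8mb4_bin$' "$1"
}
```

For example, `check_zabbix_charset /opt/cluster-deploy/config/zabbix-mysql/my.cnf && echo "charset OK"` can be run on Node3 before the database container is started for the first time.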
Step 3: Create the Zabbix Server configuration
Run on Node3:
cat > /opt/cluster-deploy/config/zabbix/zabbix_server.conf << 'EOF'
ListenPort=10051
LogType=console
DBHost=zabbix-mysql
DBPort=3306
DBName=zabbix
DBUser=zabbix
DBPassword=ZabbixStr0ng!Pass
HANodeName=ZabbixServer
NodeAddress=172.20.5.31:10051
EOF
Step 4: Create the Zabbix Agent configuration
Create the matching file on each node. Hostname must equal the host name registered in the Zabbix web frontend (Node1, Node2, Node3), otherwise the Server rejects the agent's active checks. Agent2 takes the log rate limit as Plugins.Log.MaxLinesPerSecond rather than MaxLinesPerSecond.
Node1 Agent configuration
cat > /opt/cluster-deploy/config/zabbix/zabbix_agentd-node1.conf << 'EOF'
Server=172.20.5.31
ServerActive=172.20.5.31
Hostname=Node1
BufferSend=5
BufferSize=100
Plugins.Log.MaxLinesPerSecond=20
Timeout=10
LogType=console
EOF
Node2 Agent configuration
cat > /opt/cluster-deploy/config/zabbix/zabbix_agentd-node2.conf << 'EOF'
Server=172.20.5.31
ServerActive=172.20.5.31
Hostname=Node2
BufferSend=5
BufferSize=100
Plugins.Log.MaxLinesPerSecond=20
Timeout=10
LogType=console
EOF
Node3 Agent configuration
cat > /opt/cluster-deploy/config/zabbix/zabbix_agentd-node3.conf << 'EOF'
Server=172.20.5.31
ServerActive=172.20.5.31
Hostname=Node3
BufferSend=5
BufferSize=100
Plugins.Log.MaxLinesPerSecond=20
Timeout=10
LogType=console
EOF
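The three agent files differ only in the Hostname line, so they can also be produced from a single template. The following is a minimal sketch under stated assumptions: `write_agent_conf` and `OUT_DIR` are illustrative names, only the core parameters are written, and the Hostname passed in is assumed to be the host name used later in the web frontend.

```shell
# Sketch only: write_agent_conf and OUT_DIR are illustrative names.
write_agent_conf() {
    # $1 = output directory, $2 = node number, $3 = Hostname value
    cat > "$1/zabbix_agentd-node$2.conf" << EOF
Server=172.20.5.31
ServerActive=172.20.5.31
Hostname=$3
BufferSend=5
BufferSize=100
Timeout=10
LogType=console
EOF
}

# Generate into a scratch directory first, review, then copy into place.
OUT_DIR="${OUT_DIR:-./zabbix-agent-conf}"
mkdir -p "$OUT_DIR"
for n in 1 2 3; do
    write_agent_conf "$OUT_DIR" "$n" "Node$n"
done
grep '^Hostname=' "$OUT_DIR"/zabbix_agentd-node*.conf
```

Writing to a scratch directory keeps the sketch from overwriting files under /opt/cluster-deploy/config/zabbix; copy each file to its node once the contents look right.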
Step 5: Create the Docker Compose files
Node1 Zabbix Agent
cat > /opt/cluster-deploy/docker-compose-zabbix-node1.yml << 'EOF'
services:
  zabbix-agent:
    image: zabbix/zabbix-agent2:alpine-7.0-latest
    container_name: zabbix-agent
    networks:
      manage-net:
        ipv4_address: 172.20.5.41
    volumes:
      - ./config/zabbix/zabbix_agentd-node1.conf:/etc/zabbix/zabbix_agent2.conf:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
    environment:
      - ZBX_SERVER_HOST=172.20.5.31
    restart: unless-stopped
networks:
  manage-net:
    external: true
EOF
Node2 Zabbix Agent
cat > /opt/cluster-deploy/docker-compose-zabbix-node2.yml << 'EOF'
services:
  zabbix-agent:
    image: zabbix/zabbix-agent2:alpine-7.0-latest
    container_name: zabbix-agent
    networks:
      manage-net:
        ipv4_address: 172.20.5.42
    volumes:
      - ./config/zabbix/zabbix_agentd-node2.conf:/etc/zabbix/zabbix_agent2.conf:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
    environment:
      - ZBX_SERVER_HOST=172.20.5.31
    restart: unless-stopped
networks:
  manage-net:
    external: true
EOF
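The Node1 and Node2 agent files are identical apart from the node number in the IP address and in the mounted configuration file name. As a rough sketch, they could be rendered from one function; `write_agent_compose` and `COMPOSE_OUT` are illustrative names, and the `environment:` entry is left out here on the assumption that the mounted configuration file already names the Server.

```shell
# Sketch only: write_agent_compose and COMPOSE_OUT are illustrative names.
write_agent_compose() {
    # $1 = output directory, $2 = node number (1-3, mapped to 172.20.5.41-43)
    cat > "$1/docker-compose-zabbix-node$2.yml" << EOF
services:
  zabbix-agent:
    image: zabbix/zabbix-agent2:alpine-7.0-latest
    container_name: zabbix-agent
    networks:
      manage-net:
        ipv4_address: 172.20.5.4$2
    volumes:
      - ./config/zabbix/zabbix_agentd-node$2.conf:/etc/zabbix/zabbix_agent2.conf:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
    restart: unless-stopped
networks:
  manage-net:
    external: true
EOF
}

# Render into a scratch directory for review before copying to each node.
COMPOSE_OUT="${COMPOSE_OUT:-./zabbix-compose-preview}"
mkdir -p "$COMPOSE_OUT"
for n in 1 2; do
    write_agent_compose "$COMPOSE_OUT" "$n"
done
```

Node3 is not generated here because its file also carries the MySQL, Server, and Web services.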
Node3 Zabbix Server + Web + Agent
cat > /opt/cluster-deploy/docker-compose-zabbix-node3.yml << 'EOF'
services:
  zabbix-agent:
    image: zabbix/zabbix-agent2:alpine-7.0-latest
    container_name: zabbix-agent
    networks:
      manage-net:
        ipv4_address: 172.20.5.43
    volumes:
      - ./config/zabbix/zabbix_agentd-node3.conf:/etc/zabbix/zabbix_agent2.conf:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
    environment:
      - ZBX_SERVER_HOST=172.20.5.31
    restart: unless-stopped
  zabbix-mysql:
    image: mysql:8.0
    container_name: zabbix-mysql
    hostname: zabbix-mysql
    networks:
      manage-net:
        ipv4_address: 172.20.5.33
    volumes:
      - zabbix-mysql-data:/var/lib/mysql
      - ./config/zabbix-mysql/my.cnf:/etc/mysql/conf.d/my.cnf:ro
    environment:
      - MYSQL_ROOT_PASSWORD=RootStr0ng!Pass
      - MYSQL_DATABASE=zabbix
      - MYSQL_USER=zabbix
      - MYSQL_PASSWORD=ZabbixStr0ng!Pass
    command:
      - --default-authentication-plugin=mysql_native_password
    restart: unless-stopped
  zabbix-server:
    image: zabbix/zabbix-server-mysql:alpine-7.0-latest
    container_name: zabbix-server
    hostname: zabbix-server
    networks:
      manage-net:
        ipv4_address: 172.20.5.31
    volumes:
      - zabbix-server-data:/var/lib/zabbix
      - ./config/zabbix/zabbix_server.conf:/etc/zabbix/zabbix_server.conf:ro
    environment:
      - DB_SERVER_HOST=zabbix-mysql
      - MYSQL_DATABASE=zabbix
      - MYSQL_USER=zabbix
      - MYSQL_PASSWORD=ZabbixStr0ng!Pass
      - ZBX_CACHESIZE=128M
      - ZBX_HISTORYCACHESIZE=64M
      - ZBX_TRENDCACHESIZE=32M
      - ZBX_VALUECACHESIZE=64M
    ports:
      - "10051:10051"
    depends_on:
      - zabbix-mysql
    restart: unless-stopped
  zabbix-web:
    image: zabbix/zabbix-web-apache-mysql:alpine-7.0-latest
    container_name: zabbix-web
    hostname: zabbix-web
    networks:
      manage-net:
        ipv4_address: 172.20.5.32
    volumes:
      - zabbix-web-data:/etc/zabbix/web
      - zabbix-web-logs:/var/log/httpd
    environment:
      - DB_SERVER_HOST=zabbix-mysql
      - MYSQL_DATABASE=zabbix
      - MYSQL_USER=zabbix
      - MYSQL_PASSWORD=ZabbixStr0ng!Pass
      - ZBX_SERVER_HOST=172.20.5.31
      - PHP_TZ=Asia/Shanghai
    ports:
      - "8080:8080"
    depends_on:
      - zabbix-mysql
      - zabbix-server
    restart: unless-stopped
networks:
  manage-net:
    external: true
volumes:
  zabbix-mysql-data:
  zabbix-server-data:
  zabbix-web-data:
  zabbix-web-logs:
EOF
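The Node3 file repeats the database passwords as literals in several services. One possible alternative, sketched here, is to keep them in a `.env` file next to the compose file and let Docker Compose substitute them. The variable names `ZBX_DB_PASSWORD` and `ZBX_DB_ROOT_PASSWORD`, the helper `write_env_file`, and the placeholder values are all illustrative.

```shell
# Sketch only: write_env_file and both variable names are illustrative.
write_env_file() {
    # $1 = directory that holds the compose file
    if [ -e "$1/.env" ]; then
        echo "$1/.env already exists, not overwriting" >&2
        return 1
    fi
    (
        umask 077    # create the file readable by its owner only
        cat > "$1/.env" << 'EOF'
ZBX_DB_PASSWORD=change-me
ZBX_DB_ROOT_PASSWORD=change-me-too
EOF
    )
}

# In the compose file the literals would then become, for example:
#   - MYSQL_PASSWORD=${ZBX_DB_PASSWORD}
#   - MYSQL_ROOT_PASSWORD=${ZBX_DB_ROOT_PASSWORD}
```

Calling `write_env_file /opt/cluster-deploy` on Node3 and then replacing the placeholder values keeps the secrets out of the compose file itself; the DBPassword line in zabbix_server.conf would still need the same value.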
Step 6: Start the Zabbix services
# Node1 - start the agent
cd /opt/cluster-deploy
docker compose -f docker-compose-zabbix-node1.yml up -d
# Node2 - start the agent
cd /opt/cluster-deploy
docker compose -f docker-compose-zabbix-node2.yml up -d
# Node3 - start Server + Web + Agent
cd /opt/cluster-deploy
docker compose -f docker-compose-zabbix-node3.yml up -d
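`depends_on` only orders container start-up; it does not wait for MySQL or the Server to be ready. A small retry helper, sketched below, can be used to wait for a readiness command after `up -d`. The function name `retry` and the attempt and delay values in the usage comments are illustrative choices.

```shell
# Sketch only: retry <attempts> <delay-seconds> <command...>
# Runs the command until it succeeds or the attempts are used up.
retry() {
    retry_max=$1
    retry_delay=$2
    shift 2
    retry_i=1
    while [ "$retry_i" -le "$retry_max" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $retry_i/$retry_max failed: $*" >&2
        retry_i=$((retry_i + 1))
        if [ "$retry_i" -le "$retry_max" ]; then
            sleep "$retry_delay"
        fi
    done
    return 1
}

# Possible usage on Node3 after 'up -d' (not executed here):
#   retry 30 5 docker exec zabbix-mysql mysqladmin ping -uroot -p'RootStr0ng!Pass' --silent
#   retry 30 5 docker exec zabbix-agent zabbix_agent2 -t agent.ping
```

The helper returns non-zero when every attempt fails, so it can gate later steps in a deployment script.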
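Initialize the Zabbix web frontend
First access
- Open http://192.168.64.130:8080 in a browser (port 8080 is published by the zabbix-web container on Node3).
- Default credentials:
  - Username: Admin
  - Password: zabbix
Initial setup wizard
The official web image normally builds the frontend configuration from its environment variables, so the wizard may not appear. If it does, use these values:
- Welcome: click "Next step"
- Check of pre-requisites: confirm that every check passes, then click "Next step"
- Configure DB connection: keep the defaults and click "Next step"
- Zabbix server details:
  - Host: 172.20.5.31
  - Port: 10051
  - Name: Zabbix server
- Time zone: select Asia/Shanghai
- Finish: click "Finish"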
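Add hosts to monitoring
Add the Node1 host
- Go to Data collection → Hosts → Create host (Configuration → Hosts in releases before 6.4)
- Fill in the host details:
  - Host name: Node1
  - Host groups: select Linux servers
  - Interfaces:
    - Type: Agent
    - IP address: 172.20.5.41
    - Port: 10050
- Under Templates, link:
  - Linux by Zabbix agent
  - Docker by Zabbix agent 2 (if needed)
- Click Add
Add the Node2 host
Repeat the steps above with Host name Node2 and IP address 172.20.5.42.
Add the Node3 host
Repeat the steps above with Host name Node3 and IP address 172.20.5.43.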
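The same hosts could also be created through the Zabbix JSON-RPC API instead of the web forms. The sketch below only builds a `host.create` request body; `host_create_payload` is an illustrative helper, the group and template IDs are placeholders that would first have to be looked up (for example with `hostgroup.get` and `template.get`), and `ZABBIX_API_TOKEN` in the usage comment is a hypothetical variable holding an API token.

```shell
# Sketch only: host_create_payload <host-name> <agent-ip> <group-id> <template-id>
# Prints a JSON-RPC body for host.create with one agent interface on port 10050.
host_create_payload() {
    cat << EOF
{"jsonrpc":"2.0","method":"host.create","id":1,"params":{
  "host":"$1",
  "interfaces":[{"type":1,"main":1,"useip":1,"ip":"$2","dns":"","port":"10050"}],
  "groups":[{"groupid":"$3"}],
  "templates":[{"templateid":"$4"}]}}
EOF
}

# 0 is a placeholder; real group and template IDs must be looked up first.
host_create_payload Node2 172.20.5.42 "${GROUP_ID:-0}" "${TEMPLATE_ID:-0}"

# Possible submission (not executed here):
#   curl -s -X POST http://192.168.64.130:8080/api_jsonrpc.php \
#        -H 'Content-Type: application/json-rpc' \
#        -H "Authorization: Bearer $ZABBIX_API_TOKEN" \
#        -d "$(host_create_payload Node2 172.20.5.42 "$GROUP_ID" "$TEMPLATE_ID")"
```

Scripting the request is mainly useful when the cluster grows beyond a few nodes; for three hosts the web form is just as quick.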
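Configure items
Common items
| Item | Key | Description |
|---|---|---|
| CPU utilization | system.cpu.util | Total CPU utilization (%) |
| Memory | vm.memory.size | Total memory by default; pass a mode such as available or pused for usage |
| Disk usage | vfs.fs.size[fs,mode] | Space used or free on a file system, for example vfs.fs.size[/,pused] |
| Network traffic | net.if.in[if] / net.if.out[if] | Inbound and outbound traffic on a network interface |
| Container count | docker.info | Docker system information as JSON, including the container count |
Add a custom item
- Go to Data collection → Hosts (Configuration → Hosts in releases before 6.4)
- Click Items in the row of the host
- Click Create item
- Fill in the item details:
  - Name: Total containers
  - Type: Zabbix agent
  - Key: docker.info
  - Type of information: Numeric (unsigned)
  - Preprocessing: JSONPath $.Containers (docker.info returns JSON, so the count has to be extracted before it can be stored as a number)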
Configure alerting
Create the media type
- Go to Alerts → Media types (Administration → Media types in releases before 6.4)
- Click Email
- Enter the SMTP server settings:
  - SMTP server: smtp.example.com
  - SMTP server port: 587
  - SMTP helo: zabbix
  - SMTP email: zabbix@example.com
- Click Update
Create a trigger
- Go to Data collection → Hosts
- Select the host that the trigger belongs to
- Click Triggers → Create trigger
- Fill in the trigger details:
  - Name: High CPU utilization
  - Severity: Warning
  - Expression: last(/Node1/system.cpu.util)>80
- Click Add
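Since Zabbix 5.4, trigger expressions are written as `function(/host/key)`, so the CPU trigger for each node differs only in the host name. A short sketch that prints the expression for all three nodes follows; `cpu_trigger_expr` is an illustrative helper, and the host names and the threshold of 80 are the values used in this guide.

```shell
# Sketch only: cpu_trigger_expr <host> <threshold-percent>
cpu_trigger_expr() {
    printf 'last(/%s/system.cpu.util)>%s\n' "$1" "$2"
}

for host in Node1 Node2 Node3; do
    cpu_trigger_expr "$host" 80
done
# prints:
#   last(/Node1/system.cpu.util)>80
#   last(/Node2/system.cpu.util)>80
#   last(/Node3/system.cpu.util)>80
```

Defining the trigger once on a template linked to all three hosts avoids maintaining three copies.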
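Create an action
- Go to Alerts → Actions (Configuration → Actions in releases before 6.4)
- Select Trigger actions → Create action
- Fill in the action details:
  - Name: CPU alert notification
  - Conditions: Trigger = High CPU utilization
- On the Operations tab, add an operation:
  - Operation type: Send message
  - Send to: select a user group
  - Media type: Email
- Click Add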
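Monitoring Docker with Zabbix
Enable the Docker monitoring template
Zabbix Agent2 has built-in Docker monitoring support. Configure it as follows:
- Set the Docker plugin endpoint in the agent configuration (shown for Node1; use the matching file on the other nodes). The value is also the plugin's default, so this only makes it explicit:
cat >> /opt/cluster-deploy/config/zabbix/zabbix_agentd-node1.conf << 'EOF'
# Docker monitoring
Plugins.Docker.Endpoint=unix:///var/run/docker.sock
EOF
- Restart the agent:
docker restart zabbix-agent
If Docker items then report a permission error, the agent process probably cannot read /var/run/docker.sock and needs access to the socket's group.
- Import the Docker template in the Zabbix web frontend (only needed if "Docker by Zabbix agent 2" is not already listed under Templates):
  - Download the template: https://git.zabbix.com/projects/ZBX/repos/zabbix/raw/templates/app/docker.yaml
  - Go to Data collection → Templates → Import
  - Select the downloaded yaml file
  - Click Import
Monitor container status
Available Docker items:
- docker.container_info[ID]: container information
- docker.container_stats[ID]: container resource statistics
- docker.containers: container list
- docker.images: image list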
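Verify monitoring
Check host status
- Go to Monitoring → Hosts
- Confirm that the ZBX availability icon is green for all 3 nodes
View the latest data
- Go to Monitoring → Latest data
- Select a host to view its collected values
View graphs
- Go to Monitoring → Hosts and click Graphs in the row of the host (Monitoring → Graphs in older releases)
- Select the items to view their trend graphs
Common queries
List containers
docker exec zabbix-agent zabbix_agent2 -t docker.containers
Check system load
docker exec zabbix-agent zabbix_agent2 -t system.cpu.load
Check memory
docker exec zabbix-agent zabbix_agent2 -t vm.memory.size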
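When these queries are used in scripts, only the value is usually wanted. The sketch below assumes the single-line test output has the shape `key [s|value]`; that format and the helper name `item_value` are assumptions, so check the real output of your agent version before relying on it. Multi-line values are not handled.

```shell
# Sketch only: item_value reads 'zabbix_agent2 -t' output on stdin and prints
# the value part, ASSUMING each line looks like:  key   [s|value]
item_value() {
    sed -n 's/^[^[]*\[[a-z]|\(.*\)]$/\1/p'
}

# Demonstration on a sample line (the number is made up):
printf 'vm.memory.size                                [s|8201064448]\n' | item_value
# prints: 8201064448

# Possible usage (not executed here):
#   docker exec zabbix-agent zabbix_agent2 -t vm.memory.size | item_value
```

Lines that do not match the assumed shape, such as error messages, produce no output, which makes a failed query easy to detect with an empty-string check.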
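Troubleshooting
Zabbix Server fails to start
# View the logs
docker logs zabbix-server
# Check the MySQL connection
docker exec zabbix-server nc -zv zabbix-mysql 3306
Agent cannot connect to the Server
# View the agent logs
docker logs zabbix-agent
# Check that the agent answers locally (this does not test the network path to the Server)
docker exec zabbix-agent zabbix_agent2 -t agent.ping
Database initialization fails
Zabbix initializes the database automatically on first start. If that fails:
# Remove the volumes and initialize again (run on Node3; this deletes ALL Zabbix data)
cd /opt/cluster-deploy
docker compose -f docker-compose-zabbix-node3.yml down -v
docker compose -f docker-compose-zabbix-node3.yml up -d
Web frontend shows "Zabbix server is not running"
# Check the Zabbix Server process
docker exec zabbix-server pgrep -a zabbix_server
# Restart the services
docker restart zabbix-server zabbix-web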
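Performance tuning
Adjust the Housekeeper settings
# Edit zabbix_server.conf
HousekeepingFrequency=4
MaxHousekeeperDelete=5000
Configure data retention periods
- Go to Administration → Housekeeping (Administration → General → Housekeeper in older releases)
- Adjust how many days of history and trends are kept
Adjust cache sizes
# Edit zabbix_server.conf (the compose file in step 5 sets ZBX_CACHESIZE=128M; keep the two values consistent)
CacheSize=64M
StartPollers=10
StartPollersUnreachable=5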
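The compose file in step 5 already passes cache sizes to the Server container as `ZBX_CACHESIZE`, `ZBX_HISTORYCACHESIZE`, and similar variables. Assuming the image follows the same `ZBX_` plus upper-cased parameter name pattern for the settings above (worth confirming in the image documentation), they could be converted into compose `environment:` entries as sketched here. `conf_to_env` is an illustrative helper.

```shell
# Sketch only: conf_to_env "Name=value" prints "ZBX_NAME=value".
# The naming pattern is an assumption taken from the variables in step 5.
conf_to_env() {
    conf_name=${1%%=*}      # text before the first '='
    conf_value=${1#*=}      # text after the first '='
    printf 'ZBX_%s=%s\n' \
        "$(printf '%s' "$conf_name" | tr '[:lower:]' '[:upper:]')" \
        "$conf_value"
}

for setting in HousekeepingFrequency=4 MaxHousekeeperDelete=5000 \
               StartPollers=10 StartPollersUnreachable=5; do
    printf '      - %s\n' "$(conf_to_env "$setting")"
done
```

The printed lines are indented to paste under the `environment:` key of the zabbix-server service.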
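Backup and restore
Back up Zabbix data
# Back up the MySQL data (the password is quoted so that an interactive shell does not treat "!" as history expansion)
docker exec zabbix-mysql mysqldump -uroot -p'RootStr0ng!Pass' zabbix > zabbix_backup.sql
# Back up the configuration files
tar czf zabbix_config_backup.tar.gz /opt/cluster-deploy/config/zabbix
Restore Zabbix data
# Restore the MySQL data
docker exec -i zabbix-mysql mysql -uroot -p'RootStr0ng!Pass' zabbix < zabbix_backup.sql
# Restore the configuration files
tar xzf zabbix_config_backup.tar.gz -C /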
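For scheduled backups it helps to timestamp each dump and keep only the most recent few. The following is a minimal sketch: `prune_backups`, `BACKUP_DIR`, the `zabbix_backup_<timestamp>.sql.gz` naming scheme, and the retention count of 7 are illustrative choices, not something Zabbix prescribes.

```shell
# Sketch only: prune_backups <dir> <keep>
# Deletes all but the newest <keep> files named zabbix_backup_<timestamp>.sql.gz.
# Timestamps of the form YYYYmmdd_HHMMSS sort oldest-first lexicographically.
prune_backups() {
    prune_total=$(ls -1 "$1"/zabbix_backup_*.sql.gz 2>/dev/null | wc -l)
    prune_excess=$((prune_total - $2))
    [ "$prune_excess" -gt 0 ] || return 0
    ls -1 "$1"/zabbix_backup_*.sql.gz | sort | head -n "$prune_excess" |
        while IFS= read -r old; do
            rm -f -- "$old"
        done
}

# Possible nightly job on Node3 (not executed here):
#   BACKUP_DIR=/opt/cluster-deploy/backup
#   stamp=$(date +%Y%m%d_%H%M%S)
#   docker exec zabbix-mysql mysqldump -uroot -p'RootStr0ng!Pass' \
#       --single-transaction zabbix | gzip > "$BACKUP_DIR/zabbix_backup_$stamp.sql.gz"
#   prune_backups "$BACKUP_DIR" 7
```

`--single-transaction` lets mysqldump take a consistent snapshot of InnoDB tables without locking them for the whole dump.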