Contents
1. Kmonitor package unpacking, installation, and deployment
1.1. Environment preparation
1.2. Copy and extract
1.3. kadb_exporter
1.3.1 Edit the application.yml file
1.3.2 Edit the connection pool
1.3.3 Edit the startup file (optional)
1.4. H2 database
1.4.1 Enter h2db and edit the startup file (optional)
1.4.2 Open the H2 database web page and connect
1.4.3 Start kadb_exporter
1.5. node_exporter
1.6. Prometheus
1.6.1 node_conf
1.6.2 Edit the prometheus.yml configuration file and start Prometheus
1.6.3 Open the Prometheus web page and check probe status
1.7. Grafana
1.7.1 Plugin directory (optional)
1.7.2 Start the Grafana panel
1.7.3 Open the Grafana panel and check status
2. Notes
2.1. H2 database
2.2. node_exporter
2.3. kadb_exporter
2.4. KADB
2.5. Grafana panel parameters
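## 1. Kmonitor package unpacking, installation, and deployment
### 1.1. Environment preparation
Operating system: CentOS 7+
Host names and IP addresses of the cluster: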
| IP address (internal) | Internal NIC | IP address (external) | External NIC | Host name |
|---------------|----------|--------------|----------|-----------|
| 172.18.35.208 | bondib0 | 10.1.35.208 | bond0 | dwabamg01 |
| 172.18.35.209 | bondib0 | 10.1.35.209 | bond0 | dwabamg02 |
| 172.18.35.211 | bondib0 | 10.1.35.211 | bond0 | dwabasg01 |
| 172.18.35.212 | bondib0 | 10.1.35.212 | bond0 | dwabasg02 |
| 172.18.35.213 | bondib0 | 10.1.35.213 | bond0 | dwabasg03 |
| 172.18.35.214 | bondib0 | 10.1.35.214 | bond0 | dwabasg04 |
| 172.18.35.215 | bondib0 | 10.1.35.215 | bond0 | dwabasg05 |
| 172.18.35.216 | bondib0 | 10.1.35.216 | bond0 | dwabasg06 |
| 172.18.35.217 | bondib0 | 10.1.35.217 | bond0 | dwabasg07 |
| 172.18.35.218 | bondib0 | 10.1.35.218 | bond0 | dwabasg08 |

Floating IP: 10.1.35.210
Gateway address: 10.1.35.254
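Database: postgres
Cluster user name: xinjiang
Password: 123456 everywhere. To change it, the compiled (encrypted) password must also be changed.
Create the operating system user xinjiang on every node of the cluster by running the following as root on each machine: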
| Host name | Command |
|-----------|------------------------------|
| dwabamg01 | useradd -g mppadmin xinjiang |
| dwabamg02 | useradd -g mppadmin xinjiang |
| dwabasg01 | useradd -g mppadmin xinjiang |
| dwabasg02 | useradd -g mppadmin xinjiang |
| dwabasg03 | useradd -g mppadmin xinjiang |
| dwabasg04 | useradd -g mppadmin xinjiang |
| dwabasg05 | useradd -g mppadmin xinjiang |
| dwabasg06 | useradd -g mppadmin xinjiang |
| dwabasg07 | useradd -g mppadmin xinjiang |
| dwabasg08 | useradd -g mppadmin xinjiang |
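
Because the same command runs on every node, the command list can be generated from the host names instead of typed ten times. The following is a minimal sketch, assuming root SSH access to each host is available (the document does not say how the commands are delivered); `build_useradd_commands` is a hypothetical helper that only builds the command strings and does not execute anything.

```python
# Host names taken from the cluster table above.
HOSTS = ["dwabamg01", "dwabamg02"] + [f"dwabasg{i:02d}" for i in range(1, 9)]


def build_useradd_commands(hosts, user="xinjiang", group="mppadmin"):
    """Return one 'ssh root@<host> useradd ...' command string per host.

    Delivering the command over ssh as root is an assumption; the useradd
    arguments themselves are the ones given in the table.
    """
    return [f"ssh root@{host} useradd -g {group} {user}" for host in hosts]


# Print the commands so they can be reviewed before being run by hand.
for command in build_useradd_commands(HOSTS):
    print(command)
```

Printing rather than executing keeps the step reviewable; the group mppadmin must already exist on each node for `useradd -g` to succeed.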
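### 1.2. Copy and extract
Copy centos7_amd64.tar.gz to the home directory of user xinjiang on cluster node 10.1.35.209 and extract it there. The package must be extracted as the monitored cluster user, so starting from root, run:
su - xinjiang
tar -xvf centos7_amd64.tar.gz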
### 1.3. kadb_exporter
#### 1.3.1 Edit the application.yml file
As user xinjiang, edit application.yml in the /home/xinjiang/centos7_amd64/kadb_exporter directory:
\[xinjiang@dwabamg02 kadb_exporter\]$ pwd
/home/xinjiang/centos7_amd64/kadb_exporter
\[xinjiang@dwabamg02 kadb_exporter\]$ vi application.yml
spring:
  profiles:
    active: development
server:
  port: 10000 # kadb_exporter port
---
spring:
  profiles: development
  datasource:
    url: jdbc:h2:tcp://10.1.35.209:10002//home/xinjiang/centos7_amd64/h2db/data/operator # H2 database URL
    username: root
    password: ENC(rVBkqsNjKhfSrkZazEoIQzMUlEejr6qNfP6U8m66JS2nSupFuAJpeRReWH_w_y39eGI6pZwbKenptjFxD4KJiuTIncrK2h3mBCiaTOTQHKX32rXD6NrW1gmGSVmE0blBXOdLZZkEfTVEWOAHR8IdldVCkK8anzOEC7em68qJZ98) # encrypted password of the H2 connection user root; the plaintext is 123456
    hikari:
      pool-name: default
      connection-test-query: select current_timestamp;
      minimum-idle: 2
      maximum-pool-size: 10
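
The datasource URL combines the H2 server host, the H2 TCP port, and the database file path, and all three must match the H2 instance started in section 1.4. A minimal sketch for splitting the URL into those parts so they can be checked individually is shown below; `parse_h2_tcp_url` is a hypothetical helper and only handles the `jdbc:h2:tcp://host:port/path` form used here.

```python
from urllib.parse import urlsplit


def parse_h2_tcp_url(jdbc_url):
    """Split a 'jdbc:h2:tcp://host:port/path' URL into (host, port, db_path)."""
    prefix = "jdbc:h2:"
    if not jdbc_url.startswith(prefix + "tcp://"):
        raise ValueError(f"not an H2 TCP URL: {jdbc_url}")
    parts = urlsplit(jdbc_url[len(prefix):])  # parses 'tcp://host:port/path'
    # Everything after the first '/' is the database path, so '//home/...'
    # in the URL denotes the absolute path '/home/...'.
    db_path = parts.path[1:] if parts.path.startswith("//") else parts.path
    return parts.hostname, parts.port, db_path


url = "jdbc:h2:tcp://10.1.35.209:10002//home/xinjiang/centos7_amd64/h2db/data/operator"
host, port, db_path = parse_h2_tcp_url(url)
print(host, port, db_path)
```

The host and port obtained this way are the ones the H2 server must listen on, and the path is where the `operator` database files are expected.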
#### 1.3.2 Edit the connection pool
As user xinjiang, edit jdbc_pool_default.xml in the /home/xinjiang/centos7_amd64/kadb_exporter/conf directory:
\[xinjiang@dwabamg02 conf\]$ pwd
/home/xinjiang/centos7_amd64/kadb_exporter/conf
\[xinjiang@dwabamg02 conf\]$ vi jdbc_pool_default.xml
  },
  {
    "labels": {
      "desc": "cluster master node",
      "project": "新疆农信",
      "system": "xinjiang",
      "instance": "10.1.35.208",
      "hostname": "dwabamg01",
      "cluster": "kadb_cluster",
      "service": "node_exporter"
    },
    "targets": [
      "10.1.35.208:10003"
    ]
  },
  {
    "labels": {
      "desc": "cluster compute node 1",
      "project": "新疆农信",
      "system": "xinjiang",
      "instance": "10.1.35.211",
      "hostname": "dwabasg01",
      "cluster": "kadb_cluster",
      "service": "node_exporter"
    },
    "targets": [
      "10.1.35.211:10003"
    ]
  },
  {
    "labels": {
      "desc": "cluster compute node 2",
      "project": "新疆农信",
      "system": "xinjiang",
      "instance": "10.1.35.212",
      "hostname": "dwabasg02",
      "cluster": "kadb_cluster",
      "service": "node_exporter"
    },
    "targets": [
      "10.1.35.212:10003"
    ]
  },
  {
    "labels": {
      "desc": "cluster compute node 3",
      "project": "新疆农信",
      "system": "xinjiang",
      "instance": "10.1.35.213",
      "hostname": "dwabasg03",
      "cluster": "kadb_cluster",
      "service": "node_exporter"
    },
    "targets": [
      "10.1.35.213:10003"
    ]
  },
  {
    "labels": {
      "desc": "cluster compute node 4",
      "project": "新疆农信",
      "system": "xinjiang",
      "instance": "10.1.35.214",
      "hostname": "dwabasg04",
      "cluster": "kadb_cluster",
      "service": "node_exporter"
    },
    "targets": [
      "10.1.35.214:10003"
    ]
  },
  {
    "labels": {
      "desc": "cluster compute node 5",
      "project": "新疆农信",
      "system": "xinjiang",
      "instance": "10.1.35.215",
      "hostname": "dwabasg05",
      "cluster": "kadb_cluster",
      "service": "node_exporter"
    },
    "targets": [
      "10.1.35.215:10003"
    ]
  },
  {
    "labels": {
      "desc": "cluster compute node 6",
      "project": "新疆农信",
      "system": "xinjiang",
      "instance": "10.1.35.216",
      "hostname": "dwabasg06",
      "cluster": "kadb_cluster",
      "service": "node_exporter"
    },
    "targets": [
      "10.1.35.216:10003"
    ]
  },
  {
    "labels": {
      "desc": "cluster compute node 7",
      "project": "新疆农信",
      "system": "xinjiang",
      "instance": "10.1.35.217",
      "hostname": "dwabasg07",
      "cluster": "kadb_cluster",
      "service": "node_exporter"
    },
    "targets": [
      "10.1.35.217:10003"
    ]
  },
  {
    "labels": {
      "desc": "cluster compute node 8",
      "project": "新疆农信",
      "system": "xinjiang",
      "instance": "10.1.35.218",
      "hostname": "dwabasg08",
      "cluster": "kadb_cluster",
      "service": "node_exporter"
    },
    "targets": [
      "10.1.35.218:10003"
    ]
  }
]
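
Hand-editing nine near-identical entries makes copy-and-paste slips likely, for example an `instance` label that no longer matches its target. One way to avoid that is to generate the entries from the host list and check them before writing the file. The sketch below is illustrative only: `build_entry` and `mismatched_entries` are hypothetical helpers, the `desc` wording is a placeholder, and the entry that precedes the fragment shown above is not reproduced.

```python
import json

# (description, external IP, host name) for the node entries shown above.
NODES = [("cluster master node", "10.1.35.208", "dwabamg01")] + [
    (f"cluster compute node {i}", f"10.1.35.{210 + i}", f"dwabasg{i:02d}")
    for i in range(1, 9)
]


def build_entry(desc, ip, hostname, port=10003):
    """Build one file_sd entry; 'instance' and the target share the same IP."""
    return {
        "labels": {
            "desc": desc,
            "project": "新疆农信",
            "system": "xinjiang",
            "instance": ip,
            "hostname": hostname,
            "cluster": "kadb_cluster",
            "service": "node_exporter",
        },
        "targets": [f"{ip}:{port}"],
    }


def mismatched_entries(entries):
    """Return host names whose 'instance' label differs from the target host."""
    bad = []
    for entry in entries:
        target_host = entry["targets"][0].rsplit(":", 1)[0]
        if entry["labels"]["instance"] != target_host:
            bad.append(entry["labels"]["hostname"])
    return bad


entries = [build_entry(*node) for node in NODES]
print(json.dumps(entries[0], ensure_ascii=False, indent=2))
print("mismatched:", mismatched_entries(entries))
```

`mismatched_entries` can also be run against an existing node_kadb_info.json after loading it with `json.load`, which catches label and target disagreements in a file that was edited by hand.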
#### 1.6.2 Edit the prometheus.yml configuration file and start Prometheus
\[xinjiang@dwabamg02 prometheus\]$ pwd
/home/xinjiang/centos7_amd64/prometheus
\[xinjiang@dwabamg02 prometheus\]$ vi prometheus.yml
global:
  scrape_interval: 10s
  evaluation_interval: 10s
scrape_configs:
  - job_name: 'consul'
    static_configs:
      - targets: ['10.1.35.209:10000']
        labels:
          cluster: 'kadb_cluster'
          service: 'kadb_exporter'
          kadburl: 'http://10.1.35.209:10000'
      - targets: ['10.1.35.209:10003']
        labels:
          cluster: 'kadb_cluster'
          service: 'node_exporter'
          kadburl: 'http://10.1.35.209:10003'
      - targets: ['10.1.35.208:10003']
        labels:
          cluster: 'kadb_cluster'
          service: 'node_exporter'
          kadburl: 'http://10.1.35.208:10003'
      - targets: ['10.1.35.211:10003']
        labels:
          cluster: 'kadb_cluster'
          service: 'node_exporter'
          kadburl: 'http://10.1.35.211:10003'
      - targets: ['10.1.35.212:10003']
        labels:
          cluster: 'kadb_cluster'
          service: 'node_exporter'
          kadburl: 'http://10.1.35.212:10003'
      - targets: ['10.1.35.213:10003']
        labels:
          cluster: 'kadb_cluster'
          service: 'node_exporter'
          kadburl: 'http://10.1.35.213:10003'
      - targets: ['10.1.35.214:10003']
        labels:
          cluster: 'kadb_cluster'
          service: 'node_exporter'
          kadburl: 'http://10.1.35.214:10003'
      - targets: ['10.1.35.215:10003']
        labels:
          cluster: 'kadb_cluster'
          service: 'node_exporter'
          kadburl: 'http://10.1.35.215:10003'
      - targets: ['10.1.35.216:10003']
        labels:
          cluster: 'kadb_cluster'
          service: 'node_exporter'
          kadburl: 'http://10.1.35.216:10003'
      - targets: ['10.1.35.217:10003']
        labels:
          cluster: 'kadb_cluster'
          service: 'node_exporter'
          kadburl: 'http://10.1.35.217:10003'
      - targets: ['10.1.35.218:10003']
        labels:
          cluster: 'kadb_cluster'
          service: 'node_exporter'
          kadburl: 'http://10.1.35.218:10003'
    file_sd_configs:
      - files:
          - /home/xinjiang/centos7_amd64/prometheus/node_conf/node_kadb_info.json
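
In every `static_configs` entry the `kadburl` label repeats the target address, so the two are easy to let drift apart when the file is edited by hand. A minimal sketch that renders the entries from the IP list, deriving `kadburl` from the target, might look like the following; `static_config_entry` is a hypothetical helper, and the indentation it emits assumes the entries sit under `static_configs` of a job nested as in a standard Prometheus configuration.

```python
# External IPs that run node_exporter, in the order used in prometheus.yml.
NODE_IPS = ["10.1.35.209", "10.1.35.208"] + [f"10.1.35.{n}" for n in range(211, 219)]


def static_config_entry(ip, port, service, cluster="kadb_cluster"):
    """Render one static_configs entry; kadburl is derived from the target."""
    target = f"{ip}:{port}"
    return "\n".join([
        f"      - targets: ['{target}']",
        "        labels:",
        f"          cluster: '{cluster}'",
        f"          service: '{service}'",
        f"          kadburl: 'http://{target}'",
    ])


# One kadb_exporter entry on port 10000, then one node_exporter entry per node.
entries = [static_config_entry("10.1.35.209", 10000, "kadb_exporter")]
entries += [static_config_entry(ip, 10003, "node_exporter") for ip in NODE_IPS]
print("\n".join(entries))
```

The printed text can be pasted under `static_configs:` before Prometheus is started.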
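\[xinjiang@dwabamg02 prometheus\]$ sh start.sh # start Prometheus
#### 1.6.3 Open the Prometheus web page and check probe status
The status is normal when every probe shown on the page is in the "UP" state.
### 1.7. Grafana
#### 1.7.1 Plugin directory (optional)
*If no monitoring panels have been added, nothing needs to be changed here.*
\[xinjiang@dwabamg02 plugins\]$ pwd
/home/xinjiang/centos7_amd64/kadb_monitor/plugins
\[xinjiang@dwabamg02 kadb_monitor\]$ cd plugins/
\[xinjiang@dwabamg02 plugins\]$ ll
total 20
drwxrwxr-x. 3 xinjiang xinjiang 70 Jul 5 16:01 AlertManagerPanel
drwxrwxr-x. 3 xinjiang xinjiang 38 Oct 20 16:09 ClusterVersionStatusPanel
drwxrwxr-x. 3 xinjiang xinjiang 4096 Jul 5 16:01 data-table-plugin
drwxrwxr-x. 3 xinjiang xinjiang 70 Jul 5 16:01 InstanceStatusPanel
drwxrwxr-x. 3 xinjiang xinjiang 98 Oct 9 09:44 kadb_AlertNowList_plugin
drwxrwxr-x. 3 xinjiang xinjiang 98 Sep 24 11:35 kadb_ClusterInfoTable_plugin
drwxrwxr-x. 3 xinjiang xinjiang 4096 Jul 5 16:01 kadb_piechart_panel
drwxrwxr-x. 3 xinjiang xinjiang 4096 Oct 9 11:15 kadb_TopologyTable_plugin
drwxrwxr-x. 3 xinjiang xinjiang 4096 Jul 5 16:01 selected-table-plugin
drwxrwxr-x. 3 xinjiang xinjiang 70 Jul 5 16:01 SessionListPanel
drwxrwxr-x. 3 xinjiang xinjiang 4096 Jul 5 16:01 topology-plugin
Cluster version and topology information is displayed by the plugins kadb_AlertNowList_plugin, ClusterVersionStatusPanel, and kadb_TopologyTable_plugin.
#### 1.7.2 Start the Grafana panel
\[xinjiang@dwabamg02 kadb_monitor\]$ pwd
/home/xinjiang/centos7_amd64/kadb_monitor
\[xinjiang@dwabamg02 kadb_monitor\]$ sh start.sh
#### 1.7.3 Open the Grafana panel and check status
Browser address: http://10.1.35.208:3000
Google Chrome is recommended. If Grafana opens and displays correctly, no parameters need to be changed; change them only as the situation requires.
If the cluster topology information is not displayed correctly, click the down arrow shown in the figure and choose "Edit".
In the "Visualization" menu on the right side of the panel, change the IP address to 10.1.35.209.
Note: this IP address is the address and port (10000) of kadb_exporter. If it is an internal address, first map the kadb_exporter address to an external address that the browser can reach, then enter that address. The browser side must communicate with kadb_exporter directly.
Make the same change in the "Latest alerts" and "Cluster version" panels.
If node resource information is not displayed correctly (the monitoring page opens slowly, an error is shown, and kadb_monitor.log reports that it cannot connect to 192.168.0.30:10004, as in the figure), then kmonitor has not been configured with the correct Prometheus address. Select the gear icon on the left side of the figure above and configure the actual Prometheus address.
If curl against the kadb_exporter address on port 10000 returns the information shown below, communication between the browser and kadb_exporter is working.
Monitoring panel edit page:
Node monitoring page:
Host monitoring page: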
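
The curl check against kadb_exporter on port 10000 can also be scripted, which is convenient when it has to be repeated from the machine that runs the browser. The following is a minimal sketch using only the Python standard library. `exporter_reachable` is a hypothetical helper, and requesting the root path `/` is an assumption, since the document does not state which path kadb_exporter serves.

```python
from urllib.error import URLError
from urllib.request import urlopen


def exporter_reachable(host, port=10000, timeout=5, opener=urlopen):
    """Return True if an HTTP GET to http://host:port/ gets a 2xx or 3xx reply.

    'opener' is injectable so the function can be exercised without a network;
    by default it is urllib.request.urlopen.
    """
    url = f"http://{host}:{port}/"  # root path is an assumption
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


# Example call, to be run from the browser machine:
#   exporter_reachable("10.1.35.209")
```

A `False` result from the browser machine while the same call succeeds on the cluster node suggests that the address mapping described above is missing.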
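## 2. Notes
### 2.1. H2 database
H2 must be running, and the H2 database web page must be reachable.
### 2.2. node_exporter
Every monitored IP must have a node_exporter directory, and node_exporter must be started there.
### 2.3. kadb_exporter
Each cluster needs one kadb_exporter, located on the master node. A single kadb_exporter uses at least 2.5 GB of physical memory.
### 2.4. KADB
The KADB database must be running normally.
### 2.6. Change default parameters
Select the settings entry shown here.
Select the variables entry shown here.
Change request_url to the address of the Prometheus node.
### 2.7 Frequent log scanning causes heavy disk I/O
Edit the kadb_exporter configuration file /home/mppadmin/centos7_amd64/kadb_exporter/conf/schedules.xml and delete the following parts.
Log collection schedule:
Schedules related to disk data distribution: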
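
The requirements in 2.1 to 2.3 amount to a set of TCP ports that must accept connections: 10002 for H2 and 10000 for kadb_exporter on the monitoring node, and 10003 for node_exporter on every monitored IP. A minimal sketch of such a check follows. `port_open` and `required_endpoints` are hypothetical helpers; the port numbers are the ones used in the configuration files in this document, and an open port only shows that something is listening, not that the service is healthy.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def required_endpoints(monitor_host, node_ips):
    """List (name, host, port) for the services the notes require to be up."""
    endpoints = [
        ("h2", monitor_host, 10002),
        ("kadb_exporter", monitor_host, 10000),
    ]
    endpoints += [("node_exporter", ip, 10003) for ip in node_ips]
    return endpoints


# Show what would be checked; call port_open(host, port) on each to test it.
for name, host, port in required_endpoints("10.1.35.209", ["10.1.35.208", "10.1.35.211"]):
    print(name, host, port)
```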