Contents
1. Kmonitor unpacking, installation, and deployment
1.1 Environment preparation
1.2 Copy and extract
1.3 kadb_exporter
1.3.1 Edit application.yml
1.3.2 Edit the connection pool
1.3.3 Edit the startup script (optional)
1.4 H2 database
1.4.1 Enter h2db and edit the startup script (optional)
1.4.2 Open the H2 web console and connect
1.4.3 Start kadb_exporter
1.5 node_exporter
1.6 Prometheus
1.6.1 node_conf
1.6.2 Edit prometheus.yml and start Prometheus
1.6.3 Open the Prometheus web page and check the probe status
1.7 Grafana
1.7.1 Plugin directory (optional)
1.7.2 Start the Grafana panel
1.7.3 Open the Grafana panel and check its status
2. Notes
2.1 H2 database
2.2 node_exporter
2.3 kadb_exporter
2.4 KADB
2.5 Grafana panel parameters
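1. Kmonitor unpacking, installation, and deployment
1.1 Environment preparation
Operating system: CentOS 7+
Cluster hostnames and their IP addresses:
| IP address (internal) | Internal NIC | IP address (external) | External NIC | Hostname |
|---------------|----------|--------------|----------|-----------|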
| 172.18.35.208 | bondib0 | 10.1.35.208 | bond0 | dwabamg01 |
| 172.18.35.209 | bondib0 | 10.1.35.209 | bond0 | dwabamg02 |
| 172.18.35.211 | bondib0 | 10.1.35.211 | bond0 | dwabasg01 |
| 172.18.35.212 | bondib0 | 10.1.35.212 | bond0 | dwabasg02 |
| 172.18.35.213 | bondib0 | 10.1.35.213 | bond0 | dwabasg03 |
| 172.18.35.214 | bondib0 | 10.1.35.214 | bond0 | dwabasg04 |
| 172.18.35.215 | bondib0 | 10.1.35.215 | bond0 | dwabasg05 |
| 172.18.35.216 | bondib0 | 10.1.35.216 | bond0 | dwabasg06 |
| 172.18.35.217 | bondib0 | 10.1.35.217 | bond0 | dwabasg07 |
| 172.18.35.218 | bondib0 | 10.1.35.218 | bond0 | dwabasg08 |
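| Floating IP: 10.1.35.210 | | | | |
| Gateway address: 10.1.35.254 | | | | |
Database: postgres
Cluster user name: xinjiang
Password: 123456 throughout. To change it, the encrypted password strings in the configuration files must be regenerated as well.
Create the operating system user xinjiang on every node of the cluster. As root, run the following on each host:
| Hostname | Command |
|-----------|------------------------------|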
| dwabamg01 | useradd -g mppadmin xinjiang |
| dwabamg02 | useradd -g mppadmin xinjiang |
| dwabasg01 | useradd -g mppadmin xinjiang |
| dwabasg02 | useradd -g mppadmin xinjiang |
| dwabasg03 | useradd -g mppadmin xinjiang |
| dwabasg04 | useradd -g mppadmin xinjiang |
| dwabasg05 | useradd -g mppadmin xinjiang |
| dwabasg06 | useradd -g mppadmin xinjiang |
| dwabasg07 | useradd -g mppadmin xinjiang |
| dwabasg08 | useradd -g mppadmin xinjiang |
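Because the same command is run on all ten hosts, this step can be scripted. The following is a minimal sketch, assuming passwordless SSH as root from the current machine and that the mppadmin group already exists on every host; neither is stated in this guide. The RUN variable and the useradd_cmd helper are illustrative names. By default the script only prints the commands it would run.

```shell
#!/bin/bash
# Hostnames taken from the table above.
HOSTS="dwabamg01 dwabamg02 dwabasg01 dwabasg02 dwabasg03 dwabasg04 dwabasg05 dwabasg06 dwabasg07 dwabasg08"

# Print the command that creates the user on one host (skipped if the user exists).
useradd_cmd() {
    printf 'ssh root@%s "id xinjiang >/dev/null 2>&1 || useradd -g mppadmin xinjiang"\n' "$1"
}

# Dry run by default: only print the commands. Set RUN=1 to execute them.
for host in $HOSTS; do
    cmd=$(useradd_cmd "$host")
    if [ "${RUN:-0}" = "1" ]; then
        eval "$cmd"
    else
        echo "$cmd"
    fi
done
```

Reviewing the printed commands first and then rerunning with RUN=1 keeps the step repeatable without creating the user twice.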
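1.2 Copy and extract
Copy centos7_amd64.tar.gz to the home directory of user xinjiang on cluster node 10.1.35.209 and extract it there. The package must be installed under the user of the monitored cluster, so starting from root, switch to that user first: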
su - xinjiang
tar -xvf centos7_amd64.tar.gz
1.3 kadb_exporter
1.3.1 Edit application.yml
As user xinjiang, edit application.yml in /home/xinjiang/centos7_amd64/kadb_exporter:
[xinjiang@dwabamg02 kadb_exporter]$ pwd
/home/xinjiang/centos7_amd64/kadb_exporter
[xinjiang@dwabamg02 kadb_exporter]$ vi application.yml
spring:
  profiles:
    active: development
server:
  port: 10000 # kadb_exporter port
---
spring:
  profiles: development
  datasource:
    # H2 database URL
    url: jdbc:h2:tcp://10.1.35.209:10002//home/xinjiang/centos7_amd64/h2db/data/operator
    username: root
    password: ENC(rVBkqsNjKhfSrkZazEoIQzMUlEejr6qNfP6U8m66JS2nSupFuAJpeRReWH_w_y39eGI6pZwbKenptjFxD4KJiuTIncrK2h3mBCiaTOTQHKX32rXD6NrW1gmGSVmE0blBXOdLZZkEfTVEWOAHR8IdldVCkK8anzOEC7em68qJZ98) # encrypted root password for the H2 connection (plaintext: 123456)
    hikari:
      pool-name: default
      connection-test-query: select current_timestamp;
      minimum-idle: 2
      maximum-pool-size: 10
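1.3.2 Edit the connection pool
As user xinjiang, edit jdbc_pool_default.xml in /home/xinjiang/centos7_amd64/kadb_exporter/conf. Lines beginning with # in the listing below are annotations for this guide, not part of the file: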
[xinjiang@dwabamg02 conf]$ pwd
/home/xinjiang/centos7_amd64/kadb_exporter/conf
[xinjiang@dwabamg02 conf]$ vi jdbc_pool_default.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<comment>Default JDBC connection pool parameters for the dynamic data source pool</comment>
<entry key="clusterName"><![CDATA[KADB_CLUSTER]]></entry>
<entry key="driverClass"><![CDATA[org.postgresql.Driver]]></entry>
<entry key="jdbcUrl"><![CDATA[jdbc:postgresql://10.1.35.209:5888/postgres]]></entry>
# Database access URL, format: jdbc:postgresql://ip:port/db_name
<entry key="database"><![CDATA[postgres]]></entry>
<entry key="username"><![CDATA[xinjiang]]></entry>
<entry key="password"><![CDATA[rVBkqsNjKhfSrkZazEoIQzMUlEejr6qNfP6U8m66JS2nSupFuAJpeRReWH_w_y39eGI6pZwbKenptjFxD4KJiuTIncrK2h3mBCiaTOTQHKX32rXD6NrW1gmGSVmE0blBXOdLZZkEfTVEWOAHR8IdldVCkK8anzOEC7em68qJZ98]]></entry>
gzkadb@sx0sxrf
# The database user's password is stored encrypted (plaintext: 123456). The encrypted value is generated with:
#./pass_enc.sh --prikey conf/conf_encrypt.pri --dbpass $PASSWORD
<entry key="minimumIdle"><![CDATA[2]]></entry>
<entry key="maximumPoolSize"><![CDATA[4]]></entry>
<entry key="testQuery"><![CDATA[select current_timestamp]]></entry>
</properties>
1.3.3 Edit the startup script (optional)
To limit the memory used by the kadb_exporter process to about 2.5 GB, do the following.
As user xinjiang, edit /home/xinjiang/centos7_amd64/kadb_exporter/start.sh:
[xinjiang@dwabamg02 kadb_exporter]$ vi start.sh
#!/bin/bash
source ~/.bashrc
if [ ! -d logs/prometheus ]; then
    mkdir -p logs/prometheus
fi
# Start PrometheusExporter log.
nohup python PrometheusExporter.py > /dev/null 2>&1 &
# Start KADB monitor.
# To cap one kadb_exporter at about 2.5 GB of memory, add the -Xmx2048m -Xms2048m options.
nohup java -Xmx2048m -Xms2048m -cp .:./lib/*:./conf/* cn.com.kingbase.kmonitor.kadb.KMonitor &
After editing, run source ~/.bashrc to refresh the environment.
1.4 H2 database
1.4.1 Enter h2db and edit the startup script (optional)
To limit the memory used by the h2db process, add -Xmx2048m -Xms2048m to the java command in start.sh; otherwise no change is needed.
As user xinjiang, edit /home/xinjiang/centos7_amd64/h2db/start.sh:
[xinjiang@dwabamg02 h2db]$ pwd
/home/xinjiang/centos7_amd64/h2db
[xinjiang@dwabamg02 h2db]$ vi start.sh
#!/bin/sh
dir=$(dirname "$0")
nohup java -cp "$dir/kadb_h2.jar:$H2DRIVERS:$CLASSPATH" org.h2.tools.Server -ifNotExists -tcpAllowOthers -webAllowOthers -webPort 10001 -tcpPort 10002 "$@" &
# 10002 is the TCP connection port of the H2 database
[xinjiang@dwabamg02 h2db]$ sh start.sh
[xinjiang@dwabamg02 h2db]$ cat nohup.out
TCP server running at tcp://192.168.0.30:10002 (others can connect)
PG server running at pg://192.168.0.30:5435 (only local connections)
Web Console server running at http://192.168.0.30:10001 (others can connect)
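# The Web Console address printed above (here http://192.168.0.30:10001) is the URL of the H2 web console
1.4.2 Open the H2 web console and connect
# Account/password: root/123456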
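If the web console does not load, it can help to check from the command line whether the two H2 ports are listening. The following is a small sketch that relies on bash's /dev/tcp redirection and the coreutils timeout command; the helper names port_open and check_h2 are illustrative, and the host 10.1.35.209 in the example is taken from the H2 URL in application.yml.

```shell
#!/bin/bash
# Return 0 if a TCP connection to host:port succeeds within 2 seconds.
port_open() {
    timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Report the state of the H2 web (10001) and TCP (10002) ports on one host.
check_h2() {
    for port in 10001 10002; do
        if port_open "$1" "$port"; then
            echo "$1:$port open"
        else
            echo "$1:$port closed"
        fi
    done
}

# Example: check_h2 10.1.35.209
```

A closed port 10002 would also explain connection errors from kadb_exporter, since its datasource URL points at that port.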
1.4.3 Start kadb_exporter
[xinjiang@dwabamg02 kadb_exporter]$ pwd
/home/xinjiang/centos7_amd64/kadb_exporter
[xinjiang@dwabamg02 kadb_exporter]$ sh start.sh
1.5 node_exporter
Start node_exporter as user xinjiang on every node.
[xinjiang@dwabamg02 centos7_amd64]$ pwd
/home/xinjiang/centos7_amd64
****** Copy the node_exporter program to /home/xinjiang/ on every cluster node *********
[xinjiang@dwabamg02 centos7_amd64]$ scp -r node_exporter/ xinjiang@dwabamg01:/home/xinjiang/
[xinjiang@dwabamg02 centos7_amd64]$ scp -r node_exporter/ xinjiang@dwabasg01:/home/xinjiang/
[xinjiang@dwabamg02 centos7_amd64]$ scp -r node_exporter/ xinjiang@dwabasg02:/home/xinjiang/
[xinjiang@dwabamg02 centos7_amd64]$ scp -r node_exporter/ xinjiang@dwabasg03:/home/xinjiang/
[xinjiang@dwabamg02 centos7_amd64]$ scp -r node_exporter/ xinjiang@dwabasg04:/home/xinjiang/
[xinjiang@dwabamg02 centos7_amd64]$ scp -r node_exporter/ xinjiang@dwabasg05:/home/xinjiang/
[xinjiang@dwabamg02 centos7_amd64]$ scp -r node_exporter/ xinjiang@dwabasg06:/home/xinjiang/
[xinjiang@dwabamg02 centos7_amd64]$ scp -r node_exporter/ xinjiang@dwabasg07:/home/xinjiang/
[xinjiang@dwabamg02 centos7_amd64]$ scp -r node_exporter/ xinjiang@dwabasg08:/home/xinjiang/
************ Log in to each cluster node and start node_exporter ******************
******** Start node_exporter on dwabamg01 **************
[xinjiang@dwabamg02 node_exporter]$ ssh dwabamg01
[xinjiang@dwabamg01 centos7_amd64]$ cd /home/xinjiang/node_exporter/
[xinjiang@dwabamg01 node_exporter]$ sh start.sh
[xinjiang@dwabamg01 node_exporter]$ exit
******** Start node_exporter on dwabasg01 **************
[xinjiang@dwabamg02 node_exporter]$ ssh dwabasg01
[xinjiang@dwabasg01 centos7_amd64]$ cd /home/xinjiang/node_exporter/
[xinjiang@dwabasg01 node_exporter]$ sh start.sh
[xinjiang@dwabasg01 node_exporter]$ exit
******** Start node_exporter on dwabasg02 **************
[xinjiang@dwabamg02 node_exporter]$ ssh dwabasg02
[xinjiang@dwabasg02 centos7_amd64]$ cd /home/xinjiang/node_exporter/
[xinjiang@dwabasg02 node_exporter]$ sh start.sh
[xinjiang@dwabasg02 node_exporter]$ exit
******** Start node_exporter on dwabasg03 **************
[xinjiang@dwabamg02 node_exporter]$ ssh dwabasg03
[xinjiang@dwabasg03 centos7_amd64]$ cd /home/xinjiang/node_exporter/
[xinjiang@dwabasg03 node_exporter]$ sh start.sh
[xinjiang@dwabasg03 node_exporter]$ exit
******** Start node_exporter on dwabasg04 **************
[xinjiang@dwabamg02 node_exporter]$ ssh dwabasg04
[xinjiang@dwabasg04 centos7_amd64]$ cd /home/xinjiang/node_exporter/
[xinjiang@dwabasg04 node_exporter]$ sh start.sh
[xinjiang@dwabasg04 node_exporter]$ exit
******** Start node_exporter on dwabasg05 **************
[xinjiang@dwabamg02 node_exporter]$ ssh dwabasg05
[xinjiang@dwabasg05 centos7_amd64]$ cd /home/xinjiang/node_exporter/
[xinjiang@dwabasg05 node_exporter]$ sh start.sh
[xinjiang@dwabasg05 node_exporter]$ exit
******** Start node_exporter on dwabasg06 **************
[xinjiang@dwabamg02 node_exporter]$ ssh dwabasg06
[xinjiang@dwabasg06 centos7_amd64]$ cd /home/xinjiang/node_exporter/
[xinjiang@dwabasg06 node_exporter]$ sh start.sh
[xinjiang@dwabasg06 node_exporter]$ exit
******** Start node_exporter on dwabasg07 **************
[xinjiang@dwabamg02 node_exporter]$ ssh dwabasg07
[xinjiang@dwabasg07 centos7_amd64]$ cd /home/xinjiang/node_exporter/
[xinjiang@dwabasg07 node_exporter]$ sh start.sh
[xinjiang@dwabasg07 node_exporter]$ exit
******** Start node_exporter on dwabasg08 **************
[xinjiang@dwabamg02 node_exporter]$ ssh dwabasg08
[xinjiang@dwabasg08 centos7_amd64]$ cd /home/xinjiang/node_exporter/
[xinjiang@dwabasg08 node_exporter]$ sh start.sh
[xinjiang@dwabasg08 node_exporter]$ exit
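The copy and start steps above repeat for nine hosts and can be expressed as a loop. This is a sketch only, assuming passwordless SSH for user xinjiang between the nodes and that start.sh returns promptly when invoked over a non-interactive SSH session; the RUN variable and the deploy_cmds helper are illustrative names. By default it prints the commands instead of running them.

```shell
#!/bin/bash
SRC_DIR=/home/xinjiang/centos7_amd64/node_exporter
DEST_DIR=/home/xinjiang
# Remote nodes from the steps above (dwabamg02 is the local node).
REMOTE_HOSTS="dwabamg01 dwabasg01 dwabasg02 dwabasg03 dwabasg04 dwabasg05 dwabasg06 dwabasg07 dwabasg08"

# Print the copy command and the start command for one host, one per line.
deploy_cmds() {
    echo "scp -r $SRC_DIR xinjiang@$1:$DEST_DIR/"
    echo "ssh -n xinjiang@$1 'cd $DEST_DIR/node_exporter && sh start.sh'"
}

# Dry run by default: only print the commands. Set RUN=1 to execute them.
for host in $REMOTE_HOSTS; do
    deploy_cmds "$host" | while IFS= read -r cmd; do
        if [ "${RUN:-0}" = "1" ]; then
            # Redirect stdin so the command cannot consume the loop's input.
            eval "$cmd" < /dev/null
        else
            echo "$cmd"
        fi
    done
done
```

The local node dwabamg02 is also scraped on port 10003 in the Prometheus configuration, so node_exporter presumably has to be started there as well from its own directory.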
1.6 Prometheus
1.6.1 node_conf
As user xinjiang, create and enter the node_conf directory, then create the file node_kadb_info.json:
[xinjiang@dwabamg02 prometheus]$ pwd
/home/xinjiang/centos7_amd64/prometheus
[xinjiang@dwabamg02 prometheus]$ mkdir -p node_conf
[xinjiang@dwabamg02 prometheus]$ cd node_conf/
[xinjiang@dwabamg02 node_conf]$ vim node_kadb_info.json
[
{
"labels": {
"desc": "Cluster standby node",
"project": "新疆农信",
"system": "xinjiang",
"instance": "10.1.35.209",
"hostname": "dwabamg02",
"cluster": "kadb_cluster",
"service": "node_exporter"
},
"targets": [
"10.1.35.209:10003"
]
},
{
"labels": {
"desc": "Cluster master node",
"project": "新疆农信",
"system": "xinjiang",
"instance": "10.1.35.208",
"hostname": "dwabamg01",
"cluster": "kadb_cluster",
"service": "node_exporter"
},
"targets": [
"10.1.35.208:10003"
]
},
{
"labels": {
"desc": "Cluster compute node 1",
"project": "新疆农信",
"system": "xinjiang",
"instance": "10.1.35.211",
"hostname": "dwabasg01",
"cluster": "kadb_cluster",
"service": "node_exporter"
},
"targets": [
"10.1.35.211:10003"
]
},
{
"labels": {
"desc": "Cluster compute node 2",
"project": "新疆农信",
"system": "xinjiang",
"instance": "10.1.35.212",
"hostname": "dwabasg02",
"cluster": "kadb_cluster",
"service": "node_exporter"
},
"targets": [
"10.1.35.212:10003"
]
},
{
"labels": {
"desc": "Cluster compute node 3",
"project": "新疆农信",
"system": "xinjiang",
"instance": "10.1.35.213",
"hostname": "dwabasg03",
"cluster": "kadb_cluster",
"service": "node_exporter"
},
"targets": [
"10.1.35.213:10003"
]
},
{
"labels": {
"desc": "Cluster compute node 4",
"project": "新疆农信",
"system": "xinjiang",
"instance": "10.1.35.214",
"hostname": "dwabasg04",
"cluster": "kadb_cluster",
"service": "node_exporter"
},
"targets": [
"10.1.35.214:10003"
]
},
{
"labels": {
"desc": "Cluster compute node 5",
"project": "新疆农信",
"system": "xinjiang",
"instance": "10.1.35.215",
"hostname": "dwabasg05",
"cluster": "kadb_cluster",
"service": "node_exporter"
},
"targets": [
"10.1.35.215:10003"
]
},
{
"labels": {
"desc": "Cluster compute node 6",
"project": "新疆农信",
"system": "xinjiang",
"instance": "10.1.35.216",
"hostname": "dwabasg06",
"cluster": "kadb_cluster",
"service": "node_exporter"
},
"targets": [
"10.1.35.216:10003"
]
},
{
"labels": {
"desc": "Cluster compute node 7",
"project": "新疆农信",
"system": "xinjiang",
"instance": "10.1.35.217",
"hostname": "dwabasg07",
"cluster": "kadb_cluster",
"service": "node_exporter"
},
"targets": [
"10.1.35.217:10003"
]
},
{
"labels": {
"desc": "Cluster compute node 8",
"project": "新疆农信",
"system": "xinjiang",
"instance": "10.1.35.218",
"hostname": "dwabasg08",
"cluster": "kadb_cluster",
"service": "node_exporter"
},
"targets": [
"10.1.35.218:10003"
]
}
]
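Hand-editing ten nearly identical entries is error-prone, so the file can also be generated from a host list. The sketch below is one possible way to do it, not part of the Kmonitor package: sd_entry and gen_node_conf are illustrative helper names, the label values are copied from the listing above, and the input format (one "ip hostname description" line per host) is an assumption.

```shell
#!/bin/bash
# Print one file_sd entry. Arguments: description, IP address, hostname.
sd_entry() {
    cat <<EOF
  {
    "labels": {
      "desc": "$1",
      "project": "新疆农信",
      "system": "xinjiang",
      "instance": "$2",
      "hostname": "$3",
      "cluster": "kadb_cluster",
      "service": "node_exporter"
    },
    "targets": ["$2:10003"]
  }
EOF
}

# Read "ip hostname description" lines from stdin and print a JSON array.
gen_node_conf() {
    echo "["
    first=1
    while read -r ip name desc; do
        [ -z "$ip" ] && continue
        # Separate entries with a comma, but not before the first one.
        [ "$first" = 1 ] || echo "  ,"
        first=0
        sd_entry "$desc" "$ip" "$name"
    done
    echo "]"
}

# Example:
#   gen_node_conf < hosts.txt > node_kadb_info.json
# where hosts.txt contains lines such as:
#   10.1.35.209 dwabamg02 Cluster standby node
```

Deriving both the instance label and the target from the same IP address avoids the two values drifting apart in a single entry.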
1.6.2 Edit prometheus.yml and start Prometheus
[xinjiang@dwabamg02 prometheus]$ pwd
/home/xinjiang/centos7_amd64/prometheus
[xinjiang@dwabamg02 prometheus]$ vi prometheus.yml
global:
  scrape_interval: 10s
  evaluation_interval: 10s
scrape_configs:
  - job_name: 'consul'
    static_configs:
      - targets: ['10.1.35.209:10000']
        labels:
          cluster: 'kadb_cluster'
          service: 'kadb_exporter'
          kadburl: 'http://10.1.35.209:10000'
      - targets: ['10.1.35.209:10003']
        labels:
          cluster: 'kadb_cluster'
          service: 'node_exporter'
          kadburl: 'http://10.1.35.209:10003'
      - targets: ['10.1.35.208:10003']
        labels:
          cluster: 'kadb_cluster'
          service: 'node_exporter'
          kadburl: 'http://10.1.35.208:10003'
      - targets: ['10.1.35.211:10003']
        labels:
          cluster: 'kadb_cluster'
          service: 'node_exporter'
          kadburl: 'http://10.1.35.211:10003'
      - targets: ['10.1.35.212:10003']
        labels:
          cluster: 'kadb_cluster'
          service: 'node_exporter'
          kadburl: 'http://10.1.35.212:10003'
      - targets: ['10.1.35.213:10003']
        labels:
          cluster: 'kadb_cluster'
          service: 'node_exporter'
          kadburl: 'http://10.1.35.213:10003'
      - targets: ['10.1.35.214:10003']
        labels:
          cluster: 'kadb_cluster'
          service: 'node_exporter'
          kadburl: 'http://10.1.35.214:10003'
      - targets: ['10.1.35.215:10003']
        labels:
          cluster: 'kadb_cluster'
          service: 'node_exporter'
          kadburl: 'http://10.1.35.215:10003'
      - targets: ['10.1.35.216:10003']
        labels:
          cluster: 'kadb_cluster'
          service: 'node_exporter'
          kadburl: 'http://10.1.35.216:10003'
      - targets: ['10.1.35.217:10003']
        labels:
          cluster: 'kadb_cluster'
          service: 'node_exporter'
          kadburl: 'http://10.1.35.217:10003'
      - targets: ['10.1.35.218:10003']
        labels:
          cluster: 'kadb_cluster'
          service: 'node_exporter'
          kadburl: 'http://10.1.35.218:10003'
    file_sd_configs:
      - files:
          - /home/xinjiang/centos7_amd64/prometheus/node_conf/node_kadb_info.json
[xinjiang@dwabamg02 prometheus]$ sh start.sh # start Prometheus
1.6.3 Open the Prometheus web page and check the probe status
The deployment is healthy when all probes are shown in the "UP" state.
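The same check can be made from the command line through the Prometheus HTTP API, whose /api/v1/targets endpoint reports a health field for each target. The following is a rough sketch: count_not_up and check_targets are illustrative helper names, and the address http://10.1.35.209:10004 in the example is an assumption (this guide does not state the Prometheus URL; port 10004 appears only in the log message quoted in section 1.7.3).

```shell
#!/bin/bash
# Count targets whose health is not "up" in an /api/v1/targets response read from stdin.
count_not_up() {
    grep -o '"health":"[a-z]*"' | grep -vc '"health":"up"' || true
}

# Fetch the target list from Prometheus and report how many targets are not up.
# Argument: base URL of the Prometheus server.
check_targets() {
    body=$(curl -s --max-time 5 "$1/api/v1/targets") || {
        echo "cannot reach $1"
        return 1
    }
    echo "targets not up: $(printf '%s' "$body" | count_not_up)"
}

# Example (address and port are assumptions):
#   check_targets http://10.1.35.209:10004
```

A result of 0 corresponds to every probe being in the "UP" state; targets that have not been scraped yet report "unknown" and are counted as not up.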
1.7 Grafana
1.7.1 Plugin directory (optional)
If no new monitoring panels are added, no changes are needed.
[xinjiang@dwabamg02 kadb_monitor]$ cd plugins/
[xinjiang@dwabamg02 plugins]$ pwd
/home/xinjiang/centos7_amd64/kadb_monitor/plugins
[xinjiang@dwabamg02 plugins]$ ll
total 20
drwxrwxr-x. 3 xinjiang xinjiang 70 Jul 5 16:01 AlertManagerPanel
drwxrwxr-x. 3 xinjiang xinjiang 38 Oct 20 16:09 ClusterVersionStatusPanel
drwxrwxr-x. 3 xinjiang xinjiang 4096 Jul 5 16:01 data-table-plugin
drwxrwxr-x. 3 xinjiang xinjiang 70 Jul 5 16:01 InstanceStatusPanel
drwxrwxr-x. 3 xinjiang xinjiang 98 Oct 9 09:44 kadb_AlertNowList_plugin
drwxrwxr-x. 3 xinjiang xinjiang 98 Sep 24 11:35 kadb_ClusterInfoTable_plugin
drwxrwxr-x. 3 xinjiang xinjiang 4096 Jul 5 16:01 kadb_piechart_panel
drwxrwxr-x. 3 xinjiang xinjiang 4096 Oct 9 11:15 kadb_TopologyTable_plugin
drwxrwxr-x. 3 xinjiang xinjiang 4096 Jul 5 16:01 selected-table-plugin
drwxrwxr-x. 3 xinjiang xinjiang 70 Jul 5 16:01 SessionListPanel
drwxrwxr-x. 3 xinjiang xinjiang 4096 Jul 5 16:01 topology-plugin
# Cluster version and topology information is displayed by the plugins kadb_AlertNowList_plugin, ClusterVersionStatusPanel, and kadb_TopologyTable_plugin
1.7.2 Start the Grafana panel
[xinjiang@dwabamg02 kadb_monitor]$ pwd
/home/xinjiang/centos7_amd64/kadb_monitor
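[xinjiang@dwabamg02 kadb_monitor]$ sh start.sh
1.7.3 Open the Grafana panel and check its status
Browser address: http://10.1.35.208:3000
Google Chrome is recommended. If Grafana opens and displays correctly, no parameters need to be changed; adjust them only as the situation requires.
If the cluster topology information is not displayed correctly, click the down arrow on the panel and choose "Edit".
In the "Visualization" menu on the right side of the panel, change the IP address to 10.1.35.209.
Note: this IP address is the address and port (10000) of kadb_exporter. If it is an internal address, first map the kadb_exporter address to an external address that the browser can reach and enter that address instead, because the browser side must communicate with kadb_exporter directly.
Make the same change for the "Latest alerts" and "Cluster version" panels.
If node resource information is not displayed correctly (the monitoring page opens slowly, an error is shown, and kadb_monitor.log reports that it cannot connect to 192.168.0.30:10004), then kmonitor does not have the correct Prometheus address configured. Select the gear icon on the left side of the page and configure the actual Prometheus address.
If curl against the kadb_exporter address on port 10000 returns data, the browser and kadb_exporter can communicate normally.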
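The curl check can be wrapped in a small helper that reports the HTTP status. This is a sketch under assumptions: is_http_ok and probe are illustrative names, and the /metrics path is assumed because the prometheus.yml in this guide does not override Prometheus's default scrape path. Run it from the machine where the browser runs, since that is the connection being tested.

```shell
#!/bin/bash
# Succeed if the given HTTP status code is in the 2xx range.
is_http_ok() {
    case "$1" in
        2??) return 0 ;;
        *) return 1 ;;
    esac
}

# Request a URL and report whether it answered with a 2xx status.
probe() {
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1")
    if is_http_ok "$code"; then
        echo "$1 reachable (HTTP $code)"
    else
        echo "$1 not reachable (HTTP $code)"
        return 1
    fi
}

# Example (the /metrics path is an assumption):
#   probe http://10.1.35.209:10000/metrics
```

curl reports status 000 when no connection could be made at all, which points to a network or port-mapping problem rather than an application error.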
Monitoring panel edit page:
Node monitoring page:
Host monitoring page:
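2. Notes
2.1 H2 database
H2 must be running, and the H2 web console must be reachable.
2.2 node_exporter
Every monitored IP address must have a node_exporter directory, and node_exporter must be started there.
2.3 kadb_exporter
Each cluster needs one kadb_exporter, located on the master node, and each kadb_exporter needs at least 2.5 GB of physical memory.
2.4 KADB
The KADB database must be running normally.
2.6 Modify default parameters
Select Settings here.
Select Variables here.
Change request_url to the address of the Prometheus node.
2.7 Frequent log scanning causes heavy disk I/O
Edit the kadb_exporter configuration file /home/mppadmin/centos7_amd64/kadb_exporter/conf/schedules.xml
and delete the following parts:
Log collection schedule:
Schedules related to disk data distribution: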