最近总结一些诊断OCeanBase的一些经验,出一个【OceanBase诊断调优】专题,也欢迎大家贡献自己的诊断OceanBase的方法。
1. 前言
obdiag定位为OceanBase敏捷诊断工具。1.2版本的obdiag支持诊断信息的一键收集,光有收集信息的能力,没有分析能力怎么行,所以我们在obdiag的1.3.0版本加上了OB集群的日志分析能力。你可以一键去分析你的集群的OB日志,看看有没有一些异常情况。
2. obdiag 日志分析设计
2.1 架构设计
主体架构还是依托于obdiag的集中式采集模式,当用户发起obdiag 的分析的时候需要去各个节点上进行采集,将采集回来的数据集中进行分析处理。
2.2 obdiag执行在线日志分析的时序图
-
用户设置配置文件,配置文件的路径在obdiag安装目录的config/config.yml中,主要是设置所要分析的OceanBase集群的ssh登陆信息,因为obdiag需要通过ssh方式去集群拉取日志到obdiag的节点上进行分析
-
执行obdiag analyze log <option> 命令
-
obdiag 接收到用户的analyze命令后会去解析<option> 内的参数
-
obdiag解析完analyze参数后会启动日志拉取的环节,拉取的节点是步骤一中用户配置的,拉取的日志的时间范围、过滤条件等都是步骤三<option>设定的
-
obdiag 发送远程主机的执行指令
-
远程执行日志的grep或者cp命令来获取日志
-
符合条件的日志会统一放到临时文件中,便于后续的回传
-
下载远程主机上筛选出来的符合条件的日志
-
下载完毕后,发送临时文件清理指令
-
远程主机临时文件会被清理
-
obdiag 对远程主机拉取回来的日志文件进行分析,对于日志分析,主要规则是针对日志中的retcode进行分析,统计各retcode出现的次数、最早开始时间、最晚出现的时间以及其对应的trace_id的等信息
-
obdiag分析完日志后会在黑屏上打印出总览的日志分析信息
-
obdiag分析日志的详细信息会输出到文件中
-
用户可以通过obdiag 输出的文件地址查看详细的日志分析报告
3. obdiag日志分析实践
obdiag analyze <analyze type> [options]
analyze type
包含如下:
- log:一键分析 OceanBase 的日志。
3.1 obdiag analyze log
使用该命令可以一键在线分析 OceanBase 集群的日志,或者通过 --files
开启离线分析模式。
- 本文所指的在线分析指的是 OceanBase 集群在线运行状态,日志分布在各个 OBServer 节点上。
- 本文所指的离线分析模式是
--files
参数传递下,可以分析已经收集到机 obdiag 部署机器上的 OBServer 节点日志。 - 需要确保已经在 obdiag 配置文件
config.yml
中配置好需要收集节点的登录信息。相关的详细配置介绍,参见 obdiag 配置。
例子:
obdiag analyze log --scope observer --from 2023-10-08 10:25:00 --to 2023-10-08 11:30:00
...
FileListInfo:
+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Node | LogList |
+================+=======================================================================================================================================================================================================================+
| xx.xx.xx.xx | ['observer.log.20231008104204260', 'observer.log.20231008111305072', 'observer.log.20231008114410668', 'observer.log.wf.20231008104204260', 'observer.log.wf.20231008111305072', 'observer.log.wf.20231008114410668'] |
+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
...
Analyze OceanBase Online Log Summary:
+----------------+-----------+------------------------------------------------------------------------------+-------------+-------------------------------------------------------------------------------------------------------------------------------+---------+
| Node | Status | FileName | ErrorCode | Message | Count |
+================+===========+==============================================================================+=============+===============================================================================================================================+=========+
| xx.xx.xx.xx | Completed | analyze_pack_20231008171201/xx_xx_xx_xx/observer.log.20231008104204260 | -5006 | You have an error in your SQL syntax; check the manual that corresponds to your OceanBase version for the right syntax to use | 2 |
+----------------+-----------+------------------------------------------------------------------------------+-------------+-------------------------------------------------------------------------------------------------------------------------------+---------+
| xx.xx.xx.xx | Completed | analyze_pack_20231008171201/xx_xx_xx_xx/observer.log.20231008111305072 | -5006 | You have an error in your SQL syntax; check the manual that corresponds to your OceanBase version for the right syntax to use | 8 |
+----------------+-----------+------------------------------------------------------------------------------+-------------+-------------------------------------------------------------------------------------------------------------------------------+---------+
| xx.xx.xx.xx | Completed | analyze_pack_20231008171201/xx_xx_xx_xx/observer.log.20231008114410668 | -5006 | You have an error in your SQL syntax; check the manual that corresponds to your OceanBase version for the right syntax to use | 10 |
+----------------+-----------+------------------------------------------------------------------------------+-------------+-------------------------------------------------------------------------------------------------------------------------------+---------+
| xx.xx.xx.xx | Completed | analyze_pack_20231008171201/xx_xx_xx_xx/observer.log.20231008114410668 | -4009 | IO error | 20 |
+----------------+-----------+------------------------------------------------------------------------------+-------------+-------------------------------------------------------------------------------------------------------------------------------+---------+
For more details, please run cmd 'cat analyze_pack_20231008171201/result_details.txt'
快捷分析最近一段时间的日志:
在线分析最近一小时的日志,该指令执行的时候会从远程主机上拉取最近一小时的日志进行分析,诊断出出现过的错误
obdiag gather log --scope observer --since 1h
# 在线分析最近 30 分钟的日志,该指令执行的时候会从远程主机上拉取最近30分钟的日志进行分析,诊断出出现过的错误
obdiag analyze log --scope observer --since 30m
离线分析日志:
ls -lh test/
-rw-r--r-- 1 admin staff 256M Oct 8 17:24 observer.log.20231008104204260
-rw-r--r-- 1 admin staff 256M Oct 8 17:24 observer.log.20231008111305072
-rw-r--r-- 1 admin staff 256M Oct 8 17:24 observer.log.20231008114410668
-rw-r--r-- 1 admin staff 18K Oct 8 17:24 observer.log.wf.20231008104204260
-rw-r--r-- 1 admin staff 19K Oct 8 17:24 observer.log.wf.20231008111305072
-rw-r--r-- 1 admin staff 18K Oct 8 17:24 observer.log.wf.20231008114410668
obdiag analyze log --files test/
Analyze OceanBase Offline Log Summary:
+-----------+-----------+-----------------------------------------------------------------------+-------------+-------------------------------------------------------------------------------------------------------------------------------+---------+
| Node | Status | FileName | ErrorCode | Message | Count |
+===========+===========+=======================================================================+=============+===============================================================================================================================+=========+
| 127.0.0.1 | Completed | analyze_pack_20231008172144/127_0_0_1_/observer.log.20231008104204260 | -5006 | You have an error in your SQL syntax; check the manual that corresponds to your OceanBase version for the right syntax to use | 2 |
+-----------+-----------+-----------------------------------------------------------------------+-------------+-------------------------------------------------------------------------------------------------------------------------------+---------+
| 127.0.0.1 | Completed | analyze_pack_20231008172144/127_0_0_1_/observer.log.20231008111305072 | -5006 | You have an error in your SQL syntax; check the manual that corresponds to your OceanBase version for the right syntax to use | 8 |
+-----------+-----------+-----------------------------------------------------------------------+-------------+-------------------------------------------------------------------------------------------------------------------------------+---------+
| 127.0.0.1 | Completed | analyze_pack_20231008172144/127_0_0_1_/observer.log.20231008114410668 | -5006 | You have an error in your SQL syntax; check the manual that corresponds to your OceanBase version for the right syntax to use | 10 |
+-----------+-----------+-----------------------------------------------------------------------+-------------+-------------------------------------------------------------------------------------------------------------------------------+---------+
| 127.0.0.1 | Completed | analyze_pack_20231008172144/127_0_0_1_/observer.log.20231008114410668 | -4009 | IO error | 20 |
+-----------+-----------+-----------------------------------------------------------------------+-------------+-------------------------------------------------------------------------------------------------------------------------------+---------+
For more details, please run cmd 'cat analyze_pack_20231008172144/result_details.txt'