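Introduction
In the monitoring space, ELK, Prometheus, Grafana, and Zabbix are mature solutions, but their architectures are relatively complex. For small and medium-sized systems, these tools usually need several additional components, such as Filebeat, Metricbeat, or various Exporters, to collect data, which adds to the deployment and operations burden.
HertzBeat takes a different approach. It supports both active probing and passive reporting as data collection modes, which makes its architecture more flexible. Its core features can be summarized as three "unified" platforms: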
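- Unified metrics platform: agentless and compatible with the Prometheus protocol; it can directly monitor application services, databases, caches, operating systems, middleware, network devices, and other targets.
- Unified log platform: accepts logs reported from a variety of log sources over OTLP (OpenTelemetry Protocol).
- Unified alerting and notification platform: integrates internal and external alert sources, supports flexible threshold rules, grouping and convergence, silencing and inhibition, and dispatches notifications through email, DingTalk, WeChat, Webhook, and other channels.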
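The key advantage of this design is that it simplifies the edge of the monitoring system. In traditional setups, the most tedious work is deploying, debugging, upgrading, and maintaining Agents or Exporters on every monitored host. As the number of monitored hosts grows, Agent version compatibility, communication debugging, and bulk upgrades significantly raise operations costs.
HertzBeat's core principle is to connect to the target system directly over the appropriate protocol and pull monitoring data in PULL mode, so no additional component has to be installed on the target.
In addition, for network-isolated environments, traditional solutions usually require an independent monitoring system in each isolated network, which leads to data silos and management difficulties. HertzBeat's cloud-edge collaboration lets you deploy edge collectors inside isolated networks while the main service schedules and manages them, so cross-network monitoring is centralized.
Next, starting from the source code, we walk through how to build the HertzBeat Docker image.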
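Preparation: Setting Up the Build Environment
Before building the HertzBeat image, prepare a build environment that contains Node.js, Java, and Maven.
First, download Node.js, which is used to build the front-end assets: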
shell
wget https://cdn.npmmirror.com/binaries/node/v24.11.1/node-v24.11.1-linux-arm64.tar.xz
Next comes the Java Development Kit (JDK), since the HertzBeat back end is written in Java:
shell
wget https://mirrors.nju.edu.cn/openjdk/21.0.2/openjdk-21.0.2_linux-aarch64_bin.tar.gz
Then Maven, the build tool for Java projects:
shell
wget https://dlcdn.apache.org/maven/maven-3/3.9.11/binaries/apache-maven-3.9.11-bin.zip
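The three downloads above are only archives, while the environment variables set later in this article point at directories under /data/soft. A minimal sketch of the unpacking step follows. The `unpack` helper is hypothetical (it is not part of any of these tools), and it assumes `tar` with xz support and `unzip` are installed.

```shell
# unpack ARCHIVE DEST: extract a .tar.xz, .tar.gz/.tgz or .zip archive into DEST.
# Hypothetical helper for illustration; DEST is created if it does not exist.
unpack() {
    mkdir -p "$2" || return 1
    case "$1" in
        *.tar.xz)         tar -xJf "$1" -C "$2" ;;
        *.tar.gz | *.tgz) tar -xzf "$1" -C "$2" ;;
        *.zip)            unzip -q -o "$1" -d "$2" ;;
        *)                echo "unsupported archive: $1" >&2; return 1 ;;
    esac
}

# Usage (run from the directory that holds the three downloads):
# unpack node-v24.11.1-linux-arm64.tar.xz /data/soft
# unpack openjdk-21.0.2_linux-aarch64_bin.tar.gz /data/soft
# unpack apache-maven-3.9.11-bin.zip /data/soft
```

After unpacking, check the directory names that were actually created under /data/soft and make sure NODE_HOME, JAVA_HOME, and MAVEN_HOME match them.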
Install Python from source (compiling some front-end components requires it):
shell
wget https://www.python.org/ftp/python/3.14.0/Python-3.14.0.tgz
tar -xzvf Python-3.14.0.tgz
cd Python-3.14.0
Compiling Python requires a set of development libraries:
shell
yum install -y gcc make zlib-devel bzip2-devel \
libffi-devel openssl-devel sqlite-devel \
readline-devel tk-devel xz-devel \
ncurses-devel gdbm-devel expat-devel
Configure the Python build options, then compile and install:
shell
./configure \
--prefix=/usr/local/python3.14 \
--enable-shared \
--enable-optimizations \
--with-ensurepip=yes \
CFLAGS="-fPIC -O2" \
LDFLAGS="-Wl,-rpath=/usr/local/python3.14/lib"
make -j$(nproc)
sudo make altinstall
If the system has several Python environments, the following commands switch the default one:
shell
update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.7 1
update-alternatives --install /usr/bin/python3 python3 /usr/local/python3.14/bin/python3.14 2
update-alternatives --config python3
Set the environment variables so that the tools can be invoked from anywhere:
shell
export NODE_HOME="/data/soft/node-v24.11.1-linux-arm64"
export MAVEN_HOME="/data/soft/apache-maven-3.9.11"
export JAVA_HOME="/data/soft/jdk-21.0.2"
export NODEJS_ORG_MIRROR="https://npmmirror.com/mirrors/node/"
export PATH="$PATH:$NODE_HOME/bin:$MAVEN_HOME/bin:$JAVA_HOME/bin"
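Before starting the build it can help to confirm that the exports above actually put every tool on PATH. The sketch below is a hypothetical preflight helper, not something HertzBeat ships; the tool names passed to it are the ones this article relies on.

```shell
# missing_tools NAME...: print each NAME that cannot be found on PATH.
# Returns 0 when every tool is available, 1 when at least one is missing.
missing_tools() {
    missing_status=0
    for tool_name in "$@"; do
        if ! command -v "$tool_name" >/dev/null 2>&1; then
            echo "$tool_name"
            missing_status=1
        fi
    done
    return "$missing_status"
}

# Usage:
# missing_tools node npm mvn java || echo "fix PATH before building" >&2
```

Checking up front is cheaper than discovering a missing `mvn` or `node` halfway through the front-end or Maven build.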
Building the Application
1. Build the front end
The HertzBeat front end is based on Angular. Enter the webapp directory and run the build:
shell
cd webapp
npm config set registry https://registry.npmmirror.com
npm set disturl https://npmmirror.com/mirrors/node/
npm install -g yarn
rm -fr pnpm-lock.yaml
yarn install --registry=https://registry.npmmirror.com
npm install @angular/cli --save-dev
npx ng build --configuration production
2. Build the back end
Return to the project root and build the back-end services with Maven:
shell
cd hertzbeat
# Build the HertzBeat server
mvn clean package -Prelease -DskipTests
# Build the HertzBeat collector
mvn clean package -Pcluster -DskipTests
Maven resolves the dependencies, compiles the code, and packages the distributable artifacts.
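The Dockerfile shown later in this article adds apache-hertzbeat-1.8.0-docker-bin.tar.gz from the build context, but the article does not say where Maven writes that file. A minimal sketch, assuming only that the tarball ends up somewhere under the project tree: the hypothetical helpers below search for it and copy it next to the Dockerfile.

```shell
# find_artifact ROOT NAME: print the path of the first regular file called NAME under ROOT.
find_artifact() {
    find "$1" -type f -name "$2" 2>/dev/null | head -n 1
}

# stage_artifact ROOT NAME CONTEXT: make sure NAME is present in directory CONTEXT,
# copying it from wherever it is found under ROOT. Returns 1 if it cannot be found.
stage_artifact() {
    if [ -f "$3/$2" ]; then
        return 0    # already staged
    fi
    artifact_src=$(find_artifact "$1" "$2")
    if [ -z "$artifact_src" ]; then
        echo "not found under $1: $2" >&2
        return 1
    fi
    cp "$artifact_src" "$3/"
}

# Usage (from the directory that contains the Dockerfile):
# stage_artifact . apache-hertzbeat-1.8.0-docker-bin.tar.gz .
```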
Building the Docker Image
1. Install Docker Buildx
Docker Buildx supports building multi-architecture images. First download and install the plugin:
shell
wget https://github.com/docker/buildx/releases/download/v0.30.1/buildx-v0.30.1.linux-arm64
mkdir -p $HOME/.docker/cli-plugins
cp buildx-v0.30.1.linux-arm64 $HOME/.docker/cli-plugins/docker-buildx
chmod +x $HOME/.docker/cli-plugins/docker-buildx
2. Multi-architecture build
To support both ARM64 and AMD64:
shell
docker buildx create --name multiarch --driver docker-container --use
docker buildx inspect --bootstrap
docker buildx build --platform ${IMAGE_PLATFORM:-linux/arm64,linux/amd64} -t apache/hertzbeat:1.8.0 -f Dockerfile .
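One caveat with the docker-container driver used above: unless an output option such as `--push` (to a registry) or `--load` (to the local image store) is given, the build result stays in the build cache and no image appears in `docker images`. The sketch below is a hypothetical convenience that derives the platform string for the current machine, so that a single-platform buildx build can be loaded locally. The machine names it maps are the common `uname -m` values and are an assumption about your hosts.

```shell
# docker_platform [MACHINE]: map a `uname -m` machine name to a Docker platform string.
# With no argument, the current machine's `uname -m` value is used.
docker_platform() {
    machine_name=${1:-$(uname -m)}
    case "$machine_name" in
        x86_64 | amd64)  echo "linux/amd64" ;;
        aarch64 | arm64) echo "linux/arm64" ;;
        *) echo "unknown machine type: $machine_name" >&2; return 1 ;;
    esac
}

# Usage:
# docker buildx build --platform "$(docker_platform)" --load \
#     -t apache/hertzbeat:1.8.0 -f Dockerfile .
```

For a true multi-platform image, `--push` to a registry is the usual choice, since loading a multi-platform result into the local image store depends on how the Docker engine is configured.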
3. Single-architecture build
If you only need the current architecture:
shell
docker pull moby/buildkit:buildx-stable-1
docker pull eclipse-temurin:21-jdk
docker build --progress=plain -t apache/hertzbeat:1.8.0 -f Dockerfile .
Dockerfile Walkthrough
Below is a typical HertzBeat Dockerfile:
dockerfile
FROM eclipse-temurin:21-jdk
LABEL maintainer="Apache HertzBeat dev@hertzbeat.apache.org"
# Switch the package sources to a mirror in China to speed up installation
RUN sed -e 's/archive.ubuntu.com/mirror.nju.edu.cn/g' \
-e 's/ports.ubuntu.com/mirror.nju.edu.cn/g' \
-e 's/^deb http[s]*:\/\/security.ubuntu.com/#&/' \
-i /etc/apt/sources.list.d/ubuntu.sources
RUN apt-get update && apt-get install -y openssh-server && apt-get install -y locales
RUN mkdir /var/run/sshd
# Generate the Chinese and English locales
RUN localedef -c -f UTF-8 -i zh_CN zh_CN.UTF-8
RUN localedef -c -f UTF-8 -i en_US en_US.UTF-8
ENV TZ=Asia/Shanghai
ENV LANG=en_US.UTF-8
# Add the build artifact
ADD apache-hertzbeat-1.8.0-docker-bin.tar.gz /opt/
RUN chmod +x /opt/hertzbeat/bin/
EXPOSE 1157 1158 22
WORKDIR /opt/hertzbeat/
ENTRYPOINT ["./bin/entrypoint.sh"]
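This Dockerfile starts from the official Eclipse Temurin JDK 21 image, switches the APT sources to speed up the build, installs the SSH server and the locales, and finally adds the packaged HertzBeat binary tarball to the image. Note: the entrypoint.sh script must use LF line endings; CRLF line endings introduced on Windows cause it to fail at startup.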
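The line-ending requirement can be checked before the image is built. The following is a minimal sketch with two hypothetical helpers: one detects carriage returns in a file, the other strips them in place. The path to entrypoint.sh in the usage comment is a placeholder, since its location in the source tree is not given here.

```shell
# has_crlf FILE: succeed if FILE contains at least one carriage return character.
has_crlf() {
    cr_char=$(printf '\r')
    grep -q "$cr_char" "$1"
}

# to_lf FILE: rewrite FILE in place with every carriage return removed.
# The file is overwritten through redirection, so its permissions are kept.
to_lf() {
    lf_tmp=$(mktemp) || return 1
    tr -d '\r' < "$1" > "$lf_tmp" && cat "$lf_tmp" > "$1"
    lf_status=$?
    rm -f "$lf_tmp"
    return "$lf_status"
}

# Usage (path/to/entrypoint.sh is a placeholder):
# if has_crlf path/to/entrypoint.sh; then to_lf path/to/entrypoint.sh; fi
```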
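Configuring the OpenTelemetry Collector
HertzBeat receives log data over OTLP. Below is an example deployment of an OpenTelemetry Collector that collects error logs. Note that the Compose file mounts the nginx error log, while the Collector configuration reads application logs under /data; every path listed under include must also be mounted into the container.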
yaml
version: '3'
services:
  otel-collector:
    image: otel/opentelemetry-collector-contrib:0.140.0
    container_name: otel-collector
    restart: unless-stopped
    volumes:
      - ./otel-collector-config.yaml:/etc/otelcol-contrib/config.yaml
      - /var/log/nginx/error.log:/var/log/nginx/error.log:ro
      - ./otel-storage:/opt/otel-storage
    environment:
      - TZ=Asia/Shanghai
    ports:
      - 13133:13133 # health_check extension
yaml
# otel-collector-config.yaml
receivers:
  filelog:
    include:
      - /data/app1/logs/log_error.log
      - /data/app2/logs/sys-error.log
      - /data/app3/logs/app-*.log
    multiline:
      line_start_pattern: '^\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}\.\d{3}'
    include_file_name: true
    include_file_path: true
    exclude_older_than: 24h
    storage: file_storage
    start_at: end
    poll_interval: 30s
    operators:
      - type: regex_parser
        regex: '^(?P<time>\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}\.\d{3})\s+\[(?P<thread>[^]]+)\]\s+(?P<level>\w+)'
        parse_to: attributes
        severity:
          parse_from: attributes.level
        timestamp:
          parse_from: attributes.time
          # %L is the millisecond directive (3 digits), matching \.\d{3} in the regex
          layout: '%Y-%m-%d %H:%M:%S.%L'
          layout_type: strptime
          location: 'Asia/Shanghai'
      # Entries matching expr are dropped, so only ERROR records are kept
      - type: filter
        id: drop_error_filter
        expr: 'attributes["level"] != "ERROR"'
      - type: add
        if: 'attributes["log.file.path"] matches "^/data/app1/"'
        field: resource.service.name
        value: app1
      - type: add
        if: 'attributes["log.file.path"] matches "^/data/app2/"'
        field: resource.service.name
        value: app2
      - type: add
        if: 'attributes["log.file.path"] matches "^/data/app3/"'
        field: resource.service.name
        value: app3
processors:
  transform:
    error_mode: ignore
    log_statements:
      - context: log
        statements:
          - set(attributes["level"], ToUpperCase(attributes["level"]))
          - set(severity_text, ToUpperCase(attributes["level"]))
  batch:
    send_batch_size: 10
    timeout: 5s
    send_batch_max_size: 20
  memory_limiter:
    check_interval: 1s
    limit_mib: 400
    spike_limit_mib: 100
  resourcedetection:
    detectors: [env, system]
    timeout: 5s
    override: false
  resource:
    attributes:
      - key: deployment.environment
        value: "production"
        action: upsert
      - key: host.name
        value: "${HOSTNAME}"
        action: upsert
      - key: host
        value: "192.168.0.1"
        action: upsert
exporters:
  otlphttp:
    endpoint: http://192.168.0.1:1157/api/logs/ingest/otlp
    logs_endpoint: http://192.168.0.1:1157/api/logs/ingest/otlp
    compression: none
    encoding: json
    timeout: 10s
    headers:
      Content-Type: application/json
      Authorization: "Bearer {TOKEN}"
    retry_on_failure:
      enabled: true
      initial_interval: 1s
      max_interval: 30s
      max_elapsed_time: 60s
    sending_queue:
      enabled: true
      num_consumers: 2
      queue_size: 1000
  debug:
    verbosity: detailed
extensions:
  health_check:
    endpoint: 0.0.0.0:13133
    path: /
  file_storage:
    directory: /opt/otel-storage
    create_directory: true
service:
  pipelines:
    logs:
      receivers: [filelog]
      # memory_limiter is placed first, as the Collector documentation recommends
      processors: [memory_limiter, transform, resourcedetection, resource, batch]
      exporters: [debug, otlphttp]
  extensions: [health_check, file_storage]
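The Collector configuration file otel-collector-config.yaml defines how logs are received, processed, and exported. In this example, the filelog receiver reads several log files, regex_parser parses each record, the filter operator drops every record whose level is not ERROR, and the otlphttp exporter sends the processed logs to HertzBeat.
Create the storage directory for the Collector and give it the correct ownership: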
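The multiline setting is easiest to understand by testing its pattern against sample lines. A minimal sketch follows, with the `line_start_pattern` rewritten as a POSIX extended regular expression so that `grep -E` can evaluate it (`\d` becomes `[0-9]` and `\s` becomes `[[:space:]]`). The helper names and the sample log lines in the usage comment are hypothetical.

```shell
# POSIX ERE equivalent of the line_start_pattern used by the filelog receiver.
RECORD_START='^[0-9]{4}-[0-9]{2}-[0-9]{2}[[:space:]]+[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}'

# is_record_start LINE: succeed if LINE begins a new log record.
is_record_start() {
    printf '%s\n' "$1" | grep -Eq "$RECORD_START"
}

# count_records FILE: print how many records FILE contains. Lines that do not
# match the pattern (for example stack-trace frames) belong to the previous record.
count_records() {
    grep -Ec "$RECORD_START" "$1"
}

# Usage:
# is_record_start '2025-01-02 03:04:05.678 [main] ERROR boom'   # matches: new record
# is_record_start '    at com.example.Foo.bar(Foo.java:1)'      # no match: continuation
```

Running `count_records` on a real log file and comparing the result with the number of entries that arrive in HertzBeat is a quick way to spot a pattern that does not fit the application's timestamp format.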
shell
mkdir otel-storage
chown 10001:10001 otel-storage
Complete Deployment
Below is a complete Docker Compose deployment that combines PostgreSQL (metadata), GreptimeDB (time-series data), Redis (cache), and HertzBeat:
yaml
version: "3.7"
networks:
  hertzbeat:
    driver: bridge
services:
  postgres:
    image: postgres:15
    container_name: compose-postgresql
    hostname: postgresql
    restart: always
    healthcheck:
      test: [ "CMD", "pg_isready" ]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 30s
    ports:
      - '15432:5432'
    environment:
      POSTGRES_USER: root
      POSTGRES_PASSWORD: 123456
      TZ: Asia/Shanghai
      PGDATA: /var/lib/postgresql/data/pgdata
    volumes:
      - ./dbdata/pgdata:/var/lib/postgresql/data
      - ./conf/sql:/docker-entrypoint-initdb.d/
    networks:
      - hertzbeat
  greptime:
    image: greptime/greptimedb:0.17.2
    container_name: greptime
    hostname: greptime
    restart: unless-stopped
    environment:
      - TZ=Asia/Shanghai
    ports:
      - "54000:4000" # HTTP API and dashboard
      - "54001:4001" # gRPC
      - "54002:4002" # MySQL protocol
      - "54003:4003" # PostgreSQL protocol
    command: "standalone start --http-addr 0.0.0.0:4000 --rpc-addr 0.0.0.0:4001 --mysql-addr 0.0.0.0:4002 --postgres-addr 0.0.0.0:4003 --user-provider 'static_user_provider:file:/greptime/.env'"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:4000/health"]
      interval: 10s
      retries: 5
      timeout: 5s
      start_period: 30s
    volumes:
      - ./dbdata/greptimedb_data:/greptime/greptimedb_data
      - ./.env:/greptime/.env
    networks:
      - hertzbeat
  redis:
    image: redis:6.2.20-alpine
    container_name: redis
    hostname: redis
    ports:
      - "26379:6379"
    volumes:
      - ./redis-data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 3s
      retries: 3
      start_period: 30s
    restart: unless-stopped
    environment:
      # Note: the official redis image does not read REDIS_PASSWORD by itself;
      # a password only takes effect if redis-server is started with --requirepass.
      - REDIS_PASSWORD=123456
    networks:
      - hertzbeat
  hertzbeat:
    image: apache/hertzbeat:1.8.0
    container_name: compose-hertzbeat
    hostname: hertzbeat
    restart: always
    environment:
      TZ: Asia/Shanghai
      LANG: zh_CN.UTF-8
    depends_on:
      postgres:
        condition: service_healthy
      greptime:
        condition: service_healthy
      redis:
        condition: service_healthy
    volumes:
      - ./conf/application.yml:/opt/hertzbeat/config/application.yml
      - ./conf/sureness.yml:/opt/hertzbeat/config/sureness.yml
      - ./logs:/opt/hertzbeat/logs
      - ./ext-lib:/opt/hertzbeat/ext-lib
    ports:
      - "1157:1157"
      - "1158:1158"
    networks:
      - hertzbeat
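This file defines the dependencies between services, their health checks, port mappings, and data persistence. HertzBeat starts only after PostgreSQL, GreptimeDB, and Redis report healthy.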
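The healthcheck settings in the Compose file (interval, retries) amount to a bounded retry loop, and the same idea is useful on the host for waiting until the stack is reachable after `docker compose up -d`. The `retry` helper below is a hypothetical sketch. The usage line assumes that the HertzBeat web port 1157 answers plain HTTP requests on its root path, which this article does not confirm.

```shell
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds, at most ATTEMPTS times,
# sleeping DELAY seconds between attempts. Returns 0 on success, 1 otherwise.
retry() {
    retry_attempts=$1
    retry_delay=$2
    shift 2
    retry_n=1
    while [ "$retry_n" -le "$retry_attempts" ]; do
        if "$@"; then
            return 0
        fi
        if [ "$retry_n" -lt "$retry_attempts" ]; then
            sleep "$retry_delay"
        fi
        retry_n=$((retry_n + 1))
    done
    return 1
}

# Usage (assumption: the UI responds on the mapped port 1157):
# docker compose up -d
# retry 30 2 curl -fsS -o /dev/null http://localhost:1157/ || echo "HertzBeat not reachable" >&2
```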
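Summary
This article described HertzBeat's core design ideas from a technical angle and walked through the whole process, from building the Docker image from source to a complete deployment. With its agentless, unified-platform design, HertzBeat reduces the deployment and operations complexity of a monitoring system, which suits small and medium-sized systems and teams that care about operational efficiency. Knowing how to build it from source makes it easier to integrate HertzBeat into an existing technology stack.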