A Deep Dive into DolphinScheduler's Core Architecture: Building a Highly Available ZooKeeper Cluster

I: Environment Preparation

| IP address     | Service   |
|----------------|-----------|
| 192.168.67.137 | zookeeper |
| 192.168.67.138 | zookeeper |
| 192.168.67.139 | zookeeper |

II: Installing ZooKeeper

1: Download ZooKeeper

[root@localhost ~]# wget https://archive.apache.org/dist/zookeeper/zookeeper-3.4.14/zookeeper-3.4.14.tar.gz
--2026-02-04 22:15:20--  https://archive.apache.org/dist/zookeeper/zookeeper-3.4.14/zookeeper-3.4.14.tar.gz
Resolving archive.apache.org (archive.apache.org)... 65.108.204.189, 2a01:4f9:1a:a084::2
Connecting to archive.apache.org (archive.apache.org)|65.108.204.189|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 37676320 (36M) [application/x-gzip]
Saving to: 'zookeeper-3.4.14.tar.gz'

2: Extract the archive

[root@localhost src]# tar -zxvf zookeeper-3.4.14.tar.gz
zookeeper-3.4.14/
zookeeper-3.4.14/bin/
zookeeper-3.4.14/bin/README.txt
zookeeper-3.4.14/bin/zkCleanup.sh
zookeeper-3.4.14/bin/zkCli.cmd
zookeeper-3.4.14/bin/zkCli.sh
zookeeper-3.4.14/bin/zkEnv.cmd
zookeeper-3.4.14/bin/zkEnv.sh
zookeeper-3.4.14/bin/zkServer.cmd
zookeeper-3.4.14/bin/zkServer.sh
zookeeper-3.4.14/bin/zkTxnLogToolkit.cmd
zookeeper-3.4.14/bin/zkTxnLogToolkit.sh
zookeeper-3.4.14/conf/
zookeeper-3.4.14/conf/configuration.xsl
zookeeper-3.4.14/conf/log4j.properties
zookeeper-3.4.14/conf/zoo_sample.cfg
zookeeper-3.4.14/dist-maven/
zookeeper-3.4.14/dist-maven/zookeeper-3.4.14-javadoc.jar
zookeeper-3.4.14/dist-maven/zookeeper-3.4.14-javadoc.jar.md5
zookeeper-3.4.14/dist-maven/zookeeper-3.4.14-javadoc.jar.sha1
zookeeper-3.4.14/dist-maven/zookeeper-3.4.14-sources.jar
zookeeper-3.4.14/dist-maven/zookeeper-3.4.14-sources.jar.md5
zookeeper-3.4.14/dist-maven/zookeeper-3.4.14-sources.jar.sha1
zookeeper-3.4.14/dist-maven/zookeeper-3.4.14-tests.jar

gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
[root@localhost src]#
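
The tar output above ends with `gzip: stdin: unexpected end of file`, which usually means the archive on disk is incomplete (for example, an interrupted download), so the extracted tree should not be trusted until the file has been fetched again. A minimal sketch of a pre-extraction check follows. It compares the file size with the `Length` that wget reported and then tests the gzip stream; `check_archive` is a hypothetical helper name, not part of ZooKeeper.

```shell
#!/usr/bin/env bash
# Sketch: check that a downloaded .tar.gz is complete before extracting it.
# check_archive is a hypothetical helper, not a ZooKeeper tool.
check_archive() {
  local archive="$1" expected_bytes="$2" actual_bytes

  [ -f "$archive" ] || { echo "missing: $archive" >&2; return 1; }

  # Compare the on-disk size with the Length reported by the server.
  actual_bytes=$(wc -c < "$archive" | tr -d ' ')
  if [ "$actual_bytes" -ne "$expected_bytes" ]; then
    echo "size mismatch: got $actual_bytes, expected $expected_bytes" >&2
    return 1
  fi

  # gzip -t reads the whole compressed stream and fails on a truncated file.
  gzip -t "$archive" 2>/dev/null || { echo "corrupt gzip stream: $archive" >&2; return 1; }
  echo "ok: $archive"
}

# Example, using the size from the wget output above:
# check_archive zookeeper-3.4.14.tar.gz 37676320 && tar -zxf zookeeper-3.4.14.tar.gz
```

If the check fails, delete the partial file and download it again before running tar.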

3: Edit the configuration file

[root@localhost conf]# cp zoo_sample.cfg zoo.cfg
[root@localhost conf]#

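The default configuration shipped with the download contains the following settings:

tickTime=2000:

Informally called the tick time, this is the basic time unit, in milliseconds, that ZooKeeper uses for heartbeats and timeouts. The default is 2000 ms, that is, one tick every two seconds.

tickTime is the unit of measure for heartbeats between clients and servers and between servers; the other time-based settings are expressed as multiples of it.

What heartbeats are for:

They monitor whether a machine is still working, and tickTime bounds how long the leader and followers may take to communicate. It also bounds client sessions: by default the minimum session timeout is twice the tick time (2 * tickTime) and the maximum is 20 * tickTime.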

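initLimit=10:

During startup, a follower synchronizes all of the latest data from the leader and then determines the state from which it can begin serving requests. The leader allows the follower initLimit ticks to finish this work. The default is 10, that is, 10 * tickTime (20 seconds with tickTime=2000).

This setting normally does not need to change. As the amount of data managed by the ZooKeeper cluster grows, however, a starting follower takes longer to synchronize from the leader and may not finish within the window. In that case, increase this value.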

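syncLimit=5:

The maximum delay, in ticks, allowed for the heartbeat exchange between the leader and a follower. In a ZooKeeper cluster the leader exchanges heartbeats with every follower to confirm that the node is still alive; a follower that falls further behind than this window is dropped. The default is 5, that is, 5 * tickTime (10 seconds with tickTime=2000).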

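dataDir=/tmp/zookeeper:

The default directory where the ZooKeeper server stores its snapshot files. Files under /tmp may be deleted automatically and are easily lost, so change this to a persistent directory.

clientPort=2181:

The port that clients use to connect to the ZooKeeper server. ZooKeeper listens on this port and accepts client requests.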
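
Because initLimit and syncLimit are counted in ticks, it can help to see them in milliseconds. The sketch below reads a zoo.cfg and prints the derived values; `cfg_value` and `show_timeouts` are hypothetical helper names, and the session-timeout lines assume that minSessionTimeout and maxSessionTimeout are not set explicitly in the file.

```shell
#!/usr/bin/env bash
# Sketch: convert the tick-based settings in a zoo.cfg into milliseconds.
# cfg_value and show_timeouts are hypothetical helpers, not ZooKeeper tools.
cfg_value() {
  # Print the value of key $2 from config file $1 (last occurrence wins).
  awk -F= -v key="$2" '$1 == key { val = $2 } END { if (val != "") print val }' "$1"
}

show_timeouts() {
  local cfg="$1" tick init sync
  tick=$(cfg_value "$cfg" tickTime)
  init=$(cfg_value "$cfg" initLimit)
  sync=$(cfg_value "$cfg" syncLimit)
  echo "initLimit_ms=$((init * tick))"
  echo "syncLimit_ms=$((sync * tick))"
  # Defaults that apply when the two session settings are absent from the file.
  echo "minSessionTimeout_ms=$((2 * tick))"
  echo "maxSessionTimeout_ms=$((20 * tick))"
}

# Example:
# show_timeouts /usr/local/src/zookeeper-3.4.14/conf/zoo.cfg
```

With tickTime=2000 the session values come out as 4000 and 40000, which matches the `minSessionTimeout 4000 maxSessionTimeout 40000` line in the cluster log later in this article.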

The configuration file after modification:

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/local/src/zookeeper-3.4.14/data
dataLogDir=/usr/local/src/zookeeper-3.4.14/log
clientPort=2181

4: Start ZooKeeper

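zkCleanup.sh: cleans up ZooKeeper's historical data, including transaction log files and snapshot files

zkCli.sh: the command-line client for connecting to a ZooKeeper server

zkEnv.sh: sets environment variables

zkServer.sh: starts the ZooKeeper server (the same script also stops it, restarts it, and reports its status)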

[root@localhost bin]# ./zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/src/zookeeper-3.4.14/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@localhost bin]#

5: Check ZooKeeper status

[root@localhost bin]# cat zookeeper.out
2026-02-05 00:13:17,209 [myid:] - INFO  [main:QuorumPeerConfig@136] - Reading configuration from: /usr/local/src/zookeeper-3.4.14/bin/../conf/zoo.cfg
2026-02-05 00:13:17,213 [myid:] - INFO  [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3
2026-02-05 00:13:17,213 [myid:] - INFO  [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 0
2026-02-05 00:13:17,213 [myid:] - INFO  [main:DatadirCleanupManager@101] - Purge task is not scheduled.
2026-02-05 00:13:17,214 [myid:] - WARN  [main:QuorumPeerMain@116] - Either no config or no quorum defined in config, running  in standalone mode
2026-02-05 00:13:17,220 [myid:] - INFO  [main:QuorumPeerConfig@136] - Reading configuration from: /usr/local/src/zookeeper-3.4.14/bin/../conf/zoo.cfg
2026-02-05 00:13:17,221 [myid:] - INFO  [main:ZooKeeperServerMain@98] - Starting server
2026-02-05 00:13:17,226 [myid:] - INFO  [main:Environment@100] - Server environment:zookeeper.version=3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf, built on 03/06/2019 16:18 GMT
2026-02-05 00:13:17,226 [myid:] - INFO  [main:Environment@100] - Server environment:host.name=localhost
2026-02-05 00:13:17,226 [myid:] - INFO  [main:Environment@100] - Server environment:java.version=1.8.0_262
2026-02-05 00:13:17,226 [myid:] - INFO  [main:Environment@100] - Server environment:java.vendor=Oracle Corporation
2026-02-05 00:13:17,226 [myid:] - INFO  [main:Environment@100] - Server environment:java.home=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.262.b10-1.el7.x86_64/jre
2026-02-05 00:13:17,226 [myid:] - INFO  [main:Environment@100] - Server environment:java.class.path=/usr/local/src/zookeeper-3.4.14/bin/../zookeeper-server/target/classes:/usr/local/src/zookeeper-3.4.14/bin/../build/classes:/usr/local/src/zookeeper-3.4.14/bin/../zookeeper-server/target/lib/*.jar:/usr/local/src/zookeeper-3.4.14/bin/../build/lib/*.jar:/usr/local/src/zookeeper-3.4.14/bin/../lib/slf4j-log4j12-1.7.25.jar:/usr/local/src/zookeeper-3.4.14/bin/../lib/slf4j-api-1.7.25.jar:/usr/local/src/zookeeper-3.4.14/bin/../lib/netty-3.10.6.Final.jar:/usr/local/src/zookeeper-3.4.14/bin/../lib/log4j-1.2.17.jar:/usr/local/src/zookeeper-3.4.14/bin/../lib/jline-0.9.94.jar:/usr/local/src/zookeeper-3.4.14/bin/../lib/audience-annotations-0.5.0.jar:/usr/local/src/zookeeper-3.4.14/bin/../zookeeper-3.4.14.jar:/usr/local/src/zookeeper-3.4.14/bin/../zookeeper-server/src/main/resources/lib/*.jar:/usr/local/src/zookeeper-3.4.14/bin/../conf:
2026-02-05 00:13:17,226 [myid:] - INFO  [main:Environment@100] - Server environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2026-02-05 00:13:17,227 [myid:] - INFO  [main:Environment@100] - Server environment:java.io.tmpdir=/tmp
2026-02-05 00:13:17,227 [myid:] - INFO  [main:Environment@100] - Server environment:java.compiler=<NA>
2026-02-05 00:13:17,227 [myid:] - INFO  [main:Environment@100] - Server environment:os.name=Linux
2026-02-05 00:13:17,227 [myid:] - INFO  [main:Environment@100] - Server environment:os.arch=amd64
2026-02-05 00:13:17,227 [myid:] - INFO  [main:Environment@100] - Server environment:os.version=3.10.0-1160.el7.x86_64
2026-02-05 00:13:17,228 [myid:] - INFO  [main:Environment@100] - Server environment:user.name=root
2026-02-05 00:13:17,228 [myid:] - INFO  [main:Environment@100] - Server environment:user.home=/root
2026-02-05 00:13:17,228 [myid:] - INFO  [main:Environment@100] - Server environment:user.dir=/usr/local/src/zookeeper-3.4.14/bin
2026-02-05 00:13:17,235 [myid:] - INFO  [main:ZooKeeperServer@836] - tickTime set to 2000
2026-02-05 00:13:17,235 [myid:] - INFO  [main:ZooKeeperServer@845] - minSessionTimeout set to -1
2026-02-05 00:13:17,235 [myid:] - INFO  [main:ZooKeeperServer@854] - maxSessionTimeout set to -1
2026-02-05 00:13:17,246 [myid:] - INFO  [main:ServerCnxnFactory@117] - Using org.apache.zookeeper.server.NIOServerCnxnFactory as server connection factory

III: Building the ZooKeeper Cluster

1: Add the cluster node configuration on one machine

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/local/src/zookeeper-3.4.14/data
dataLogDir=/usr/local/src/zookeeper-3.4.14/log
clientPort=2181

# Cluster configuration: server.<myid>=<host>:<quorum port>:<election port>
server.1=192.168.67.137:2881:3881
server.2=192.168.67.138:2881:3881
server.3=192.168.67.139:2881:3881
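
Each `server.N` line names one cluster member: N is that node's id, the first port carries follower-to-leader traffic, and the second port is used for leader election. As a small sketch, the lines can be generated from a host list rather than typed by hand; `server_lines` is a hypothetical helper name, and the ports are simply the ones used in this article.

```shell
#!/usr/bin/env bash
# Sketch: print server.N lines for a list of hosts, numbering from 1.
# server_lines is a hypothetical helper, not a ZooKeeper tool.
server_lines() {
  local quorum_port="$1" election_port="$2" id=1 host
  shift 2
  for host in "$@"; do
    echo "server.${id}=${host}:${quorum_port}:${election_port}"
    id=$((id + 1))
  done
}

# Example: append the three entries used in this article to zoo.cfg.
# server_lines 2881 3881 192.168.67.137 192.168.67.138 192.168.67.139 >> zoo.cfg
```

The order of the hosts decides the ids, so the same order has to be used when the myid files are written.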

2: Configure the other nodes

Use the same configuration as above on each of the other nodes.
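
One way to keep the three files identical is to copy zoo.cfg from the first node. The sketch below only prints the scp commands so they can be reviewed before running. It assumes SSH access between the nodes and the same installation path everywhere, neither of which is stated in the original setup; `copy_cfg_commands` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: print (not run) the scp commands that would copy zoo.cfg to other nodes.
# Assumes SSH access and an identical install path on every node.
copy_cfg_commands() {
  local cfg="$1" user="$2" host
  shift 2
  for host in "$@"; do
    echo "scp ${cfg} ${user}@${host}:${cfg}"
  done
}

# Example: review the commands, then pipe them to sh to execute.
# copy_cfg_commands /usr/local/src/zookeeper-3.4.14/conf/zoo.cfg root 192.168.67.138 192.168.67.139
```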

3: Add the cluster myid

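# Run only one of these on each node; the number must match that node's server.N entry in zoo.cfg.
echo 1 > /usr/local/src/zookeeper-3.4.14/data/myid   # on 192.168.67.137 (server.1)
echo 2 > /usr/local/src/zookeeper-3.4.14/data/myid   # on 192.168.67.138 (server.2)
echo 3 > /usr/local/src/zookeeper-3.4.14/data/myid   # on 192.168.67.139 (server.3)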
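
The number in each myid file must match that node's `server.N` entry, and a mismatch is an easy mistake to make when typing the commands by hand. As a hedged sketch, the id can be looked up from zoo.cfg by the node's IP address; `myid_for_ip` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: find which server.N entry in zoo.cfg belongs to a given IP address.
# myid_for_ip is a hypothetical helper, not a ZooKeeper tool.
myid_for_ip() {
  local cfg="$1" ip="$2"
  # Entries look like: server.1=192.168.67.137:2881:3881
  awk -F'[=:]' -v ip="$ip" \
    '$1 ~ /^server\.[0-9]+$/ && $2 == ip { sub(/^server\./, "", $1); print $1 }' "$cfg"
}

# Example, run on 192.168.67.138:
# id=$(myid_for_ip zoo.cfg 192.168.67.138)
# mkdir -p /usr/local/src/zookeeper-3.4.14/data
# [ -n "$id" ] && echo "$id" > /usr/local/src/zookeeper-3.4.14/data/myid
```

The function prints nothing when the address is not listed, so the example writes the file only when an id was found.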

4: Check the startup status of each node

192.168.67.137

[root@localhost bin]# tail -fn 5 zookeeper.out
2026-02-05 15:43:29,253 [myid:1] - INFO  [LearnerHandler-/192.168.67.139:53034:LearnerHandler@401] - Synchronizing with Follower sid: 3 maxCommittedLog=0x2 minCommittedLog=0x1 peerLastZxid=0x0
2026-02-05 15:43:29,253 [myid:1] - WARN  [LearnerHandler-/192.168.67.139:53034:LearnerHandler@468] - Unhandled proposal scenario
2026-02-05 15:43:29,253 [myid:1] - INFO  [LearnerHandler-/192.168.67.139:53034:LearnerHandler@475] - Sending SNAP
2026-02-05 15:43:29,253 [myid:1] - INFO  [LearnerHandler-/192.168.67.139:53034:LearnerHandler@499] - Sending snapshot last zxid of peer is 0x0  zxid of leader is 0x100000000sent zxid of db as 0x100000000
2026-02-05 15:43:29,261 [myid:1] - INFO  [LearnerHandler-/192.168.67.139:53034:LearnerHandler@535] - Received NEWLEADER-ACK message from 3

[root@localhost bin]# ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/src/zookeeper-3.4.14/bin/../conf/zoo.cfg
Mode: leader
[root@localhost bin]#

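How to tell from the log that this node is the leader

ZooKeeper's runtime behavior, together with its characteristic class names and log keywords, identifies myid=1 as the leader. The log above contains three main pieces of evidence: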

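Evidence 1: the LearnerHandler component (leader only)

Log keyword: [LearnerHandler-/192.168.67.139:53034:LearnerHandler@401]

  • In a ZooKeeper cluster, LearnerHandler runs only on the leader. The leader uses it to accept connections from Learners (Followers and Observers), handle their traffic, and synchronize data to them;
  • Follower and Observer nodes do not produce LearnerHandler log lines. In the 3.4.14 logs in this article, their side of the exchange is logged by the Learner and Follower classes instead, which makes LearnerHandler a key marker of the leader role.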

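Evidence 2: the node pushes data to a follower (leader-only behavior)

Log keywords: Synchronizing with Follower sid: 3 and Sending SNAP

  • In a ZooKeeper cluster, only the leader sends data snapshots (SNAP) and transaction-log updates to followers;
  • A follower only receives synchronization data from the leader; it does not initiate synchronization toward other nodes;
  • The log shows this node synchronizing data and sending a snapshot to sid=3 (node 3), which is typical leader behavior.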

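Evidence 3: the NEWLEADER ACK from a follower (the leader is the receiver)

Log keyword: Received NEWLEADER-ACK message from 3

  • After winning the election, the leader sends a NEWLEADER message to each follower as part of synchronizing it;
  • The follower replies to the leader with an ACK for NEWLEADER, confirming that it accepts the new leader;
  • Only the leader receives this ACK. A follower sends the ACK to the leader and never receives one from other nodes, so this line marks the leader once the election is over.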

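Note: ruling out a harmless warning

The Unhandled proposal scenario warning is expected on a brand-new cluster. In this log the follower reports peerLastZxid=0x0, which lies outside the leader's committed-log range (minCommittedLog=0x1 to maxCommittedLog=0x2), and the leader then sends a full snapshot, as the Sending SNAP line shows. The warning does not affect the role determination and can be ignored here.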

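One-sentence summary

This node shows a leader-only component (LearnerHandler), leader-only push synchronization, and receipt of a follower's NEWLEADER ACK. Together these three features are strong evidence that it is the leader, which the Mode: leader output of zkServer.sh status confirms.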

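Extension: log features that identify a follower

On a follower, the log contains lines such as the following, all taken from the output of the other two nodes below, which clearly distinguish it from the leader:

  • Follower@65 - FOLLOWING - LEADER ELECTION TOOK: the node finished the election in the FOLLOWING state;
  • Learner@336 - Getting a snapshot from leader: the node is receiving a snapshot from the leader;
  • FOLLOWING (my state) in FastLeaderElection notifications: the node reports its own state as FOLLOWING.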
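
The keyword checks described in this section can be sketched as a small filter. It is a heuristic built only on strings that appear in this article's 3.4.14 logs, not an official ZooKeeper interface, and `guess_role` is a hypothetical helper name; `zkServer.sh status` remains the authoritative check.

```shell
#!/usr/bin/env bash
# Sketch: guess a node's role from log text on stdin.
# Heuristic only; the keywords are the ones seen in this article's 3.4.14 logs.
guess_role() {
  local log
  log=$(cat)
  if grep -q 'LearnerHandler@' <<< "$log"; then
    # LearnerHandler threads log only on the leader.
    echo leader
  elif grep -Eq 'FOLLOWING - LEADER ELECTION TOOK|Getting a snapshot from leader' <<< "$log"; then
    echo follower
  else
    echo unknown
  fi
}

# Example:
# tail -n 200 zookeeper.out | guess_role
```

A short tail may contain none of the keywords, in which case the function prints unknown rather than guessing.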

192.168.67.138

[root@localhost bin]# tail -fn 5 zookeeper.out
2026-02-05 15:43:22,321 [myid:2] - INFO  [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Learner@336] - Getting a snapshot from leader 0x2
2026-02-05 15:43:22,324 [myid:2] - INFO  [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:FileTxnSnapLog@301] - Snapshotting: 0x2 to /usr/local/src/zookeeper-3.4.14/data/version-2/snapshot.2
2026-02-05 15:43:29,241 [myid:2] - INFO  [/192.168.67.138:3881:QuorumCnxManager$Listener@743] - Received connection request /192.168.67.139:33334
2026-02-05 15:43:29,243 [myid:2] - INFO  [WorkerReceiver[myid=2]:FastLeaderElection@595] - Notification: 1 (message format version), 3 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 3 (n.sid), 0x0 (n.peerEpoch) FOLLOWING (my state)
2026-02-05 15:43:29,243 [myid:2] - INFO  [WorkerReceiver[myid=2]:FastLeaderElection@595] - Notification: 1 (message format version), 1 (n.leader), 0x2 (n.zxid), 0x1 (n.round), LOOKING (n.state), 3 (n.sid), 0x0 (n.peerEpoch) FOLLOWING (my state)


[root@localhost bin]# ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/src/zookeeper-3.4.14/bin/../conf/zoo.cfg
Mode: follower
[root@localhost bin]#

192.168.67.139

[root@localhost bin]# tail -fn 5 zookeeper.out
2026-02-05 15:43:29,248 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@174] - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 datadir /usr/local/src/zookeeper-3.4.14/log/version-2 snapdir /usr/local/src/zookeeper-3.4.14/data/version-2
2026-02-05 15:43:29,249 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Follower@65] - FOLLOWING - LEADER ELECTION TOOK - 18
2026-02-05 15:43:29,249 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:QuorumPeer$QuorumServer@185] - Resolved hostname: 192.168.67.137 to address: /192.168.67.137
2026-02-05 15:43:29,254 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Learner@336] - Getting a snapshot from leader 0x100000000
2026-02-05 15:43:29,259 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FileTxnSnapLog@301] - Snapshotting: 0x100000000 to /usr/local/src/zookeeper-3.4.14/data/version-2/snapshot.100000000


[root@localhost bin]# ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/src/zookeeper-3.4.14/bin/../conf/zoo.cfg
Mode: follower
[root@localhost bin]#
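
The three status checks above can be summarized in one pass. The sketch below extracts the `Mode:` value from `zkServer.sh status` output and counts the roles; `parse_mode` and `summarize_modes` are hypothetical helper names, and the commented ssh loop assumes root SSH access to each node, which the original setup does not mention.

```shell
#!/usr/bin/env bash
# Sketch: summarize cluster roles from `zkServer.sh status` output.
# parse_mode and summarize_modes are hypothetical helpers.
parse_mode() {
  # Keep only the value of the "Mode: ..." line from stdin.
  sed -n 's/^Mode: *//p'
}

summarize_modes() {
  # stdin: one mode per line; print how many leaders and followers were seen.
  awk '{ n[$1]++ } END { printf "leader=%d follower=%d\n", n["leader"], n["follower"] }'
}

# Example (assumes SSH access as root to every node):
# for h in 192.168.67.137 192.168.67.138 192.168.67.139; do
#   ssh "root@$h" /usr/local/src/zookeeper-3.4.14/bin/zkServer.sh status 2>/dev/null | parse_mode
# done | summarize_modes
```

For the cluster in this article the expected summary is one leader and two followers; any other result suggests that a node is down or has not joined the quorum.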