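etcd 3.15 Three-Node Cluster Administration Guide

This guide covers the administration of a three-node etcd 3.15 cluster: adding and removing nodes, checking cluster status, and backing up and restoring the database.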

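1. Environment Preparation

1.1 System Requirements

Operating system: Linux (Ubuntu 18.04 or CentOS 7 recommended)
Memory: at least 2 GB
Disk: at least 10 GB of free space
Network: all nodes must be able to reach one another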

1.2 Software Requirements

etcd 3.15
curl or the etcdctl tool

1.3 Node Information

Assume three servers with the following IP addresses:

Node1: 192.168.1.101
Node2: 192.168.1.102
Node3: 192.168.1.103

2. Installing etcd

Install etcd 3.15 on every node.

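# Download etcd (adjust the version in the URL to the release you are installing)
wget https://github.com/etcd-io/etcd/releases/download/v3.15.0/etcd-v3.15.0-linux-amd64.tar.gz
# Extract the archive
tar -xvf etcd-v3.15.0-linux-amd64.tar.gz
# Move the binaries to /usr/local/bin
sudo mv etcd-v3.15.0-linux-amd64/etcd* /usr/local/bin/
# Verify the installation
etcd --version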

3. Configuring the Three-Node Cluster

3.1 Starting the etcd Cluster

Start etcd on each node with the corresponding command below:

Node1 (192.168.1.101)

etcd --name node1 \
--data-dir /var/lib/etcd \
--initial-advertise-peer-urls http://192.168.1.101:2380 \
--listen-peer-urls http://192.168.1.101:2380 \
--listen-client-urls http://192.168.1.101:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://192.168.1.101:2379 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster node1=http://192.168.1.101:2380,node2=http://192.168.1.102:2380,node3=http://192.168.1.103:2380 \
--initial-cluster-state new

Node2 (192.168.1.102)

etcd --name node2 \
--data-dir /var/lib/etcd \
--initial-advertise-peer-urls http://192.168.1.102:2380 \
--listen-peer-urls http://192.168.1.102:2380 \
--listen-client-urls http://192.168.1.102:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://192.168.1.102:2379 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster node1=http://192.168.1.101:2380,node2=http://192.168.1.102:2380,node3=http://192.168.1.103:2380 \
--initial-cluster-state new

Node3 (192.168.1.103)

etcd --name node3 \
--data-dir /var/lib/etcd \
--initial-advertise-peer-urls http://192.168.1.103:2380 \
--listen-peer-urls http://192.168.1.103:2380 \
--listen-client-urls http://192.168.1.103:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://192.168.1.103:2379 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster node1=http://192.168.1.101:2380,node2=http://192.168.1.102:2380,node3=http://192.168.1.103:2380 \
--initial-cluster-state new
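
The three commands above differ only in the node name and the IP address. As a minimal sketch, a small helper can print the command for any node from those two values. The function name etcd_start_cmd and the CLUSTER variable are illustrative and not part of etcd; the ports, token, and data directory are copied from the commands above.

```shell
#!/bin/sh
# Hypothetical helper (not part of etcd): print the start command for one node.
# CLUSTER mirrors the --initial-cluster value used in the commands above.
CLUSTER="node1=http://192.168.1.101:2380,node2=http://192.168.1.102:2380,node3=http://192.168.1.103:2380"

# Usage: etcd_start_cmd <name> <ip>
etcd_start_cmd() {
  # In an unquoted here-document, "\\" yields one literal backslash.
  cat <<EOF
etcd --name ${1} \\
--data-dir /var/lib/etcd \\
--initial-advertise-peer-urls http://${2}:2380 \\
--listen-peer-urls http://${2}:2380 \\
--listen-client-urls http://${2}:2379,http://127.0.0.1:2379 \\
--advertise-client-urls http://${2}:2379 \\
--initial-cluster-token etcd-cluster-1 \\
--initial-cluster ${CLUSTER} \\
--initial-cluster-state new
EOF
}

# Example: print the command for Node2.
etcd_start_cmd node2 192.168.1.102
```

The helper only prints text, so its output can be reviewed before it is run on the target node.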

3.2 Verifying the Cluster Status

Check the cluster status with etcdctl:

etcdctl --endpoints=http://192.168.1.101:2379,http://192.168.1.102:2379,http://192.168.1.103:2379 endpoint status

The output should contain one status entry for each of the three endpoints.

3.3 Running etcd as a systemd Service

Create an etcd.service unit so that systemd manages etcd:

vim /usr/lib/systemd/system/etcd.service

[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
# File that defines the environment variables
EnvironmentFile=-/etc/etcd/config
ExecStart=/usr/local/bin/etcd \
--name=${ETCD_NAME} \
--data-dir=${ETCD_DATA_DIR} \
--listen-peer-urls=${ETCD_LISTEN_PEER_URLS} \
--listen-client-urls=${ETCD_LISTEN_CLIENT_URLS},http://127.0.0.1:2379 \
--advertise-client-urls=${ETCD_ADVERTISE_CLIENT_URLS} \
--initial-advertise-peer-urls=${ETCD_INITIAL_ADVERTISE_PEER_URLS} \
--initial-cluster=${ETCD_INITIAL_CLUSTER} \
--initial-cluster-token=${ETCD_INITIAL_CLUSTER_TOKEN} \
--initial-cluster-state=${ETCD_INITIAL_CLUSTER_STATE}
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

Create the etcd configuration file:

mkdir -p /etc/etcd

vim /etc/etcd/config

#[Member]
# Node name; change this on every node
ETCD_NAME="etcd01"
# Data directory
ETCD_DATA_DIR="/hskj/etcd/"
# IP address of the current node; change this on every node
ETCD_LISTEN_PEER_URLS="http://192.168.1.101:2380"
ETCD_LISTEN_CLIENT_URLS="http://192.168.1.101:2379"

#[Clustering]
# Change these on every node
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.1.101:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.1.101:2379"
# Peer URLs of all nodes in the cluster
ETCD_INITIAL_CLUSTER="etcd01=http://192.168.1.101:2380,etcd02=http://192.168.1.102:2380,etcd03=http://192.168.1.103:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
# Set this to "existing" on a node that joins an already running cluster
ETCD_INITIAL_CLUSTER_STATE="new"
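
Because the name and peer URL have to be edited by hand on every node, a mismatch between them and ETCD_INITIAL_CLUSTER is an easy mistake to make. The following is a minimal sketch of a pre-start check, assuming the configuration file keeps the KEY="value" form shown above. The function name check_etcd_config is hypothetical and not part of etcd.

```shell
#!/bin/sh
# Hypothetical check (not part of etcd): succeed only if the node described by
# an /etc/etcd/config-style file is listed in its own ETCD_INITIAL_CLUSTER.
# Usage: check_etcd_config <file>
check_etcd_config() (
  # The file consists of KEY="value" lines, so it can be sourced here.
  # The function body is a subshell, so no variables leak to the caller.
  . "$1" || exit 2
  entry="${ETCD_NAME}=${ETCD_INITIAL_ADVERTISE_PEER_URLS}"
  case ",${ETCD_INITIAL_CLUSTER}," in
    *",${entry},"*)
      echo "OK: ${entry} is listed in ETCD_INITIAL_CLUSTER" ;;
    *)
      echo "MISMATCH: ${entry} not found in ETCD_INITIAL_CLUSTER" >&2
      exit 1 ;;
  esac
)
```

It could be run as check_etcd_config /etc/etcd/config on each node before the service is started.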

Start etcd:

systemctl daemon-reload
systemctl enable etcd
systemctl start etcd

4. Node Management

4.1 Adding a Node

Assume a new node, Node4 (192.168.1.104), is to be added.

4.1.1 Registering the New Member

The running cluster has to know about the new member before its etcd process starts. On any node of the existing cluster, register Node4 with etcdctl:

etcdctl member add node4 --peer-urls=http://192.168.1.104:2380

4.1.2 Starting the New Node

Once the member is registered, start etcd on Node4 with an empty data directory and --initial-cluster-state existing:

etcd --name node4 \
--data-dir /var/lib/etcd \
--initial-advertise-peer-urls http://192.168.1.104:2380 \
--listen-peer-urls http://192.168.1.104:2380 \
--listen-client-urls http://192.168.1.104:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://192.168.1.104:2379 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster node1=http://192.168.1.101:2380,node2=http://192.168.1.102:2380,node3=http://192.168.1.103:2380,node4=http://192.168.1.104:2380 \
--initial-cluster-state existing
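
When a member is registered, etcdctl member add prints a set of ETCD_* variables that describe how the new member should be started. A minimal sketch, assuming that output format, keeps only those lines so they can be saved as an environment file for Node4. The function name extract_member_env and the file name node4.env are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: keep only the ETCD_* assignments from text on stdin.
# Assumes `etcdctl member add` prints lines such as ETCD_NAME="node4".
extract_member_env() {
  grep -E '^ETCD_[A-Z_]+='
}

# Intended use on an existing member (not run here):
#   etcdctl member add node4 --peer-urls=http://192.168.1.104:2380 \
#     | extract_member_env > node4.env
```

The saved values can then be compared with the flags passed to etcd on Node4, in particular the cluster state and the cluster member list.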

4.2 Removing a Node

Assume Node3 (192.168.1.103) is to be removed.

4.2.1 Getting the Member ID

First, look up the member ID of Node3:

etcdctl member list

4.2.2 Removing the Member

Remove the member with etcdctl:

etcdctl member remove <node3-member-id>
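
The two steps above can be combined so that the ID does not have to be copied by hand. This is a minimal sketch that assumes the default comma-separated output of etcdctl member list, with the ID in the first field and the name in the third. The function name member_id_by_name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the ID of the member whose name matches $1.
# Assumes the default `etcdctl member list` output, one member per line:
#   <id>, <status>, <name>, <peer URLs>, <client URLs>[, <is learner>]
member_id_by_name() {
  awk -F', *' -v name="$1" '$3 == name { print $1 }'
}

# Intended use (not run here):
#   id=$(etcdctl member list | member_id_by_name node3)
#   etcdctl member remove "$id"
```

If the name is not found, the function prints nothing, so the caller should check that the ID is non-empty before removing anything.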

5. Status Checks

5.1 Checking Cluster Health

etcdctl --endpoints=http://192.168.1.101:2379,http://192.168.1.102:2379,http://192.168.1.103:2379 endpoint health
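
For scripted monitoring, the health output can be reduced to a single pass or fail result. The sketch below assumes each endpoint is reported on its own line in the form "<endpoint> is healthy: ..." or "<endpoint> is unhealthy: ...". The function name require_healthy and the threshold of 2 (a majority of three nodes) are illustrative choices.

```shell
#!/bin/sh
# Hypothetical helper: read `endpoint health` output on stdin and fail unless
# at least $1 endpoints are reported healthy.
# Assumes lines of the form "<endpoint> is healthy: ...".
require_healthy() {
  # grep -c exits non-zero when the count is 0, hence the "|| true".
  healthy=$(grep -c ' is healthy' || true)
  echo "healthy endpoints: ${healthy}"
  [ "$healthy" -ge "$1" ]
}

# Intended use (not run here); 2>&1 covers versions that report on stderr:
#   etcdctl --endpoints=http://192.168.1.101:2379,http://192.168.1.102:2379,http://192.168.1.103:2379 \
#     endpoint health 2>&1 | require_healthy 2
```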

5.2 Listing Cluster Members

etcdctl member list

6. Database Backup and Restore

6.1 Backing Up the Database

Back up the database with etcdctl:

etcdctl --endpoints=http://192.168.1.101:2379 snapshot save /path/to/backup.db
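
Saving every backup to the same path overwrites the previous one. As a minimal sketch, a wrapper can write each snapshot under a timestamped name instead. The function name take_snapshot, the file name pattern, and the example directory are assumptions made for illustration; only the etcdctl snapshot save call comes from the command above.

```shell
#!/bin/sh
# Hypothetical wrapper: save a snapshot under a timestamped name and print
# the resulting path on stdout.
# Usage: take_snapshot <backup-dir> <endpoint>
take_snapshot() {
  file="$1/etcd-$(date +%Y%m%d-%H%M%S).db"
  mkdir -p "$1" || return 1
  # Send etcdctl's own messages to stderr so stdout carries only the path.
  etcdctl --endpoints="$2" snapshot save "$file" >&2 || return 1
  echo "$file"
}

# Intended use (not run here):
#   take_snapshot /var/backups/etcd http://192.168.1.101:2379
```

Because the path is the only thing written to stdout, the result can be captured in a variable and passed on to a copy or upload step.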

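6.2 Restoring the Database

Stop the etcd service on every node, then restore from the backup file. Run the restore on each node with that node's own --name and --initial-advertise-peer-urls, and make sure the target --data-dir does not already exist. The command for Node1 is: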

etcdctl snapshot restore /path/to/backup.db \
--name node1 \
--data-dir /var/lib/etcd \
--initial-advertise-peer-urls http://192.168.1.101:2380 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster node1=http://192.168.1.101:2380,node2=http://192.168.1.102:2380,node3=http://192.168.1.103:2380

After the restore has finished on every node, start the etcd service again on each of them.
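
Only the Node1 command is shown above, and the other nodes need the same command with their own name and peer URL. The following sketch prints the restore command for each node of the example cluster. The function name restore_cmd and the CLUSTER variable are hypothetical; the flags and values are those used in this section.

```shell
#!/bin/sh
# Hypothetical helper: print the restore command that one node should run.
# CLUSTER mirrors the --initial-cluster value used in this guide.
CLUSTER="node1=http://192.168.1.101:2380,node2=http://192.168.1.102:2380,node3=http://192.168.1.103:2380"

# Usage: restore_cmd <name> <ip> <snapshot-file>
restore_cmd() {
  # In an unquoted here-document, "\\" yields one literal backslash.
  cat <<EOF
etcdctl snapshot restore ${3} \\
--name ${1} \\
--data-dir /var/lib/etcd \\
--initial-advertise-peer-urls http://${2}:2380 \\
--initial-cluster-token etcd-cluster-1 \\
--initial-cluster ${CLUSTER}
EOF
}

# Print the command for every node; each one is run on its own host.
for node in node1:192.168.1.101 node2:192.168.1.102 node3:192.168.1.103; do
  restore_cmd "${node%%:*}" "${node#*:}" /path/to/backup.db
  echo
done
```

The same snapshot file has to be available on all three hosts so that every member is restored from identical data.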

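7. Summary

This guide described how to administer a three-node etcd 3.15 cluster, including adding and removing nodes, checking cluster status, and backing up and restoring the database. Following these steps keeps the cluster manageable and maintainable.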