安装etcdctl
准备看下etcd如何命令行操作,才发现,主机上,只用kubeadm拉起了etcd,但没有etcdctl命令。
bash
# sudo docker ps -a | awk '/etcd-master/{print $1}'
c4e3a57f05d7
26a11608b270
836dabc8e254
找到正在运行的etcd,将pod中的etcdctl命令复制到主机上,使得在主机上就能直接使用etcdctl命令。
bash
# sudo docker cp c4e3a57f05d7:/usr/local/bin/etcdctl /usr/local/bin/etcdctl
在此执行etcdctl 命令,已成功执行
bash
# etcdctl
NAME:
etcdctl - A simple command line client for etcd3.
USAGE:
etcdctl [flags]
VERSION:
3.5.1
API VERSION:
3.5
查看etcd节点成员
bash
# etcdctl --endpoints=https://[127.0.0.1]:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt --key=/etc/kubernetes/pki/etcd/healthcheck-client.key member list -w table
+-----------------+---------+-------------------+---------------------------+---------------------------+------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
+-----------------+---------+-------------------+---------------------------+---------------------------+------------+
| dd7a929be676b37 | started | | https://192.168.1.120:2380 | https://192.168.1.120:2379 | false |
+-----------------+---------+-------------------+---------------------------+---------------------------+------------+
etcd的证书列表
bash
# ll /etc/kubernetes/pki/etcd/
total 32
-rw-r----- 1 root root 1086 Mar 26 16:52 ca.crt
-rw------- 1 root root 1675 Mar 26 16:52 ca.key
-rw-r----- 1 root root 1159 Mar 26 16:52 healthcheck-client.crt
-rw------- 1 root root 1675 Mar 26 16:52 healthcheck-client.key
-rw-r----- 1 root root 1220 Mar 26 16:52 peer.crt
-rw------- 1 root root 1675 Mar 26 16:52 peer.key
-rw-r----- 1 root root 1220 Mar 26 16:52 server.crt
-rw------- 1 root root 1675 Mar 26 16:52 server.key
etcdctl设置别名
bash
# alias etcdctl='etcdctl --endpoints=https://[127.0.0.1]:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt --key=/etc/kubernetes/pki/etcd/healthcheck-client.key'
[root@192.168.1.120 ~]#
[root@192.168.1.120 ~]# etcdctl member list -w table
+-----------------+---------+-------------------+---------------------------+---------------------------+------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
+-----------------+---------+-------------------+---------------------------+---------------------------+------------+
| dd7a929be676b37 | started | 192.168.1.120 | https://192.168.1.120:2380 | https://192.168.1.120:2379 | false |
+-----------------+---------+-------------------+---------------------------+---------------------------+------------+
查看etcd的详情
IS LEADER:当前是主节点
RAFT TERM: 做了多少轮的选举
bash
# etcdctl endpoint status -w table
+--------------------------+-----------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+--------------------------+-----------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://[127.0.0.1]:2379 | dd7a929be676b37 | 3.5.1 | 18 MB | true | false | 22 | 7579742 | 7579742 | |
+--------------------------+-----------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
查看etcd是否健康
bash
# etcdctl endpoint health
https://[127.0.0.1]:2379 is healthy: successfully committed proposal: took = 27.76824ms
操作表小试牛刀
etcd中的数据,存放的目录从 / 开始
bash
# etcdctl put /skywell/byd bus
OK
# etcdctl get /skywell/ --prefix=true
/skywell/byd
bus
查看表中数据
--keys-only: 只查看key,不看value
--limit: 表中有很多,限制只查询几条
bash
# etcdctl get / --prefix=true --keys-only --limit 10
/registry/apiregistration.k8s.io/apiservices/v1.
/registry/apiregistration.k8s.io/apiservices/v1.admissionregistration.k8s.io
/registry/apiregistration.k8s.io/apiservices/v1.apiextensions.k8s.io
/registry/apiregistration.k8s.io/apiservices/v1.apps
/registry/apiregistration.k8s.io/apiservices/v1.authentication.k8s.io
/registry/apiregistration.k8s.io/apiservices/v1.authorization.k8s.io
/registry/apiregistration.k8s.io/apiservices/v1.autoscaling
/registry/apiregistration.k8s.io/apiservices/v1.batch
/registry/apiregistration.k8s.io/apiservices/v1.certificates.k8s.io
/registry/apiregistration.k8s.io/apiservices/v1.coordination.k8s.io
# etcdctl get /skywell --prefix=true --keys-only --limit 10
/skywell/byd
etcd数据备份
bash
# etcdctl snapshot save etcdbackup.db
{"level":"info","ts":1716971751.0047052,"caller":"snapshot/v3_snapshot.go:68","msg":"created temporary db file","path":"etcdbackup.db.part"}
{"level":"info","ts":1716971751.028518,"logger":"client","caller":"v3/maintenance.go:211","msg":"opened snapshot stream; downloading"}
{"level":"info","ts":1716971751.0286477,"caller":"snapshot/v3_snapshot.go:76","msg":"fetching snapshot","endpoint":"https://[127.0.0.1]:2379"}
{"level":"info","ts":1716971751.3699682,"logger":"client","caller":"v3/maintenance.go:219","msg":"completed snapshot read; closing"}
{"level":"info","ts":1716971751.8124714,"caller":"snapshot/v3_snapshot.go:91","msg":"fetched snapshot","endpoint":"https://[127.0.0.1]:2379","size":"18 MB","took":"now"}
{"level":"info","ts":1716971751.8127532,"caller":"snapshot/v3_snapshot.go:100","msg":"saved","path":"etcdbackup.db"}
Snapshot saved at etcdbackup.db
验证备份数据
bash
# etcdctl --write-out=table snapshot status etcdbackup.db
Deprecated: Use `etcdutl snapshot status` instead.
+----------+----------+------------+------------+
| HASH | REVISION | TOTAL KEYS | TOTAL SIZE |
+----------+----------+------------+------------+
| b91b2b0e | 6454813 | 947 | 18 MB |
+----------+----------+------------+------------+
此时,我们删除测试的nginx的deployment
bash
# kubectl get deployment -A
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
default nginx 3/3 3 3 151m
ingress-nginx nginx-deployment 1/1 1 1 15d
ingress-nginx nginx-ingress-controller 1/1 1 1 15d
kube-system coredns 2/2 2 2 63d
kube-system metrics-server 1/1 1 1 56d
kubernetes-dashboard dashboard-metrics-scraper 1/1 1 1 56d
kubernetes-dashboard kubernetes-dashboard 1/1 1 1 56d
# kubectl delete deployment -n default nginx
deployment.apps "nginx" deleted
# kubectl get deployment -A
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
ingress-nginx nginx-deployment 1/1 1 1 15d
ingress-nginx nginx-ingress-controller 1/1 1 1 15d
kube-system coredns 2/2 2 2 63d
kube-system metrics-server 1/1 1 1 56d
kubernetes-dashboard dashboard-metrics-scraper 1/1 1 1 56d
kubernetes-dashboard kubernetes-dashboard 1/1 1 1 56d
将备份恢复到集群
将备份的数据还原到 --data-dir 指定的目录
bash
# etcdctl snapshot restore etcdbackup.db --data-dir=/data/foot/etcdtest/restore
Deprecated: Use `etcdutl snapshot restore` instead.
2024-05-29T16:44:33+08:00 info snapshot/v3_snapshot.go:251 restoring snapshot {"path": "etcdbackup.db", "wal-dir": "/data/foot/etcdtest/restore/member/wal", "data-dir": "/data/foot/etcdtest/restore", "snap-dir": "/data/foot/etcdtest/restore/member/snap", "stack": "go.etcd.io/etcd/etcdutl/v3/snapshot.(*v3Manager).Restore\n\t/tmp/etcd-release-3.5.1/etcd/release/etcd/etcdutl/snapshot/v3_snapshot.go:257\ngo.etcd.io/etcd/etcdutl/v3/etcdutl.SnapshotRestoreCommandFunc\n\t/tmp/etcd-release-3.5.1/etcd/release/etcd/etcdutl/etcdutl/snapshot_command.go:147\ngo.etcd.io/etcd/etcdctl/v3/ctlv3/command.snapshotRestoreCommandFunc\n\t/tmp/etcd-release-3.5.1/etcd/release/etcd/etcdctl/ctlv3/command/snapshot_command.go:128\ngithub.com/spf13/cobra.(*Command).execute\n\t/home/remote/sbatsche/.gvm/pkgsets/go1.16.3/global/pkg/mod/github.com/spf13/cobra@v1.1.3/command.go:856\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/home/remote/sbatsche/.gvm/pkgsets/go1.16.3/global/pkg/mod/github.com/spf13/cobra@v1.1.3/command.go:960\ngithub.com/spf13/cobra.(*Command).Execute\n\t/home/remote/sbatsche/.gvm/pkgsets/go1.16.3/global/pkg/mod/github.com/spf13/cobra@v1.1.3/command.go:897\ngo.etcd.io/etcd/etcdctl/v3/ctlv3.Start\n\t/tmp/etcd-release-3.5.1/etcd/release/etcd/etcdctl/ctlv3/ctl.go:107\ngo.etcd.io/etcd/etcdctl/v3/ctlv3.MustStart\n\t/tmp/etcd-release-3.5.1/etcd/release/etcd/etcdctl/ctlv3/ctl.go:111\nmain.main\n\t/tmp/etcd-release-3.5.1/etcd/release/etcd/etcdctl/main.go:59\nruntime.main\n\t/home/remote/sbatsche/.gvm/gos/go1.16.3/src/runtime/proc.go:225"}
2024-05-29T16:44:33+08:00 info membership/store.go:141 Trimming membership information from the backend...
2024-05-29T16:44:34+08:00 info membership/cluster.go:421 added member {"cluster-id": "cdf818194e3a8c32", "local-member-id": "0", "added-peer-id": "8e9e05c52164694d", "added-peer-peer-urls": ["http://localhost:2380"]}
2024-05-29T16:44:34+08:00 info snapshot/v3_snapshot.go:272 restored snapshot {"path": "etcdbackup.db", "wal-dir": "/data/foot/etcdtest/restore/member/wal", "data-dir": "/data/foot/etcdtest/restore", "snap-dir": "/data/foot/etcdtest/restore/member/snap"}
在指定的位置,重新生成了新的etcd数据
bash
# ll restore/member/
total 8
drwx------ 2 root root 4096 May 29 16:44 snap
drwx------ 2 root root 4096 May 29 16:44 wal
# ll /var/lib/etcd/member/
total 0
drwx------ 2 root root 246 May 29 15:08 snap
drwx------ 2 root root 244 May 29 09:14 wal
现在,需要停止所有的 kubernetes 组件以更新 etcd 数据。
将/etc/kubernetes/manifests/kubernetes中的组件清单文件, 将此文件移除。
bash
# ll /etc/kubernetes/manifests/
total 16
-rw------- 1 root root 2260 Mar 26 16:52 etcd.yaml
-rw------- 1 root root 3367 Mar 26 16:52 kube-apiserver.yaml
-rw------- 1 root root 2878 Mar 26 16:52 kube-controller-manager.yaml
-rw------- 1 root root 1464 Mar 26 16:52 kube-scheduler.yaml
# mv /etc/kubernetes/manifetes/* /tmp
kubelet会自动删除pod.【说会自动删除,我试了下,修改了etcd-data的目录,node节点没显示,kubectl get po -A 也没有显示,最后将移除去的yaml再放回来,k8s环境正常】
bash
# kubectl get po -A
NAMESPACE NAME READY STATUS RESTARTS AGE
ingress-nginx nginx-deployment-64d5f7665c-56cpz 1/1 Running 0 15d
ingress-nginx nginx-ingress-controller-7cfc988f46-cszsd 1/1 Running 0 15d
kube-flannel kube-flannel-ds-lpm9c 1/1 Running 0 64d
kube-system coredns-6d8c4cb4d-sml87 1/1 Running 0 64d
kube-system coredns-6d8c4cb4d-w4hgz 1/1 Running 0 64d
kube-system etcd-master 1/1 Running 181 (18d ago) 64d
kube-system kube-apiserver-master 1/1 Running 159 64d
kube-system kube-controller-manager-master 1/1 Running 241 (3d7h ago) 64d
kube-system kube-proxy-6ct9f 1/1 Running 0 64d
kube-system kube-scheduler-master 1/1 Running 3256 (3d7h ago) 64d
kube-system metrics-server-5d6946c85b-5585p 1/1 Running 0 56d
kubernetes-dashboard dashboard-metrics-scraper-6f669b9c9b-hmw4b 1/1 Running 0 19d
kubernetes-dashboard kubernetes-dashboard-57dd8bd998-ghrhd 1/1 Running 26 (18d ago) 19d
当组件都已删除后,修改/etc/kubernetes/manifests/etcd.yaml中的etcd-data中hostPath路径参数
bash
volumeMounts:
- mountPath: /var/lib/etcd #将这里的路径换成新的etcd的路径,刚才restore所在的目录
name: etcd-data
- mountPath: /etc/kubernetes/pki/etcd
name: etcd-certs
hostNetwork: true
priorityClassName: system-node-critical
securityContext:
seccompProfile:
type: RuntimeDefault
volumes:
- hostPath:
path: /etc/kubernetes/pki/etcd
type: DirectoryOrCreate
name: etcd-certs
- hostPath:
path: /var/lib/etcd
type: DirectoryOrCreate
name: etcd-data
status: {}
~
查看k8s集群状态
bash
# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy {"health":"true","reason":""}