[PostgreSQL: Patroni maintenance commands]

Manual
https://patroni.readthedocs.io/en/latest/index.html

Patroni operations

1. List node information

patronictl -c /etc/patroni.yml list
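For scripting it is often enough to know which member is currently the leader. The following is a minimal sketch, not part of Patroni itself: the helper names `patroni_leader` and `sample_list_output` are made up for illustration, and the parser assumes the pretty-table layout used in the sample outputs in this article (Member, Host, Role, State, ... separated by `|`).

```shell
#!/bin/sh
# Print the member whose Role column is "Leader", reading the
# pretty-table output of `patronictl list` from stdin.
# Assumed column order: 2=Member 3=Host 4=Role 5=State.
patroni_leader() {
  awk -F'|' '
    /^[|]/ {                      # data and header rows start with "|"
      role = $4;   gsub(/^ +| +$/, "", role)
      member = $2; gsub(/^ +| +$/, "", member)
      if (role == "Leader") print member
    }'
}

# Sample table in the same shape as the outputs shown in this article.
sample_list_output() {
  cat <<'EOF'
+ Cluster: pgsql (6972099274779350082)+------+---------+----+-----------+
|   Member    |        Host         |  Role  |  State  | TL | Lag in MB |
+-------------+---------------------+--------+---------+----+-----------+
| pgsql_node1 | 192.168.22.128:5432 |        | running | 3  |      0    |
| pgsql_node2 | 192.168.22.129:5432 | Leader | running | 3  |           |
| pgsql_node3 | 192.168.22.130:5432 |        | running | 3  |      0    |
+-------------+---------------------+--------+---------+----+-----------+
EOF
}

sample_list_output | patroni_leader   # prints: pgsql_node2
```

Against a live cluster the same filter would be used as `patronictl -c /etc/patroni.yml list | patroni_leader`; if the table layout of your Patroni version differs, the column numbers need adjusting.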

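2. Rebuild a standby

reinit first removes the member's entire data directory, then rebuilds the member by taking a backup from a suitable node and restoring it.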

patronictl -c /etc/patroni.yml reinit patnori-test

# patronictl -c /etc/patroni.yml reinit pgsql
+ Cluster: pgsql (6972099274779350082)+------+---------+----+-----------+
|   Member    |        Host         |  Role  |  State  | TL | Lag in MB |
+-------------+---------------------+--------+---------+----+-----------+
| pgsql_node1 | 192.168.22.128:5432 |        | running | 3  |      0    |
| pgsql_node2 | 192.168.22.129:5432 | Leader | running | 3  |           |
| pgsql_node3 | 192.168.22.130:5432 |        | running | 3  |      0    |
+-------------+---------------------+--------+---------+----+-----------+
Select the following node names to add: pgsql_node3
 Are you sure you want to reinitialize members pgsql_node3?[y/N]: y
 Success: for member pgsql_node3 Perform initialization
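The interactive prompts above can be avoided by naming the member on the command line and passing `--force`. The sketch below is an illustration under assumptions: the wrapper name `reinit_member` and the `DRY_RUN` / `PATRONI_CONFIG` variables are invented here, and by default the function only prints the command instead of running it, because reinit wipes the member's data directory.

```shell
#!/bin/sh
# reinit_member CLUSTER MEMBER
# Builds a non-interactive reinit command.
# DRY_RUN=1 (the default) prints the command; DRY_RUN=0 executes it.
reinit_member() {
  if [ "$#" -ne 2 ]; then
    echo "usage: reinit_member CLUSTER MEMBER" >&2
    return 2
  fi
  # Replace the positional parameters with the full command line.
  set -- patronictl -c "${PATRONI_CONFIG:-/etc/patroni.yml}" \
         reinit "$1" "$2" --force
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Cluster and member names taken from the transcript above.
reinit_member pgsql pgsql_node3
```

Check with `patronictl list` that the target is a replica before running this with `DRY_RUN=0`.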

Parameters

View the configuration

patronictl -c /etc/patroni.yaml show-config

Change the configuration

patronictl -c /etc/patroni.yaml edit-config
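`edit-config` normally opens an editor. For scripted changes, a single value can be set from the command line instead. This is a hedged sketch: the function name `edit_config_cmd` and the `DRY_RUN` / `PATRONI_CONFIG` variables are hypothetical, and it assumes the `-s` (Patroni setting), `-p` (PostgreSQL parameter) and `--force` options of `patronictl edit-config` are available in your version. The keys and values are the ones that appear in the config key shown later in this article.

```shell
#!/bin/sh
# edit_config_cmd KIND KEY VALUE
#   KIND=patroni -> -s KEY=VALUE  (dynamic Patroni setting, e.g. loop_wait)
#   KIND=pg      -> -p KEY=VALUE  (PostgreSQL parameter, e.g. max_connections)
# DRY_RUN=1 (the default) prints the command; DRY_RUN=0 executes it.
edit_config_cmd() {
  case "$1" in
    patroni) _flag="-s" ;;
    pg)      _flag="-p" ;;
    *) echo "unknown kind: $1" >&2; return 2 ;;
  esac
  set -- patronictl -c "${PATRONI_CONFIG:-/etc/patroni.yaml}" \
         edit-config --force "$_flag" "$2=$3"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

edit_config_cmd patroni loop_wait 10
edit_config_cmd pg max_connections 300
```

Running `show-config` before and after the change is a simple way to confirm what was actually written to the DCS.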

Reload the configuration (takes effect on all three nodes at once)

patronictl -c /etc/patroni.yaml reload patnori-test

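Restarting / shutting down nodes

Add --force to force the operation without confirmation prompts

Add --scheduled 2023-09-13T18:00-03:00 to run it at a scheduled time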

1. Restart only the current node (node1)

[root@pgtest1 ~]# patronictl restart pg_cluster node1

2. Restart members only if they are in the pending (restart) state

[root@pgtest1 ~]# patronictl restart pg_cluster --pending

3. Restart all members

[root@pgtest1 ~]# patronictl restart pg_cluster

### Maintenance mode: detaching the cluster from Patroni management

#### patronictl pause

patronictl pause temporarily puts the Patroni cluster into maintenance mode and disables automatic failover.

In some cases Patroni needs to step back from managing the cluster for a while and still keep the cluster state in the DCS. Typical use cases are uncommon activities on the cluster, such as a major version upgrade or corruption recovery. During these activities nodes are often started and stopped for reasons Patroni does not know about, and some nodes may even be promoted temporarily, which violates the assumption that only one primary is running. Patroni therefore needs to be able to "detach" from the running cluster, providing the equivalent of maintenance mode in Pacemaker.

#### patronictl resume

patronictl resume takes the Patroni cluster out of maintenance mode and re-enables automatic failover. Patroni then brings all databases up again automatically.

### Switchover

patronictl switchover

```language
# Switchover
[postgres@pgtest1 ~]$ patronictl switchover
Master [pgtest1]:
Candidate ['pgtest2', 'pgtest3'] []: pgtest2
When should the switchover take place (e.g. 2021-10-28T04:45 )  [now]:
Current cluster topology
... ...
Are you sure you want to switchover cluster pg_cluster, demoting current master pgtest1? [y/N]: y
2021-10-28 03:45:35.91763 Successfully switched over to "pgtest2"
... ...
```

```language
# Switch the database over from pgtest1 to pgtest2
[root@pgtest1 ~]# curl -s http://192.168.58.10:8008/switchover -XPOST -d '{"leader":"pgtest1","candidate":"pgtest2"}'
Successfully switched over to "pgtest2"
```

### Failover

patronictl failover

```language
# Failover
[postgres@pgtest1 ~]$ patronictl failover
Candidate ['pgtest1', 'pgtest3'] []: pgtest1
Current cluster topology
... ...
Are you sure you want to failover cluster pg_cluster, demoting current master pgtest2? [y/N]: y
2021-10-28 03:47:56.13486 Successfully failed over to "pgtest1"
... ...
```
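The REST switchover call shown earlier can be wrapped in a small helper so that the JSON body is always built the same way. This is only a sketch: `switchover_payload`, `request_switchover` and the `DRY_RUN` variable are names invented for this example, the payload builder does no JSON escaping (it assumes plain member names such as pgtest1), and the default mode prints the curl command rather than sending it.

```shell
#!/bin/sh
# switchover_payload LEADER CANDIDATE
# Prints the JSON body used by the /switchover endpoint in this article.
switchover_payload() {
  printf '{"leader":"%s","candidate":"%s"}\n' "$1" "$2"
}

# request_switchover API_URL LEADER CANDIDATE
# DRY_RUN=1 (the default) prints the curl command; DRY_RUN=0 sends it.
request_switchover() {
  _body="$(switchover_payload "$2" "$3")"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "curl -s $1/switchover -XPOST -d '$_body'"
  else
    curl -s "$1/switchover" -XPOST -d "$_body"
  fi
}

# Address and member names taken from the example above.
request_switchover http://192.168.58.10:8008 pgtest1 pgtest2
```

Keeping the payload builder separate from the HTTP call makes it easy to check the request body before anything is sent to a production cluster.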
Failing over to a specific node: when all nodes are healthy, running a failover is effectively the same as running a switchover.

curl -s http://192.168.58.10:8008/failover -XPOST -d '{"candidate":"pgtest2"}'

```language
[root@pgtest1 ~]# patronictl list
+---------+---------------+---------+---------+----+-----------+
| Member  | Host          | Role    | State   | TL | Lag in MB |
+ Cluster: pg_cluster (7025023477017500881) --+----+-----------+
| pgtest1 | 192.168.58.10 | Leader  | running |  5 |           |
| pgtest2 | 192.168.58.11 | Replica | running |  5 |         0 |
| pgtest3 | 192.168.58.12 | Replica | running |  5 |         0 |
+---------+---------------+---------+---------+----+-----------+
[root@pgtest1 ~]# curl -s http://192.168.58.10:8008/failover -XPOST -d '{"candidate":"pgtest2"}'
```

### Running database queries with patronictl

```language
[postgres@pgtest1 ~]$ cat aa.sql
select * from test_1;
[postgres@pgtest1 ~]$ patronictl -c /enmo/app/patroni/pg_test01.yml query -f aa.sql --password
Password:
id      create_time
1       2021-10-16 17:47:34
2       2021-10-16 17:55:06
[postgres@pgtest1 ~]$ patronictl -c /enmo/app/patroni/pg_test01.yml query -c "select * from test_1;" --password
Password:
id      create_time
1       2021-10-16 17:47:34
2       2021-10-16 17:55:06
```

### Getting the DSN of the primary node

$ ./patronictl -c postgres.yml dsn pgsql
host=192.168.1.143 port=5432

## ETCD

1. Check the etcd members

etcdctl endpoint status --cluster -w table

```language
+---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|         ENDPOINT          |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| http://192.168.58.11:2379 | 3f414532c235ce16 |   3.5.1 |   20 kB |     false |      false |         4 |         16 |                 16 |        |
| http://192.168.58.12:2379 | 41cf8a739c2e9b50 |   3.5.1 |   20 kB |     false |      false |         4 |         16 |                 16 |        |
| http://192.168.58.10:2379 | caef4208a95efee8 |   3.5.1 |   25 kB |      true |      false |         4 |         16 |                 16 |        |
+---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
```

2. Move the leader to another member

etcdctl move-leader 3f414532c235ce16

3. Save a data snapshot (run on one node only)

etcdctl snapshot save etcd_bak.db

4. Show snapshot information

etcdctl snapshot status etcd_bak.db -w table

```language
[root@pgtest1 ~]# etcdctl snapshot status etcd_bak.db -w table
+----------+----------+------------+------------+
|   HASH   | REVISION | TOTAL KEYS | TOTAL SIZE |
+----------+----------+------------+------------+
| 230fea56 |        0 |          8 |      25 kB |
+----------+----------+------------+------------+
```

## Zookeeper

### 1. Check the zookeeper status

./zkServer.sh status

### 2. Start zookeeper

./zkServer.sh start

### 3. Stop zookeeper

./zkServer.sh stop

### 4. Connect to the zookeeper service

./zkCli.sh -server localhost:2181

Commonly used commands inside the client:

- ls (list the contents of the current node)
- ls2 (like ls, and also shows stat data such as the number of updates)
- create (create a node)
- get (read a node, including its data and stat data such as the number of updates)
- set (modify a node)
- delete (delete a node)

### 5. Inspect the Patroni keys

#### ls

ls /service/batman

Lists the related keys:

[leader, optime, failover, members, initialize, history, config, sync]

batman is the scope name from the Patroni configuration file.

#### leader (name of the primary node)

get /service/batman/leader

leader records the name of the primary node. It is an ephemeral node: if the session does not respond for longer than the ttl, zookeeper deletes it.

```language
[zk: localhost:2181(CONNECTED) 33] get /service/batman/leader
postgresql1
```

#### sync (state of synchronous replication)

get /service/batman/sync

sync records the state of synchronous replication. It is a persistent node, so the key is not deleted when the session expires.

```language
[zk: localhost:2181(CONNECTED) 30] get /service/batman/sync
{"leader":"postgresql0","sync_standby":"postgresql1"}
```

#### optime (last LSN position of the primary)

get /service/batman/optime/leader

optime/leader is the LSN position after the primary's last operation. It is a persistent node, so the key is not deleted when the session expires.

```language
[zk: localhost:2181(CONNECTED) 71] get /service/batman/optime/leader
1342177280
```

#### failover (scheduled switchover tasks)

get /service/batman/failover

failover records scheduled switchover tasks. It is a persistent node, so the key is not deleted when the session expires.

```language
[zk: localhost:2181(CONNECTED) 117] get /service/batman/failover
{"leader":"postgresql1","scheduled_at":"2018-11-09T14:30:00+08:00"}
```

#### members (connection and key state information for every data node)

get /service/batman/members/postgresql0

get /service/batman/members/postgresql

members holds one entry per data node with its connection information and important state information. The entries are ephemeral nodes: if the session does not respond for longer than the ttl, zookeeper deletes them.

```language
[zk: localhost:2181(CONNECTED) 120] get /service/batman/members/postgresql0
{"conn_url":"postgres://192.168.56.5:5432/postgres","api_url":"http://192.168.56.5:8008/patroni","timeline":58,"state":"running","role":"replica","xlog_location":1426063952}
```

#### initialize (cluster initialization information)

get /service/batman/initialize

initialize records the initialization information of the database cluster. It is a persistent node, so the key is not deleted when the session expires.

```language
[zk: localhost:2181(CONNECTED) 123] get /service/batman/initialize
6618183861621602635
// this id is the "Database system identifier" from the control file
[postgres@node2 ~]$ pg_controldata
pg_control version number:            1100
Catalog version number:               201809051
Database system identifier:           6618183861621602635
```

#### history (how the timeline of the cluster changed over time)

get /service/batman/history

history records how the timeline of the cluster changed over time. It is a persistent node, so the key is not deleted when the session expires.

```language
[zk: localhost:2181(CONNECTED) 26] get /service/batman/history
[[1,67109464,"no recovery target specified"],[2,83886232,"no recovery target specified"],[3,100663448,"no recovery target specified"],[4,218103960,"no recovery target specified"],[5,251658392,"no recovery target specified"],[6,268435608,"no recovery target specified"],[7,318767256,"no recovery target specified"],
[8,335544472,"no recovery target specified"],[9,352321688,"no recovery target specified"],[10,369098904,"no recovery target specified"],[11,402653336,"no recovery target specified"],[12,419430552,"no recovery target specified"],[13,436207768,"no recovery target specified"],[14,452984984,"no recovery target specified"],
[15,469762200,"no recovery target specified"],[16,486539416,"no recovery target specified"],[17,503316632,"no recovery target specified"],[18,536871064,"no recovery target specified"],[19,553648280,"no recovery target specified"],[20,570425496,"no recovery target specified"],[21,603979928,"no recovery target specified"],
[22,620757144,"no recovery target specified"],[23,637534360,"no recovery target specified"],[24,654311576,"no recovery target specified"],[25,671088792,"no recovery target specified","2018-11-05T16:44:52+08:00"],[26,704643224,"no recovery target specified","2018-11-05T16:49:12+08:00"],[27,721420440,"no recovery target specified","2018-11-05T23:21:12+08:00"],[28,771752088,"no recovery target specified","2018-11-06T04:32:40+08:00"],[29,805306520,"no recovery target specified","2018-11-06T05:22:45+08:00"],[30,822083736,"no recovery target
```

#### config (configuration from the Patroni configuration file)

get /service/batman/config

config records the configuration from the Patroni configuration file. It is a persistent node, so the key is not deleted when the session expires.

```language
[zk: localhost:2181(CONNECTED) 28] get /service/batman/config
{"retry_timeout": 10, "postgresql": {"use_slots": true, "use_pg_rewind": true, "parameters": {"hot_standby": "on", "wal_keep_segments": 8, "wal_level": "hot_standby", "archive_command": "mkdir -p ../wal_archive && test ! -f ../wal_archive/%f && cp %p ../wal_archive/%f", "wal_log_hints": "on", "max_wal_senders": 10, "archive_timeout": 2000, "archive_mode": "on", "max_replication_slots": 10, "max_connections": 300}, "recovery_conf": {"restore_command": "cp ../wal_archive/%f %p"}}, "synchronous_mode": true, "maximum_lag_on_failover": 1048576, "loop_wait": 10, "max_connection": 300, "archive_timeout": "2000s", "ttl": 30, "max_connections": 300}
```

### 6. Cleaning up zk snapshots

Keep the 20 most recent snap files:

./zkCleanup.sh -n 20

Alternatively, use ordinary Linux commands to delete data or logs older than n days:

find /zookeeperData/version-2/ -name "snap*" -mtime +10 | xargs rm -f
find /zookeeperDataLog/version-2/ -name "log*" -mtime +10 | xargs rm -f
find /opt/apache-zookeeper-3.7.1-bin/logs -name "zookeeper.log.*" -mtime +10 | xargs rm -f

**Configuring automatic cleanup**

Since 3.4.0 zookeeper can purge old snapshots and transaction logs by itself, so manual cleanup is usually unnecessary. Note that the purge task only runs when autopurge.purgeInterval is set to 1 or higher (its default of 0 leaves it disabled).

Set the autopurge.snapRetainCount and autopurge.purgeInterval parameters. autopurge.snapRetainCount is the number of snapshots to keep; the default is 3, which is also the minimum.

```language
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
```

Snapshot and log growth can also be managed through the zookeeper configuration. The three parameters below mean, respectively: purge once an hour, the preallocated size of the transaction log file (in kB), and the number of transactions written before a new snapshot is taken.

autopurge.purgeInterval=1
preAllocSize=131072
snapCount=300000

### 7. Show the parameters zookeeper is started with, including the java path

./bin/zkServer.sh print-cmd

## Reference links

1. https://www.modb.pro/db/73762
2. https://www.modb.pro/db/75268
3. https://www.modb.pro/topic/152353

Upgrade

1. https://www.modb.pro/db/500077
